
Introduction to Sorting

What is Sorting?
Sorting: an operation that arranges items in order according to a specified criterion.
A = { 3 1 6 2 1 3 4 5 9 0 }
A = { 0 1 1 2 3 3 4 5 6 9 }

Why Sort? Examples
Consider:
- Sorting books in a library (Dewey system)
- Sorting individuals by height (feet and inches)
- Sorting movies in Blockbuster (alphabetical)
- Sorting numbers (sequential)

Types of Sorting Algorithms
There are many, many different types of sorting algorithms, but the primary ones are:
- Bubble Sort
- Selection Sort
- Insertion Sort
- Merge Sort
- Shell Sort
- Quick Sort
- Heap Sort
- Bucket Sort
- Radix Sort
- Swap Sort

Review of Complexity
Most of the primary sorting algorithms differ in their space and time complexity.
Time complexity is defined to be the time the computer takes to run a program (or algorithm, in our case).
Space complexity is defined to be the amount of memory the computer needs to run a program.

Complexity
Complexity, in general, measures an algorithm's efficiency with respect to internal factors, such as the time needed to run the algorithm.
External factors (not related to complexity):
- Size of the input of the algorithm
- Speed of the computer
- Quality of the compiler

O(n), Ω(n), & Θ(n)

An algorithm or function T(n) is O(f(n)) whenever T(n)'s rate of growth is less than or equal to f(n)'s rate.

An algorithm or function T(n) is Ω(f(n)) whenever T(n)'s rate of growth is greater than or equal to f(n)'s rate.

An algorithm or function T(n) is Θ(f(n)) if and only if the rate of growth of T(n) is equal to f(n)'s.
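These three bounds can be illustrated with the cost function 3n + 12 that appears later in these slides; a minimal worked example (the constants c and m are one valid choice among many):

```latex
T(n) = 3n + 12
% Upper bound: choose c = 4, m = 12
3n + 12 \le 4n \quad \text{for all } n \ge 12 \;\Rightarrow\; T(n) \text{ is } O(n)
% Lower bound: choose c = 3, m = 1
3n + 12 \ge 3n \quad \text{for all } n \ge 1 \;\Rightarrow\; T(n) \text{ is } \Omega(n)
% Both bounds hold with f(n) = n, so:
T(n) \text{ is } \Theta(n)
```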

Common Big-Ohs

Time complexity: Example
- constant O(1): adding to the front of a linked list
- log O(log N): finding an entry in a sorted array
- linear O(N): finding an entry in an unsorted array
- n-log-n O(N log N): sorting n items by divide-and-conquer
- quadratic O(N^2): shortest path between two nodes in a graph
- cubic O(N^3): simultaneous linear equations

[Figure: finding the entry 8 in the sorted array 1 5 8 9 21 22 50, comparing binary search (which halves the index range each probe) with linear search (which scans from the front).]

Big-Oh of the Primary Sorts

- Bubble Sort = n^2
- Selection Sort = n^2
- Insertion Sort = n^2
- Merge Sort = n log(n)
- Quick Sort = n log(n)

Better ones?

Time Efficiency
How do we improve the time efficiency of a program?
The 90/10 Rule: 90% of the execution time of a program is spent in executing 10% of the code.
So, how do we locate the critical 10%?
- software metrics tools
- global counters to locate bottlenecks (loop executions, function calls)

Time Efficiency Improvements

Possibilities (some better than others!):
- Move code out of loops that does not belong there (just good programming!)
- Remove any unnecessary I/O operations (I/O operations are expensive time-wise)
- Code so that the compiled code is more efficient

Moral: choose the most appropriate algorithm(s) BEFORE program implementation.

Stable sort algorithms

A stable sort keeps equal elements in the same order. This may matter when you are sorting data according to some characteristic. Example: sorting students by test scores.

original array:    stably sorted:
Ann 98             Ann 98
Bob 90             Joe 98
Dan 75             Bob 90
Joe 98             Sam 90
Pat 86             Pat 86
Sam 90             Ze  86
Ze  86             Dan 75
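For instance, Java's Arrays.sort on object arrays is guaranteed stable, so it reproduces the stably-sorted order shown above. A minimal sketch, assuming a recent JDK (records require Java 16+); the class and record names are illustrative:

```java
import java.util.Arrays;
import java.util.Comparator;

public class StableSortDemo {
    // Hypothetical student record used only for illustration.
    record Student(String name, int score) {}

    public static Student[] sortByScoreDesc(Student[] in) {
        Student[] out = in.clone();
        // Arrays.sort on an Object[] is guaranteed stable, so students
        // with equal scores keep their original relative order.
        Arrays.sort(out, Comparator.comparingInt(Student::score).reversed());
        return out;
    }

    public static void main(String[] args) {
        Student[] roster = {
            new Student("Ann", 98), new Student("Bob", 90),
            new Student("Dan", 75), new Student("Joe", 98),
            new Student("Pat", 86), new Student("Sam", 90),
            new Student("Ze", 86)
        };
        for (Student s : sortByScoreDesc(roster))
            System.out.println(s.name() + " " + s.score());
    }
}
```

Because the sort is stable, Ann stays ahead of Joe (both 98), Bob ahead of Sam (both 90), and Pat ahead of Ze (both 86).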

Unstable sort algorithms

An unstable sort may or may not keep equal elements in the same order. Stability is usually not important, but sometimes it is important.

original array:    unstably sorted:
Ann 98             Joe 98
Bob 90             Ann 98
Dan 75             Bob 90
Joe 98             Sam 90
Pat 86             Ze  86
Sam 90             Pat 86
Ze  86             Dan 75

Selection Sorting
Steps:
1. select the smallest element among data[i] ... data[data.length-1];
2. swap it with data[i];
3. if not finished, repeat 1 & 2.

Example (20 8 5 10 7):
20 8 5 10 7
5 8 20 10 7
5 7 20 10 8
5 7 8 10 20
5 7 8 10 20
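The steps above can be sketched in Java as follows (a minimal sketch; the class name is illustrative):

```java
public class SelectionSortDemo {
    // Selection sort, following the slide's steps: find the smallest
    // element in data[i..n-1], swap it into position i, and repeat.
    public static void selectionSort(int[] data) {
        for (int i = 0; i < data.length - 1; i++) {
            int min = i;
            for (int j = i + 1; j < data.length; j++)
                if (data[j] < data[min]) min = j;
            int tmp = data[i];        // swap data[i] and data[min]
            data[i] = data[min];
            data[min] = tmp;
        }
    }

    public static void main(String[] args) {
        int[] data = {20, 8, 5, 10, 7};   // the slide's example
        selectionSort(data);
        System.out.println(java.util.Arrays.toString(data)); // [5, 7, 8, 10, 20]
    }
}
```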

Pseudo-code for Insertion Sorting

Place the i-th item in its proper position:
- temp = data[i]
- shift those elements data[j] which are greater than temp to the right by one position
- place temp in its proper position

Insert Action: i=1

temp = 8; i = 1, first iteration
20 8 5 10 7
20 20 5 10 7   (shift 20 right)
8 20 5 10 7    (insert temp)

Insert Action: i=2

temp = 5; i = 2, second iteration
8 20 5 10 7
8 20 20 10 7   (shift 20 right)
8 8 20 10 7    (shift 8 right)
5 8 20 10 7    (insert temp)

Insert Action: i=3

temp = 10; i = 3, third iteration
5 8 20 10 7
5 8 20 20 7    (shift 20 right)
5 8 10 20 7    (insert temp)

Insert Action: i=4

temp = 7; i = 4, fourth iteration
5 8 10 20 7
5 8 10 20 20   (shift 20 right)
5 8 10 10 20   (shift 10 right)
5 8 8 10 20    (shift 8 right)
5 7 8 10 20    (insert temp)

Insertion Sort

while some elements are unsorted:
- Using linear search, find the location in the sorted portion where the first element of the unsorted portion should be inserted
- Move all the elements after the insertion location up one position to make space for the new element

[Figure: an insertion sort partitions the array into a sorted region and an unsorted region; the fourth iteration of this loop is shown on an array of integers, and a second figure shows an insertion sort of an array of five integers.]

Insertion Sort Algorithm

public void insertionSort(Comparable[] arr) {
    for (int i = 1; i < arr.length; ++i) {
        Comparable temp = arr[i];
        int pos = i;
        // Shuffle up all sorted items > arr[i]
        while (pos > 0 && arr[pos-1].compareTo(temp) > 0) {
            arr[pos] = arr[pos-1];
            pos--;
        } // end while
        // Insert the current item
        arr[pos] = temp;
    }
}
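A self-contained version of this method with a small driver can be sketched as follows (the class name InsertionSortDemo and the static modifier are illustrative additions):

```java
public class InsertionSortDemo {
    @SuppressWarnings({"rawtypes", "unchecked"})
    public static void insertionSort(Comparable[] arr) {
        for (int i = 1; i < arr.length; ++i) {
            Comparable temp = arr[i];
            int pos = i;
            // Shuffle up all sorted items greater than arr[i]
            while (pos > 0 && arr[pos - 1].compareTo(temp) > 0) {
                arr[pos] = arr[pos - 1];
                pos--;
            }
            arr[pos] = temp;   // insert the current item
        }
    }

    public static void main(String[] args) {
        Integer[] data = {20, 8, 5, 10, 7};   // the slide's running example
        insertionSort(data);
        System.out.println(java.util.Arrays.toString(data)); // [5, 7, 8, 10, 20]
    }
}
```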

Insertion Sort Analysis

public void insertionSort(Comparable[] arr) {
    for (int i = 1; i < arr.length; ++i) {      // outer loop: runs n-1 times
        Comparable temp = arr[i];
        int pos = i;
        // Shuffle up all sorted items > arr[i]
        while (pos > 0 &&                       // inner loop: runs up to i times
               arr[pos-1].compareTo(temp) > 0) {
            arr[pos] = arr[pos-1];
            pos--;
        } // end while
        // Insert the current item
        arr[pos] = temp;
    }
}

Insertion Sort: Number of Comparisons

# of sorted elements   Best case   Worst case
0                      0           0
1                      1           1
...                    ...         ...
n-1                    1           n-1
Total                  n-1         n(n-1)/2

Remark: we only count comparisons of elements in the array.

Insertion Sort: Cost Function

- 1 operation to initialize the outer loop
- The outer loop is evaluated n-1 times
  - 5 instructions (including the outer loop comparison and increment)
  - Total cost of the outer loop: 5(n-1)
- How many times the inner loop is evaluated is affected by the state of the array to be sorted
- Best case: the array is already completely sorted, so no shifting of array elements is required.
  - We only test the condition of the inner loop once (2 operations = 1 comparison + 1 element comparison), and the body is never executed
  - Requires 2(n-1) operations.

Insertion Sort: Cost Function

- Worst case: the array is sorted in reverse order (so each item has to be moved to the front of the array)
  - In the i-th iteration of the outer loop, the inner loop will perform 4i+1 operations
  - Therefore, the total cost of the inner loop will be 2n(n-1)+n-1
- Time cost:
  - Best case: 7(n-1)
  - Worst case: 5(n-1)+2n(n-1)+n-1
- What about the number of moves?
  - Best case: 2(n-1) moves
  - Worst case: 2(n-1)+n(n-1)/2

Insertion Sort: Average Case

- Is it closer to the best case (n comparisons) or the worst case (n(n-1)/2 comparisons)?
- It turns out that when random data is sorted, insertion sort is usually closer to the worst case
  - Around n(n-1)/4 comparisons
- Calculating the average number of comparisons more exactly would require us to state assumptions about what the average input data set looks like
  - This would, for example, necessitate discussion of how items were distributed over the array
- Exact calculation of the number of operations required to perform even simple algorithms can be challenging (for instance, assume that each initial order of elements has the same probability of occurring)
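One way to see the n(n-1)/4 figure empirically is to instrument insertion sort with a comparison counter and average over random inputs. A rough experiment sketch (the array size, trial count, and fixed seed are arbitrary choices made for illustration):

```java
import java.util.Random;

public class InsertionCountDemo {
    // Runs insertion sort on arr and returns the number of
    // element comparisons performed (matching the slides' counting).
    static long countComparisons(int[] arr) {
        long comparisons = 0;
        for (int i = 1; i < arr.length; i++) {
            int temp = arr[i], pos = i;
            while (pos > 0) {
                comparisons++;                   // one element comparison
                if (arr[pos - 1] <= temp) break; // sorted item is not greater
                arr[pos] = arr[pos - 1];
                pos--;
            }
            arr[pos] = temp;
        }
        return comparisons;
    }

    // Average comparison count over several random arrays of length n.
    static double averageComparisons(int n, int trials, long seed) {
        Random rnd = new Random(seed);
        long total = 0;
        for (int t = 0; t < trials; t++) {
            int[] a = new int[n];
            for (int i = 0; i < n; i++) a[i] = rnd.nextInt();
            total += countComparisons(a);
        }
        return (double) total / trials;
    }

    public static void main(String[] args) {
        int n = 1000;
        // Expect a value near n(n-1)/4 = 249750 for n = 1000
        System.out.printf("average comparisons: %.0f (n(n-1)/4 = %d)%n",
                averageComparisons(n, 100, 42L), (long) n * (n - 1) / 4);
    }
}
```

On random data the observed average lands close to n(n-1)/4, i.e. roughly halfway between the best case (n-1) and the worst case (n(n-1)/2).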

Bubble Sort

Simplest sorting algorithm. Idea:
1. Set flag = false
2. Traverse the array and compare pairs of two consecutive elements
   2.1 If E1 ≤ E2 -> OK (do nothing)
   2.2 If E1 > E2 then Swap(E1, E2) and set flag = true
3. Repeat 1. and 2. while flag = true.

Bubble Sort

1: 1 23 2 56 9 8 10 100
2: 1 2 23 56 9 8 10 100
3: 1 2 23 9 56 8 10 100
4: 1 2 23 9 8 56 10 100
5: 1 2 23 9 8 10 56 100
---- finish the first traversal ----
1: 1 2 23 9 8 10 56 100
2: 1 2 9 23 8 10 56 100
3: 1 2 9 8 23 10 56 100
4: 1 2 9 8 10 23 56 100
---- finish the second traversal ----

Bubble Sort

public void bubbleSort(Comparable[] arr) {
    boolean isSorted = false;
    while (!isSorted) {
        isSorted = true;
        for (int i = 0; i < arr.length - 1; i++)
            if (arr[i].compareTo(arr[i+1]) > 0) {
                Comparable tmp = arr[i];
                arr[i] = arr[i+1];
                arr[i+1] = tmp;
                isSorted = false;
            }
    }
}

Bubble Sort: analysis

- After the first traversal (iteration of the main loop), the maximum element is moved to its place (the end of the array)
- After the i-th traversal, the largest i elements are in their places
- Time cost, number of comparisons, number of moves -> Assignment 4

O Notation

O-notation Introduction

- Exact counting of operations is often difficult (and tedious), even for simple algorithms
- Often, exact counts are not useful due to other factors, e.g. the language/machine used, or the implementation of the algorithm (different types of operations do not take the same time anyway)
- O-notation is a mathematical language for evaluating the running-time (and memory usage) of algorithms

Growth Rate of an Algorithm

- We often want to compare the performance of algorithms
- When doing so we generally want to know how they perform when the problem size (n) is large
- Since cost functions are complex and may be difficult to compute, we approximate them using O notation

Example of a Cost Function

Cost function: tA(n) = n^2 + 20n + 100

Which term dominates? It depends on the size of n.

- n = 2: tA(n) = 4 + 40 + 100; the constant, 100, is the dominating term
- n = 10: tA(n) = 100 + 200 + 100; 20n is the dominating term
- n = 100: tA(n) = 10,000 + 2,000 + 100; n^2 is the dominating term
- n = 1000: tA(n) = 1,000,000 + 20,000 + 100; n^2 is the dominating term
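The term-by-term dominance can be tabulated directly. A small sketch (the class name is illustrative); it evaluates the slide's cost function at the same four values of n:

```java
public class CostFunctionDemo {
    // The slide's cost function: tA(n) = n^2 + 20n + 100
    static long cost(long n) { return n * n + 20 * n + 100; }

    public static void main(String[] args) {
        for (long n : new long[]{2, 10, 100, 1000}) {
            // Print each term separately so the dominating one is visible.
            System.out.printf("n=%4d: n^2=%d, 20n=%d, 100=%d, total=%d%n",
                    n, n * n, 20 * n, 100L, cost(n));
        }
    }
}
```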

Big O Notation

- O notation approximates the cost function of an algorithm
  - The approximation is usually good enough, especially when considering the efficiency of an algorithm as n gets very large
  - Allows us to estimate the rate of function growth
- Instead of computing the entire cost function, we only need to count the number of times that an algorithm executes its barometer instruction(s)
  - The instruction that is executed the most times in an algorithm (the highest order term)

Big O Notation

Given functions tA(n) and g(n), we can say that the efficiency of an algorithm is of order g(n) if there are positive constants c and m such that

    tA(n) ≤ c·g(n) for all n ≥ m

We write "tA(n) is O(g(n))" and say that tA(n) is of order g(n).

e.g. if an algorithm's running time is 3n + 12, then the algorithm is O(n). If c = 4 and m = 12, then:

    3n + 12 ≤ 4n for all n ≥ 12
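The witness pair c = 4, m = 12 can be spot-checked numerically. A tiny sketch (the test range is arbitrary, and a finite check only illustrates the inequality; it is not a proof):

```java
public class BigOhBoundDemo {
    // Does 3n + 12 <= 4n hold for this n? (c = 4 from the slide)
    static boolean boundHolds(long n) { return 3 * n + 12 <= 4 * n; }

    public static void main(String[] args) {
        // The bound should hold for every n >= m = 12 ...
        for (long n = 12; n <= 1_000_000; n++)
            if (!boundHolds(n)) { System.out.println("fails at n=" + n); return; }
        // ... and it first kicks in exactly at n = 12 (it fails at n = 11).
        System.out.println("3n + 12 <= 4n for all tested n >= 12");
    }
}
```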

In English

The cost function of an algorithm A, tA(n), can be approximated by another, simpler function g(n), which is also a function of only one variable, the data size n. The function g(n) is selected such that it represents an upper bound on the efficiency of the algorithm A (i.e. an upper bound on the value of tA(n)). This is expressed using the big-O notation: O(g(n)). For example, if we consider the time efficiency of algorithm A, then "tA(n) is O(g(n))" would mean that

- A cannot take more time than c·g(n) to execute, for some constant c, or that
- the cost function tA(n) grows at most as fast as g(n)

The general idea is

- when using Big-O notation, rather than giving a precise figure of the cost function using a specific data size n,
- express the behaviour of the algorithm as its data size n grows very large,
- and so ignore lower order terms and constants.

O Notation Examples

All these expressions are O(n):
- n, 3n, 61n + 5, 22n - 5, ...

All these expressions are O(n^2):
- n^2, 9n^2, 18n^2 + 4n - 53, ...

All these expressions are O(n log n):
- n log n, 5n log 99n, 18 + (4n - 2) log(5n + 3), ...
