
Chapter 3

Searching and sorting

3.1 Introduction
Last year we looked at some simple algorithms for sorting and searching. We will start
by revising some of these and having a look at how we can measure the performance of
algorithms. We will then look at some more sophisticated and faster algorithms. These
will use the programming technique known as recursion where a method can call itself.
Sorting the elements of an array into order is a very important task. It is also the first
task where we need to think seriously about the algorithm that we use. There are many
ways of sorting and some are much faster than others. If we are sorting 10 elements into
order it does not matter what algorithm we use; any correct method will get the result
quickly for us. If we are sorting a million elements the choice of algorithm can make the
difference between a sort that takes less than a second and one that takes several hours.
There is no one best sorting algorithm. Some algorithms are good when all the data
fits in memory, others are better when we have so much data that it needs to be stored in
a disk file. Some algorithms are good for data that is nearly in order at the start, other
algorithms are very bad in this case.
Before we start to sort we need to decide which order to use. For numerical types we
will sort into ascending or descending order. For strings we usually use alphabetical order,
but we need to decide what to do with capital letters and with characters that are not
letters. Sorting lists of names is quite complicated: we need to decide what to do with Mac,
Mc and O', and whether to sort on surname before initials.
For any sort we need an expression that says whether one element comes before another
in the list.
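
As a small illustration of such an expression, the sketch below compares two strings while treating capital and lower-case letters alike. The class OrderDemo and the method before are hypothetical names chosen for this example; compareTo and compareToIgnoreCase are standard String methods.

```java
class OrderDemo {
    /** True if a comes before b when upper and lower case are treated alike. */
    static boolean before(String a, String b) {
        return a.compareToIgnoreCase(b) < 0;
    }

    public static void main(String[] args) {
        // Plain compareTo uses character codes, so 'B' (66) sorts before 'a' (97).
        System.out.println("apple".compareTo("Banana") < 0); // false
        // Ignoring case gives the usual dictionary order.
        System.out.println(before("apple", "Banana"));       // true
    }
}
```

Whichever rule we choose, the sort only needs to ask whether one element comes before another.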

3.2 Simple sorts


Last year we looked at three sorts: the bubble sort, the selection sort and the insertion sort.
We will first look at two of these again.


3.2.1 Swapping
For many sorts we need to swap the position of two elements in an array. This is faster than
moving all the elements up or down to insert an element. To swap the i and j positions
in an array we cannot just do

    a[i] = a[j];
    a[j] = a[i];

as this overwrites the a[i] value before we can use it. We need to do

    temp = a[i];
    a[i] = a[j];
    a[j] = temp;

where temp is a temporary variable of the correct type.
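
The three-line swap can be wrapped in a helper method. This is a minimal sketch for arrays of double; the class name SwapDemo and the method name swap are assumptions made for this example.

```java
import java.util.Arrays;

class SwapDemo {
    /** Swaps the elements at positions i and j of the array a. */
    static void swap(double[] a, int i, int j) {
        double temp = a[i]; // save a[i] before it is overwritten
        a[i] = a[j];
        a[j] = temp;
    }

    public static void main(String[] args) {
        double[] a = {3, 1, 4, 2};
        swap(a, 0, 3);
        System.out.println(Arrays.toString(a)); // [2.0, 1.0, 4.0, 3.0]
    }
}
```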

3.2.2 The selection sort


In the selection sort we search the array for the element that goes first (smallest or largest
depending on the order). We then swap this to the front. Now search the rest of the
array for the next element and swap this to the next place. Repeat until there is only one
element left.
In our program elements below i have already been found. min holds the index of the
smallest element found so far. j holds the next element in our search.
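This example sorts an array containing 3,1,4,2. The first column gives the index and each
later column shows the array after one more pass. At the start the smallest element left
is 1, after pass 1 it is 2 and after pass 2 it is 3. After pass k the first k elements are
finished.

    index   start   pass 1   pass 2   pass 3
    0       3       1        1        1
    1       1       3        2        2
    2       4       4        4        3
    3       2       2        3        4
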

    static void selectionSort(double[] array)
    {
        int min;
        double temp;
        for (int i = 0; i < array.length - 1; i++)
        {
            min = i;
            for (int j = i + 1; j < array.length; j++)
                if (array[j] < array[min])
                    min = j;
            temp = array[min];
            array[min] = array[i];
            array[i] = temp;
        }
    }

Notice it is easy to convert this to sort arrays of any primitive type, for arrays of int
just change double to int throughout.

3.2.3 The insertion sort


In the insertion sort we take an element and insert it into part of the array that has already
been sorted. In order to do this we must move all the elements larger than the one we
want to insert up one.
Notice that to move elements up in an array we need to work from the top down so
that we do not overwrite each value before we use it.
The insertion sort just moves up the array starting from element 1. It inserts each
element in the correct place among the earlier elements.
In our program elements below i have already been sorted. key holds the value we are
trying to insert. pos runs down until it reaches the insertion position.
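This example sorts an array containing 3,1,4,2. The first column gives the index and each
later column shows the array after one more insertion. The elements inserted are 1, then 4,
then 2. After k insertions the first k + 1 elements are in order.

    index   start   insert 1   insert 4   insert 2
    0       3       1          1          1
    1       1       3          3          2
    2       4       4          4          3
    3       2       2          2          4
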

    static void insertionSort(double[] array)
    {
        for (int i = 1; i < array.length; i++)
        {
            double key = array[i];
            int pos = i;
            while (pos > 0 && array[pos - 1] > key)
            {
                array[pos] = array[pos - 1];
                pos--;
            }
            array[pos] = key;
        }
    }

3.2.4 Comparison of times


There are many other sorting algorithms. The Java library includes its own sorting method
which uses an algorithm called quicksort.
To use the library version import java.util.Arrays and call Arrays.sort(a) where
a is any array of a numerical type.
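
A minimal sketch of calling the library sort follows. The class LibrarySortDemo and the helper sortedCopy are hypothetical names for this example; Arrays.sort and Arrays.copyOf are standard library methods.

```java
import java.util.Arrays;

class LibrarySortDemo {
    /** Returns a sorted copy of a, leaving a itself unchanged. */
    static double[] sortedCopy(double[] a) {
        double[] copy = Arrays.copyOf(a, a.length);
        Arrays.sort(copy); // library sort, ascending order
        return copy;
    }

    public static void main(String[] args) {
        double[] a = {3, 1, 4, 2};
        System.out.println(Arrays.toString(sortedCopy(a))); // [1.0, 2.0, 3.0, 4.0]
    }
}
```

Note that Arrays.sort sorts the array it is given in place, which is why the sketch copies the array first.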
Here is a table of times in seconds for sorting a random array of doubles:

    n            Arrays.sort   Bubble    Selection   Insertion
    100          0.002         0.001     0.001       0.001
    1000         0.012         0.021     0.011       0.009
    10,000       0.024         1.414     0.750       0.521
    100,000      0.121         292       191         105
    1,000,000    0.981         31,397    20,843      13,261

31,397 seconds is about 8.7 hours.

3.3 Sorting objects


To sort objects we need to specify their order. There are two standard ways to do this:

1. Make your objects implement the Comparable interface;

2. Provide a class that implements the Comparator interface.

3.3.1 Generics
When we are sorting we really want to compare Strings with Strings, Dates with Dates and
Doubles with Doubles. We do not want to compare Dates with Doubles. In Java 1.4 it
was impossible to have one interface that covered all these cases. What we used was an
interface with a single method public int compareTo(Object x). We then assumed that
the object passed to compare to a String was another String, cast it from Object to
String and then did our comparison. This bypassed type safety and was not desirable.
Java 1.5 introduced a new feature called generics or parameterised types. We can
now have a type with a parameter, which must be filled in with another type. So
the library provides an interface Comparable<T> which declares a method
int compareTo(T o). We can choose a reference type to fill in for T.
For example we can declare

    class myClass implements Comparable<String>
    {
        // Interface methods are public, so the implementation must be too.
        public int compareTo(String s)
        {
            // do something here
            return 0;
        }
    }
You must use the same type for T throughout.
The full use of generics is quite complicated, and in some cases, especially with arrays,
it does not work very well, as it had to be compatible with Java 1.4. However, using generic
types from the Java library is very easy and restores type safety.

3.3.2 The Comparable interface


A class that implements the Comparable<T> interface must provide an instance method
public int compareTo(T x). T will usually be the type of the implementing class. The
method should compare the object (this) with another object x of type T. It should return
positive if this is after x, zero if they are the same, and negative if this is before x.
As an example we will use a type that represents a rational number (fraction). We sort
our fractions in numerical order, so 1/3 < 1/2 < 2/3.

    import java.util.Random;

    /** RationalNumber
     * Holds a rational number n/d, where
     * n and d are ints.
     * @author C.T. Stretch
     */
    public class RationalNumber implements Comparable<RationalNumber>
    {
        /** The numerator of the fraction */
        private int numerator;
        /** The denominator of the fraction */
        private int denominator;

        /** Constructs a new rational number equal to n/d
         * Ensures that d is positive and n/d is in lowest form.
         * @throws ArithmeticException if d is zero.
         * @param n the numerator
         * @param d the denominator
         */
        RationalNumber(int n, int d) throws ArithmeticException
        {
            if (d == 0) throw new ArithmeticException("Zero denominator");
            if (d < 0)
            {
                d = -d;
                n = -n;
            }
            int g = gcd(d, n);
            numerator = n / g;
            denominator = d / g;
        }

        /** Greatest common divisor by Euclid's algorithm.
         * The constructor calls gcd but the listing does not show it;
         * this helper is an assumed implementation.
         * @param a a positive int
         * @param b any int
         * @return the positive greatest common divisor of a and b
         */
        private static int gcd(int a, int b)
        {
            b = Math.abs(b);
            while (b != 0)
            {
                int r = a % b;
                a = b;
                b = r;
            }
            return a;
        }

        /** Tests if two rationals are equal
         * @see java.lang.Object#equals(java.lang.Object)
         */
        public boolean equals(Object x)
        {
            if (x == null) return false; // Should return false for null
            if (getClass() != x.getClass()) return false; // and for x not a RationalNumber
            RationalNumber r = (RationalNumber) x;
            return (numerator == r.numerator && denominator == r.denominator);
        }

        /** Find a hash code
         * @see java.lang.Object#hashCode()
         */
        public int hashCode()
        {
            return numerator ^ denominator;
        }

        /** Compares this rational with another
         * @param x the other number
         * @return 1 if this > x, 0 if this = x and -1 if this < x
         */
        public int compareTo(RationalNumber x)
        {
            // Cross-multiply in long arithmetic so the products cannot overflow.
            long s = (long) numerator * (long) x.denominator -
                     (long) denominator * (long) x.numerator;
            if (s < 0) return -1;
            if (s > 0) return 1;
            return 0;
        }

        /** Write the rational as a string
         * @see java.lang.Object#toString()
         */
        public String toString()
        {
            if (denominator == 1) return numerator + "";
            return numerator + "/" + denominator;
        }
    }

81/32 12/49 1 77/13 77/73 7/2 25/33 1/3 82/89 12/11 55/96 7/31 3/7 19 30/13
81/35 17/13 25/66 92/7 11/10
Sorting
7/31 12/49 1/3 25/66 3/7 55/96 25/33 82/89 1 77/73 12/11 11/10 17/13 30/13
81/35 81/32 7/2 77/13 92/7 19

Many library classes including String, Date, Time, Integer and Double implement the
Comparable interface.
We can now modify our sorts to sort objects that implement Comparable:

    public void insertionSort(RationalNumber[] a)
    {
        for (int i = 1; i < a.length; i++)
        {
            RationalNumber key = a[i];
            int pos = i;
            while (pos > 0 && a[pos - 1].compareTo(key) > 0)
            {
                a[pos] = a[pos - 1];
                pos--;
            }
            a[pos] = key;
        }
    }

Overriding equals
The Object class has a method public boolean equals(Object x). This returns true if
this object is identical to the object x. Classes often want to override this method so that
it returns true if the two objects have the same state. For example the String.equals
method returns true if the two strings have the same length and the same characters. If you
implement the Comparable interface then you should override equals to provide a method
that returns true when compareTo returns zero. Notice the method is passed an object, so
we will need to cast it to the correct type. If it is of the wrong type we should return false.
When you override equals you should also override another method
public int hashCode(). We will learn about this in the chapter on Hashing.

3.3.3 The Comparator interface


The Comparator<T> interface has a method public int compare(T o1, T o2). This
compares two objects of type T and returns negative if o1 is smaller, zero if they are equal and
positive if o1 is larger. The Comparator interface can be implemented in a class separate
from the one that describes the objects to compare. This means you can have more than
one way of ordering objects of one type; for example you could order rational numbers by
the size of the numerator.
The Java library offers methods for sorting arrays and other data types. These come
in versions using Comparable and Comparator.
    Arrays.sort(Object[] a)                 // sorts Comparable objects
    Arrays.sort(Object[] a, Comparator c)   // sorts with a comparator
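
As a sketch of the comparator version, the example below orders strings by length instead of alphabetically. The classes LengthOrder and ComparatorDemo are hypothetical names for this illustration; strings are used rather than RationalNumber because the listing above exposes no accessor for the numerator.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Orders strings by length, shortest first. */
class LengthOrder implements Comparator<String> {
    public int compare(String o1, String o2) {
        // Negative if o1 is shorter, zero if equal length, positive if longer.
        return Integer.compare(o1.length(), o2.length());
    }
}

class ComparatorDemo {
    /** Sorts the array in place using the length ordering. */
    static void sortByLength(String[] words) {
        Arrays.sort(words, new LengthOrder());
    }

    public static void main(String[] args) {
        String[] words = {"pear", "fig", "banana", "kiwi"};
        sortByLength(words);
        System.out.println(Arrays.toString(words)); // [fig, pear, kiwi, banana]
    }
}
```

The library sort for objects is stable, so pear stays ahead of kiwi, which has the same length.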

3.4 Analysis of algorithms


In order to compare the efficiency of algorithms we need to think carefully about what we
are comparing.
We can consider the space efficiency, that is, how much memory the algorithm uses. It is usually
more important to consider the time efficiency, that is, how long it takes.
Timing an algorithm depends on the computer we run it on. The times on the same
computer can also vary depending on what other programs are running.
We will be interested in how the running time varies as the size of the problem varies.
First we have to decide how to measure the size of the problem. We usually call this n.
For a sort we can take n to be the length of the array.
Instead of the time we select a suitable operation and measure the number of times it
is used. For sorts we use the number of comparisons.
The number of operations may depend on the data-set, for example a sort may run
faster or slower when the data is nearly sorted to start.
We can ask for

Worst-case performance Look at the data that gives the worst performance for this
algorithm.

Average-case performance Look at the average performance over all possible data.

Best-case performance Look at the data that gives the best performance.

For example look at the insertion sort. For each value of i from 1 to $n - 1$ we compare
key with the i previously sorted values. We stop when we find a value less than or equal
to the key. In the worst case (when the data is in the opposite order) we have to compare
with all i values. In the best case (when the data is already in order) we only need 1
comparison. In the average case we need about i/2 comparisons.
To get the total number of comparisons add this up for all values of i.

Worst-case $1 + 2 + \cdots + (n-1) = n(n-1)/2 = \frac{1}{2}n^2 - \frac{1}{2}n$

Average-case $\frac{1}{2} + \frac{2}{2} + \cdots + \frac{n-1}{2} = n(n-1)/4 = \frac{1}{4}n^2 - \frac{1}{4}n$

Best-case $1 + 1 + \cdots + 1 = n - 1$
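
These totals can be checked by counting comparisons directly. The sketch below is an instrumented copy of the insertion sort; the class ComparisonCount and the method countComparisons are hypothetical names for this example, and the loop is restructured slightly so that every comparison with the key is counted.

```java
class ComparisonCount {
    /** Insertion-sorts a copy of data and returns the number of key comparisons. */
    static long countComparisons(double[] data) {
        double[] a = data.clone();
        long count = 0;
        for (int i = 1; i < a.length; i++) {
            double key = a[i];
            int pos = i;
            while (pos > 0) {
                count++;                      // one comparison of key with a[pos - 1]
                if (a[pos - 1] <= key) break; // found a value not larger than key
                a[pos] = a[pos - 1];
                pos--;
            }
            a[pos] = key;
        }
        return count;
    }

    public static void main(String[] args) {
        int n = 1000;
        double[] sorted = new double[n];
        double[] reversed = new double[n];
        for (int i = 0; i < n; i++) {
            sorted[i] = i;
            reversed[i] = n - i;
        }
        System.out.println(countComparisons(reversed)); // 499500 = n(n-1)/2
        System.out.println(countComparisons(sorted));   // 999 = n-1
    }
}
```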

In practice we do not need to do the calculations to this accuracy. We only need the
figures for large values of n, in which case $n^2$ is much bigger than n and we can ignore
the n part when we have an $n^2$ term. Also we can ignore the constant multiplying $n^2$: the worst case
could be done in the same time as the average case if our computer was twice as fast.
We say the worst case and average case insertion sort are of order $n^2$. We write this as
$O(n^2)$ (big-O $n^2$).
The best case insertion sort is O(n).
The selection sort and bubble sort are both $O(n^2)$ in all three cases.
Note $O(n^2) = O(3n^2) = O(n^2/2) = O(n^2 + 2n + 3)$, but $O(n^2) > O(n)$.

Logarithms
Many algorithms have orders that involve logarithms. Remember that $2^n$ is 2 multiplied
by itself n times. Some useful values are:

    n     2^n
    1     2
    2     4
    3     8
    4     16
    8     256
    16    65536
    32    4294967296

The logarithm to the base 2 of n is the power of 2 that gives n. For example $2^3 = 8$ so
$\log_2(8) = 3$. You can do logarithms to any base, but computer scientists usually use base
2.
If a number is between two powers of 2 its logarithm will not be a whole number, for
example $\log_2(10)$ is between 3 and 4, in fact it is 3.32192...
Java has a method double log(double x) in class Math that gives natural logarithms.
These are logarithms to the base $e \approx 2.71828$. You can find $\log_2(x)$ using
$\log_2(x) = \frac{\log_e(x)}{\log_e(2)}$.

    /** LogBaseTwo
     * Find logs to base 2.
     */
    public class LogBaseTwo
    {
        /** Constant used to divide by log(2) */
        final static double FACTOR = 1 / Math.log(2);

        public static void main(String[] args)
        {
            System.out.println("The log to the base two of 8 is " + log2(8));
        }

        /** Find the log to the base 2 of x.
         * uses log2(x) = log(x) / log(2)
         * @param x The number to take the log of.
         * @return the log to the base 2.
         */
        public static double log2(double x)
        {
            return Math.log(x) * FACTOR;
        }
    }

The log to the base two of 8 is 2.9999999999999996


We can now look at a table of some common orders of algorithms in order, from fastest
to slowest, together with what happens if you double n.
    Order          Name               Effect of doubling n
    O(1)           constant time      takes the same time for any n
    O(log(n))      logarithmic time   adds 1
    O(n)           linear time        doubles
    O(n log(n))                       a little more than doubles
    $O(n^2)$       quadratic time     multiplies by 4
    $O(n^3)$       cubic time         multiplies by 8
    $O(2^n)$       exponential time   squares
Typically algorithms that involve a single loop from 1 to n are O(n). Algorithms that
involve a loop to n inside a loop to n are $O(n^2)$.
Remember that the order only tells you about the speed of the algorithm for large n. It
is possible that a constant time algorithm could be slower than an exponential algorithm
for small n.

3.5 Fast sorts


3.5.1 Introduction
We will look at a couple of faster sorting algorithms. Both have average case order
O(n log(n)) and so are considerably faster than our previous algorithms. Both algorithms
are programmed with a powerful technique called recursion that we will look at next.

3.5.2 Recursion
A method is recursive if it calls itself. As an example we will consider a method to calculate
the factorial of an integer, n!. The factorial of n is the product of the numbers up to n, so
$6! = 1 \times 2 \times 3 \times 4 \times 5 \times 6 = 720$.
The factorial satisfies $n! = n \times (n-1)!$, or factorial(n) = n * factorial(n - 1). We can
use this for a recursive calculation.
There is one important feature for any recursive calculation. If a method always calls
itself, then that second call will also call itself, and so will the third. The method will
try to continue calling itself forever and the program will crash. To avoid this a recursive
program should always have a base case where the method can finish without recursion.
This is usually given by an if statement.
In our example if n = 1 or n = 0 we know factorial(n) = 1 and do not need to use
recursion.
    /** Calculates the factorial of an integer.
     * Uses a recursive algorithm.
     * The factorial of n is the product of the positive integers
     * up to n.
     * @param n The value of n must be between 0 and 12.
     * @return The factorial.
     */
    static int factorial(int n)
    {
        if (n <= 1)
            return 1;
        return n * factorial(n - 1);
    }
Any method that uses recursion can be rewritten as a non-recursive function. In some
cases this is easy. It is generally a good idea to avoid recursion if it is easy to do so. We
can rewrite our factorial:
    /** Calculates the factorial of an integer.
     * Uses a non-recursive algorithm.
     * The factorial of n is the product of the positive integers
     * up to n.
     * @param n The value of n must be between 0 and 12.
     * @return The factorial.
     */
    static int factorialN(int n)
    {
        int f = 1;
        for (int i = 2; i <= n; i++)
            f = f * i;
        return f;
    }
We have replaced the recursion by a loop. Factorials are very big numbers.
13!=6227020800 is too big to fit in an int. In practice it does not matter which of these
algorithms we use, they are both fast as we can only use such small n. If we change
factorial to return a long we still only get to n = 20. If we use a double version of the
method it would be quicker and use less memory to use the loop.
For another example the Fibonacci numbers are the sequence
1,1,2,3,5,8,13,21,34,55,89,..., where each number is the sum of the two before. We
can calculate these using a recursive method with base cases 1 and 2.
    /** Calculates the n^th Fibonacci number.
     * Uses a recursive algorithm.
     * fib(1)=1, fib(2)=1
     * fib(n)=fib(n-1)+fib(n-2)
     * @param n The value of n must be between 1 and 46.
     * @return The Fibonacci number.
     */
    static int fib(int n)
    {
        if (n <= 2)
            return 1;
        return fib(n - 1) + fib(n - 2);
    }
We can also use a non-recursive method:
    /** Calculates the n^th Fibonacci number.
     * Uses a non-recursive algorithm.
     * fib(1)=1, fib(2)=1
     * fib(n)=fib(n-1)+fib(n-2)
     * @param n The value of n must be between 1 and 46.
     * @return The Fibonacci number.
     */
    static int fibN(int n)
    {
        if (n <= 2)
            return 1;
        int[] f = new int[n + 1];
        f[1] = 1;
        f[2] = 1;
        for (int i = 3; i <= n; i++)
            f[i] = f[i - 1] + f[i - 2];
        return f[n];
    }
In this case the non-recursive method is much faster, because the recursive version
recalculates the same Fibonacci numbers many times over.
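
To see the size of the difference we can count how many times the recursive method is called. The sketch below is the recursive Fibonacci method with a call counter added; the class FibCalls, the field calls and the method countCalls are hypothetical names for this illustration.

```java
class FibCalls {
    /** Number of calls made to fib since the counter was last reset. */
    static long calls = 0;

    /** Recursive Fibonacci, counting every call. */
    static int fib(int n) {
        calls++;
        if (n <= 2)
            return 1;
        return fib(n - 1) + fib(n - 2);
    }

    /** Returns how many calls are needed to compute fib(n) recursively. */
    static long countCalls(int n) {
        calls = 0;
        fib(n);
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(countCalls(10)); // 109
        System.out.println(countCalls(30)); // 1664079
    }
}
```

The number of calls is 2 * fib(n) - 1, so it grows as fast as the Fibonacci numbers themselves, while the loop in the non-recursive version does only n - 2 additions.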
In some cases it is sensible to use recursion because the data is recursive, for example
the tree of files on a disk. The next program lists the files in a directory and in all its
subdirectories. As we can have directories inside directories inside directories... it is natural
to use recursion.
    import uucPack.InOut;
    import java.io.File;

    /** List a directory recursively
     */
    public class FileTree
    {
        /** Each level is indented 4 spaces. */
        final static String INDENT = "    ";

        public static void main(String[] args)
        {
            InOut.print("Enter file or directory name: ");
            String name = InOut.readString();
            InOut.println("Recursive listing of " + name);
            tree(name);
        }

        /** List a file or directory recursively
         * This non-recursive method sets up the parameters
         * for the recursive method.
         * @param name The name of the file or directory
         */
        static void tree(String name)
        {
            tree(new File(name), "");
        }

        /** This recursive method lists a file (the base case)
         * or directory.
         * @param f A File object representing a file or directory.
         * @param indent A string of spaces to indent the listing.
         */
        private static void tree(File f, String indent)
        {
            InOut.println(indent + f.getName());
            if (f.isDirectory())
            {
                File[] files = f.listFiles();
                if (files == null) return; // listFiles returns null if the directory cannot be read
                for (int i = 0; i < files.length; i++)
                    tree(files[i], indent + INDENT);
            }
        }
    }

Enter file or directory name: T:\semester1\com328c1


Recursive listing of T:\semester1\com328c1
com328c1
Lectures
asd1.pdf
asd2.pdf
Organisation
modhand05.pdf
COM328ModDesc.doc
Labs
IntroductiontoEclipse.doc
Week1.doc
chrisstyle.xml
Week2.doc
work328
Tutorial
Week1Problems.doc
Week2Problems.doc
Exams
com328ex03.pdf
com328ex04.pdf
com328
venn
Venn.java
Expression.java
Diagram.java
Language.java
BadExpressionException.java
Operation.java
TruthTable.java
graphics
GraphicsFrame.java
GraphPaper.java

Note that when a recursive method calls itself the original method has not yet returned.
Each call of the method creates its own set of local variables and parameters, so that when
the program prints out Venn.java there are four different copies of f, indent, files and i.

3.5.3 Merge sort


Merge sort is based on the following observation: Suppose we have an array to sort. We
can split it into 2 halves and sort each half. It is then easy to merge the two sorted halves
into one sorted array: we just keep comparing the bottom element in each half and move
the smaller into a new array.
If we compare this with an $O(n^2)$ sort we see two half size arrays take $2 \times \frac{1}{4} = \frac{1}{2}$ of the
time of one big array. The merge stage is O(n) and so does not matter.
In fact we do not sort the halves with an $O(n^2)$ sort; we split the halves up again and
use another merge sort. We can keep doing this until each array part has only one element
and so is already sorted.
For example we will sort an array of length 7 from descending order to ascending order.
In the diagram each array is either split into the two below it or merged from the two
above. Note the two parts we merge are always in order.

7654321
765 4321
7 65 43 21
7 6 5 4 3 2 1
7 56 34 12
567 1234
1234567

In the code the call mergeSort(array,first,next) sorts the part of the array from
first up to but not including next. The call merge(array,first,split,next) merges the part from
first up to but not including split with the part from split up to but not including
next.
The merged data is copied into the temp array, which is then copied back to the original
array.
    // Sort an array of doubles with a merge sort
    static void mergeSort(double[] array)
    {
        int n = array.length;
        temp = new double[n];
        mergeSort(array, 0, n);
        temp = null;
    }

    static private double[] temp;

    private static void mergeSort(double[] array, int first, int next)
    {
        if (next > first + 1)
        {
            int split = (first + next) / 2;
            mergeSort(array, first, split);
            mergeSort(array, split, next);
            merge(array, first, split, next);
        }
    }

    private static void merge(double[] array,
                              int first, int split, int next)
    {
        int a = first;
        int b = split;
        int c = 0;
        while (a < split && b < next)
        {
            if (array[a] <= array[b])
                temp[c++] = array[a++];
            else
                temp[c++] = array[b++];
        }
        while (a < split)
            temp[c++] = array[a++];
        while (b < next)
            temp[c++] = array[b++];
        for (int i = 0; i < c; i++)
            array[first + i] = temp[i];
    }
Notice that the merge sort needs a second array of the same size as the original array.
The earlier sorts did not need extra space.
How many comparisons do we need for a merge sort? At each level in the diagram
above we need about n comparisons in the merge step. How many levels are there? This
is the number of times we need to halve n to get 1, which is the number of times to double
1 to get n, which is $\log_2(n)$.
So merge sort has order O(n log(n)) and for large n it is considerably faster than our
earlier sorts.

3.5.4 Quicksort
Quicksort uses a different method of splitting an array into two. If you want to sort a pile
of named scripts into alphabetical order one method is to start by splitting into two piles
A-L and M-Z. We can then sort the two piles, possibly by splitting again.

Quicksort works like this. It chooses a value and moves everything less than this to the
start of the array and everything greater to the end. A tricky problem is what value to
use (the pivot). We want a value as near to the middle value of the data as possible (the
median). The simplest choice is to use the first value in the array. If the data is random
this will be as good as any other.
The pivot lies between the two parts and is not put in either, this ensures that the
parts are both smaller than the array we are splitting.
When we have completed one splitting we can recursively apply quicksort to the parts
until we get one element parts.
    static void quicksort(double[] array)
    {
        quicksort(array, 0, array.length - 1);
    }

    private static void quicksort(double[] array, int first, int last)
    {
        if (last > first)
        {
            int split = separate(array, first, last);
            quicksort(array, first, split - 1);
            quicksort(array, split + 1, last);
        }
    }

    private static int separate(double[] array, int first, int last)
    {
        double pivot = array[first];
        int a = first + 1, b = last;
        int split;
        double temp;
        boolean done = false;
        while (!done)
        {
            while ((a < b) && (array[a] <= pivot))
                a++;
            while ((a < b) && (array[b] >= pivot))
                b--;
            if (a < b)
            {
                temp = array[a];
                array[a] = array[b];
                array[b] = temp;
                a++;
                b--;
            }
            else
                done = true;
        }
        if (array[b] < pivot)
            split = b;
        else
            split = b - 1;
        array[first] = array[split];
        array[split] = pivot;
        return split;
    }
This version of quicksort has average order O(n log(n)), and is faster than a mergesort
for most data. Unfortunately it is very bad ($O(n^2)$) for data that is already sorted. This
is due to the choice of pivot. If the data is sorted the whole of the split will be on one side of the
pivot, and we will only reduce the size by one for each recursion. In fact for large n the
method will crash as Java has a limit on the depth of recursion it allows.
We can fix this by selecting a better pivot. We can select the value from the middle of
the array. When we have selected the pivot we swap it with the first element.
An even better choice is to look at the first, last and middle positions in the array, we
select the middle value of these three and swap this to the start. (The median-of-three
pivot).
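
A median-of-three choice could be sketched as below. The class MedianOfThree and the method medianToFront are hypothetical names, and taking the middle position as (first + last) / 2 is an assumption made for this example. The method only moves the chosen pivot to the front, so that a splitting method which uses the first element as pivot can then be applied unchanged.

```java
import java.util.Arrays;

class MedianOfThree {
    /** Swaps the median of array[first], array[middle] and array[last] into array[first]. */
    static void medianToFront(double[] array, int first, int last) {
        int middle = (first + last) / 2; // assumed choice of middle position
        double x = array[first], y = array[middle], z = array[last];
        int m; // index holding the median of the three values
        if ((x <= y && y <= z) || (z <= y && y <= x))
            m = middle;
        else if ((y <= x && x <= z) || (z <= x && x <= y))
            m = first;
        else
            m = last;
        double temp = array[first];
        array[first] = array[m];
        array[m] = temp;
    }

    public static void main(String[] args) {
        double[] a = {1, 3, 2, 4, 5, 7, 6, 8, 9};
        medianToFront(a, 0, a.length - 1);
        // The median of 1, 5 and 9 is 5, which is swapped with the first element.
        System.out.println(Arrays.toString(a));
        // [5.0, 3.0, 2.0, 4.0, 1.0, 7.0, 6.0, 8.0, 9.0]
    }
}
```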
How much extra memory does quicksort require? It requires memory to hold the
variables for each recursion. If we choose our pivot well this is O(log(n)) and is much less
than mergesort.
Both mergesort and quicksort are too complicated to be quick for small arrays. We can
speed them up by swapping to an insertion sort for small n.

Examples
Sort 4,1,6,8,2,7,9,5,3 using the first element as pivot.
We start with a pivot of 4. The object is to get everything less than 4 to the start of
the array and everything greater than 4 to the end. (Values of exactly 4 can go to either
side.) Searching the rest of the array from both ends we need to swap 6 with 3 and 8 with
2. We have now split the remainder into 1,3,2 and 8,7,9,5,6. We then swap the pivot with
the last element in the first block, giving 2,1,3, the pivot 4, then 8,7,9,5,6. The pivot is now
in the correct place in the array. We now sort the two blocks separately using quicksort
again. We start with pivots of 2 and 8.
In figure 3.1 each row shows the array at one stage of the sort. The current pivots are shown
in parentheses and old pivots in square brackets.

    (4)  1   6   8   2   7   9   5   3
    (2)  1   3  [4] (8)  7   9   5   6
     1  [2]  3  [4] (5)  7   6  [8]  9
     1  [2]  3  [4] [5] (7)  6  [8]  9
     1   2   3   4   5   6   7   8   9

Figure 3.1: Quicksort example

Sort 1,3,2,4,5,7,6,8,9 using median of 3 pivots.

Note that if we took the first element as pivot we would only reduce the size of the array
by 1. The median of 1,5,9 is 5. Swap this to the front and use it as pivot. No swaps take
place; we swap the pivot back with 1 and we have to sort 3,2,4,1 and 7,6,8,9 separately.
We do this using 2 quicksorts with pivots 2 and 7.

3.6 Searches
3.6.1 A linear search
The most obvious method to find a value in an array is the linear search. This tries each
element in turn.
    /**
     * Finds the index of the first value x in array a.
     * @param a The array to search.
     * @param x The value to find.
     * @return The first index of the required value. -1 if not found.
     */
    static int findElement(int[] a, int x)
    {
        for (int i = 0; i < a.length; i++)
            if (a[i] == x) return i;
        return -1;
    }
This algorithm is clearly O(n) in the average and the worst case.

3.6.2 Binary search


If the array is sorted there is a much faster algorithm we can use.
The algorithm is called binary search. We are trying to find a value in a sorted array.
Look at the element in the middle of the array. If our value is past this we need only look
in the second half of the array. If it is before the middle we only look at the first half. We
then repeat this with the middle of our half array to cut it to a quarter, we keep halving
the region to look until we have a single element.
We could write this as a recursive method, but it is easy to do as a loop.
    /** Search for a value in a sorted array of ints
     * @param array The array to search.
     * @param value The value to find.
     * @return an index or -1 if not found
     */
    static int binarySearch(int[] array, int value)
    {
        int bottom = 0;
        int top = array.length - 1;
        int middle;
        // Continue while the region still holds at least one element,
        // so that a final region of a single element is also tested.
        while (bottom <= top)
        {
            middle = (top + bottom) / 2;
            if (array[middle] == value) return middle;
            if (array[middle] > value)
                top = middle - 1;
            else
                bottom = middle + 1;
        }
        return -1;
    }
The variables bottom and top hold the limits of the piece of the array we are searching.
The table below shows a search for 7 in an array 1,2,3,3,4,5,7,8,9,9.
Only the values between bottom and top are shown at each stage. The last column gives
the middle value tested.

    index   0 1 2 3 4 5 6 7 8 9   middle
            1 2 3 3 4 5 7 8 9 9   4 (index 4)
                      5 7 8 9 9   8 (index 7)
                      5 7         5 (index 5)
                        7         7 (index 6), found

As the number of steps is the number of times to halve n to get 1 we see the algorithm
has average and worst case order O(log(n)).
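
For comparison, the recursive form mentioned above could be sketched as follows. The class name RecursiveBinarySearch and the method name search are hypothetical; the base case is an empty region, where bottom has passed top.

```java
class RecursiveBinarySearch {
    /** Searches a sorted array for value; returns an index or -1 if not found. */
    static int search(int[] array, int value) {
        return search(array, value, 0, array.length - 1);
    }

    /** Searches the region from bottom to top inclusive. */
    private static int search(int[] array, int value, int bottom, int top) {
        if (bottom > top)
            return -1; // base case: empty region
        int middle = bottom + (top - bottom) / 2;
        if (array[middle] == value)
            return middle;
        if (array[middle] > value)
            return search(array, value, bottom, middle - 1); // look in the lower half
        return search(array, value, middle + 1, top);        // look in the upper half
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 3, 4, 5, 7, 8, 9, 9};
        System.out.println(search(a, 7)); // 6
        System.out.println(search(a, 6)); // -1
    }
}
```

Each call halves the region, so the depth of recursion is about $\log_2(n)$ and there is no risk of hitting the recursion limit for any array that fits in memory.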
We will see later we can do even better searches if we use a more complicated data
structure than a sorted array.

Note that it is not worth sorting the array to do one search. This would take much
longer than a linear search. This method is useful if we need to do many searches of the
same array.
