
UNIT1: FUNDAMENTALS OF ALGORITHM

Structure

• 1.0 Objectives

• 1.1 Introduction to algorithm

• 1.2 Properties of algorithm

• 1.3 Algorithmic Notations

• 1.4 Design and development of an algorithm

• 1.5 Some simple examples

• 1.6 Summary

• 1.7 Keywords

• 1.8 Answers to check your progress

• 1.9 Unit-end exercises and answers

1.0 OBJECTIVES

At the end of this unit you will be able to

• Fundamentals of algorithms along with notation.
• The various properties of an algorithm.
• How to write an algorithm or pseudo code for any problem.
• Algorithms for a variety of problems.
• 1.1 Introduction to algorithm:

An algorithm, named for the ninth century Persian mathematician al-Khowarizmi, is simply a set of rules used to perform some calculations, either by hand or, more usually, on a machine. Even the ancient Greeks used an algorithm, popularly known as Euclid's algorithm, for calculating the greatest common divisor (gcd) of two numbers.

An algorithm is a tool for solving a given problem. Before writing a program for solving the given problem, a good programmer first designs and writes the concerned algorithm, analyses it, refines it as many times as required, and arrives at the final 'efficient' form that works well for all valid input data and solves the problem in the shortest possible time, utilizing minimum memory space.

Definition of algorithm: An algorithm is defined as a collection of unambiguous instructions occurring in some specific sequence; such an algorithm should produce output for a given set of inputs in a finite amount of time.

The basic requirement is that the statement of the problem is to be made very clear, because certain concepts may be clear to one person and not to another. For example, calculating the roots of a quadratic equation may be clear to people who know mathematics, but unclear to someone who does not.

A good algorithm is like a sharp knife: it does exactly what it is supposed to do with a minimum amount of applied effort. Using the wrong algorithm to solve a problem is like trying to cut a steak with a screwdriver. You obtain a result, but you will have spent more effort than necessary.

Any algorithm should consist of the following:

• 1. Input : The range of inputs for which an algorithm works perfectly.

• 2. Output : The algorithm should always produce correct results and it should halt.

• 3. A finite sequence of instructions that transforms the given input to the desired output (Algorithm + Programming language).

Usually, an algorithm will be written in simple English-like statements along with simple mathematical expressions. The definition of an algorithm can be illustrated using figure 1.1.

Problem → Algorithm → (Input → Computer → Output)

Fig 1.1 : Notion of the Algorithm

Any systematic method for calculating a result can be considered an algorithm. For example, the methods that we learn in school for adding, multiplying and dividing numbers can be considered algorithms. By following the steps specified we can arrive at the result without even thinking. Even a cooking recipe can be considered an algorithm if the steps:

• 1. Describe precisely how to make a certain dish.

• 2. Describe the exact quantities to be used.

• 3. Give detailed instructions: what items are to be added next, at what time, and how long to cook.

• 1.2 Properties of algorithms:

Each and every algorithm has to satisfy some properties. The various properties or characteristics of an algorithm are:

• 1. Precise and unambiguous (Definiteness) : An algorithm must be simple, precise and unambiguous, i.e. there should not be any ambiguity (doubt) in the instructions or statements specified to solve a problem. Each and every instruction used in the algorithm must be clear and unambiguous.

• 2. Range of inputs : The range of inputs for which the algorithm produces the desired result should be specified.

• 3. Maintain order : The instructions in each and every step of an algorithm are in a specified order, i.e. they will be executed in sequence (one after the other). The instructions cannot be written in random order.

• 4. Finite and correct : They must solve the problem in certain finite number of steps and produce the appropriate result. The range of input for which the algorithm works perfectly should be specified.

• 5. Termination : Each algorithm should terminate.

• 6. Several algorithms may exist for solving a given problem, and the execution speed of each algorithm may be different (for example, to sort, various algorithms such as bubble sort and insertion sort can be used).

• 7. An algorithm can be represented in several different ways.

• 8. Algorithms for a given problem can be based on very different ideas (for example, to sort, several methods exist such as bubble sort, insertion sort, radix sort etc.) and may have different execution speeds.

• 1.3 Algorithmic Notations

The following notations are used usually while writing any algorithm.

• 1. Write the word Algorithm followed by the main objective of the algorithm. For example,

Algorithm Area_of_circle

• 2. Then a brief description of what is achieved using the algorithm, along with its inputs, has to be provided. For example, Description : "The algorithm computes the area of a circle using the input value radius."

• 3. Each instruction should be in a separate step and the step number has to be provided. What is accomplished in each step has to be described briefly and enclosed within square brackets (which we call a comment). For example, to find the area of a circle, we can write: Step 2 : [Find the area of circle] Area ← 3.142 * radius * radius.

• 4. After all operations are over, the algorithm has to be terminated, which indicates its logical end. For example, the last step in the algorithm will be: Step 4 : [Finished] Exit.
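Putting the four conventions together, a complete algorithm assembled from the fragments above might read as follows (Step 1 and Step 3 are filled in for illustration; only the name, description, Step 2 and Step 4 appear in the text):

Algorithm Area_of_circle
Description : This algorithm computes the area of a circle using the input value radius.
Step 1 : [Input the radius] read radius
Step 2 : [Find the area of circle] Area ← 3.142 * radius * radius
Step 3 : [Output the area] write Area
Step 4 : [Finished] Exit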

• 1.4 Design and development of an algorithm

The fundamental steps in solving any given problem, which lead to the complete development of an algorithm, are as follows:

• 1. Statement of the problem.

• 2. Development of a mathematical model

• 3. Designing of the algorithm

• 4. Implementation

• 5. Analysis of the algorithm for its time and space complexity

• 6. Program testing and debugging

• 7. Documentation.

• 1. Statement of the problem.

Before we attempt to solve a given problem, we must understand precisely the statement of the problem. There are several ways to do this. We can list all the software specification requirements, ask several questions and get the answers. This helps us understand the problem more clearly and remove any ambiguity.

• 2. Development of a mathematical model

Having understood the problem, the next step is to look for a mathematical model which best suits the given problem. This is a very important step in the overall solution process and it should be given considerable thought. In fact, the choice of the model has a long way to go in the development process. We must think of: which mathematical model best suits the given problem? Is there a model which has already been used to solve a problem that resembles the current one?

• 3. Designing of the algorithm

As we are comfortable with the specification and the model of the problem at this stage, we can move on to writing down an algorithm.

• 4. Implementation

In this step appropriate data structures are to be selected and coded in a target language. The selection of a target language is a very important sub-step to reduce the complexities involved in coding.

• 5. Analysis of the algorithm for its time and space complexity

We will use, in this section, a number of terms like complexity, analysis, efficiency, etc. All of these terms refer to the performance of a program. Our job cannot stop once we write the algorithm and code it in, say, C or C++ or Java. We should worry about the space and timing requirements too. Why? There are several reasons for this, and so we shall start with time complexity. In simple terms, the time complexity of a program is the amount of computer time it needs to run.

The space complexity of a program is the amount of memory needed to run a program.

• 6. Program testing and debugging

After implementing the algorithm in a specific language, the next step is to execute it. After executing the program, the desired output should be obtained. Testing is nothing but the verification of the program for its correctness, i.e. whether the output of the program is correct or not. Using different input values, one can check whether the desired output is obtained. Any logical error can be identified by program testing. Usually debugging is part of testing. Many debugging tools exist by which one can test the program for its correctness.

• 7. Documentation

Note that documentation is not the last step. The documentation should exist from understanding the problem till the program is tested and debugged. During the design and implementation phases the documentation is very useful. To understand the design or code, proper comments should be given. As far as possible a program should be self-documented, so usage of proper variable names and data structures plays a very important role in documentation. It is very difficult to read and understand another person's logic and code; documentation enables individuals to understand programs written by other people.

• 1.5 Some simple examples

• 1. Algorithm to find GCD of two numbers (Euclid's algorithm).

ALGORITHM : gcd(m,n)
//Purpose : To find the GCD of two numbers
//Description : This algorithm computes the GCD of two non-negative, non-zero values accepted as parameters.
//Input : Two non-negative and non-zero values m and n
//Output : GCD of m and n
Step 1 : If n = 0, return m and stop.
Step 2 : Divide m by n and assign the remainder to r.
Step 3 : Assign the value of n to m and the value of r to n.
Step 4 : Go to Step 1.

• 2. Algorithm to find GCD of two numbers (Consecutive integer checking method).

ALGORITHM : gcd(m,n)
//Purpose : To find the GCD of two numbers
//Description : This algorithm computes the GCD of two non-negative, non-zero values accepted as parameters.
//Input : Two non-negative and non-zero values m and n
//Output : GCD of m and n

Step 1 : [Find the minimum of m and n] r ← min(m, n)
Step 2 : [Find the GCD using consecutive integer checking]
while (1)
    if (m mod r = 0 and n mod r = 0) break
    r ← r − 1
end while
Step 3 : return r

• 3. Algorithm to find GCD of two numbers (Repetitive subtraction method).

ALGORITHM : gcd(m,n)
//Purpose : To find the GCD of two numbers
//Description : This algorithm computes the GCD of two non-negative, non-zero values accepted as parameters.
//Input : Two non-negative and non-zero values m and n
//Output : GCD of m and n
Step 1 : [If one of the two numbers is zero, return the other number as the GCD]
if (m = 0) return n
if (n = 0) return m
Step 2 : [Repeat while m and n are different]
while (m ≠ n)
    if (m > n) m ← m − n
    else n ← n − m
    end if
end while
Step 3 : [Finished : return the GCD as the output] return m

Note: Same problem can be solved in many ways(example Algorithm 1,2 and 3).

4. Algorithm to generate prime numbers using the Sieve of Eratosthenes method. (Pseudo code)

ALGORITHM SIEVE_PRIME(n)
//Purpose : To generate prime numbers between 2 and n
//Description : This algorithm generates prime numbers using the sieve method
//Input : A positive integer n >= 2
//Output : Prime numbers <= n

Step 1 : [Generate the list of integers from 2 to n]
for p ← 2 to n do
    a[p] ← p
end for
Step 2 : [Eliminate the multiples of p between 2 and n]
for p ← 2 to √n do
    if (a[p] ≠ 0)
        i ← p * p
        while (i <= n)
            a[i] ← 0
            i ← i + p
        end while
    end if
end for
Step 3 : [Obtain the prime numbers by copying the non-zero elements]
m ← 0
for p ← 2 to n do
    if (a[p] ≠ 0)
        b[m] ← a[p]; m ← m + 1
    end if
end for
Step 4 : [Output the prime numbers between 2 and n]
for i ← 0 to m − 1
    write b[i]
end for
Step 5 : [Finished] Exit

• 5. Algorithm to find the number of digits in the binary representation of a given decimal integer.

Algorithm : Binary(n)
//Purpose : To count the number of digits in the binary representation of a given decimal integer.
//Input : n : a positive decimal integer.
//Output : Number of digits in the binary representation of the given positive decimal integer.

Count ← 1
while (n > 1)
    Count ← Count + 1
    n ← ⌊n / 2⌋
end while
return Count

Check your progress:

1. What is an algorithm? Explain the notion of an algorithm.
2. What are the various properties of an algorithm?
3. Explain the procedure for generating prime numbers using the "Sieve of Eratosthenes" method and write the algorithm for the same.
4. Explain the steps involved in the design and development of an algorithm.

1.6 SUMMARY

• Algorithm : An algorithm is a sequence of non-ambiguous instructions for solving a problem in a finite amount of time. An input to an algorithm specifies an instance of the problem the algorithm solves.
• Algorithms can be specified in a natural language or a pseudo code; they can also be implemented as computer programs. A good algorithm is usually a result of repeated efforts and rework.
• The same problem can often be solved by several algorithms. For example, three algorithms were given for computing the greatest common divisor of two integers: Euclid's algorithm, the consecutive integer checking algorithm, and repetitive subtraction.

1.7 KEYWORDS

Algorithm : It is a sequence of unambiguous instructions to solve a problem in a finite amount of time.
Time complexity : It is the time required to execute a program.
Space complexity : It is the amount of memory needed to run a program.

1.8 ANSWERS TO CHECK YOUR PROGRESS

• 1. 1.1
• 2. 1.2
• 3. 1.5 (4th algorithm)
• 4. 1.4

1.9 UNIT-END EXERCISES AND ANSWERS

• 1. Find gcd(31415, 14142) by applying Euclid's algorithm.

• 2. What does Euclid’s algorithm do for a pair of numbers in which the first number is smaller than the second one? What is the largest number of times this can happen during the algorithm’s execution on such an input?

• 3. Write an algorithm to find the gcd of two numbers by using repetitive subtraction method. Find gcd(36,171) by using repetitive subtraction

• 4. Write an algorithm to find the number of digits in a binary representation of a given decimal integer. Trace it for the input 255.

Answers:

• 1. 1.5 (1st algorithm)
• 2. 1.5 (1st algorithm) [Hint : find gcd(12,24)]
• 3. 1.5 (3rd algorithm)
• 4. 1.5 (5th algorithm)

References:

• 1. Anany Levitin, "Introduction to the Design & Analysis of Algorithms".
• 2. Aho, Alfred V., "The Design and Analysis of Computer Algorithms".
• 3. A. M. Padma Reddy, "Analysis and Design of Algorithms".

MODULE-1, UNIT 2: ANALYSIS OF ALGORITHM EFFICIENCY

Structure

• 1.0 Objectives

• 1.1 Introduction

• 1.2 Space complexity

• 1.3 Time complexity

• 1.4 Asymptotic notations

• 1.5 Practical complexities

• 1.6 Performance measurement of simple algorithms

• 1.7 Unit-end exercises and answers

1.0 OBJECTIVES

At the end of this unit you will be able to understand:

• Efficiency of an algorithm.
• Space complexity.
• Time complexity.
• Performance measurement.
• Need for time complexity.
• Worst-case, best-case and average-case efficiencies.
• Asymptotic notations: Big-Oh (O), Big-Omega (Ω), Big-Theta (Θ).
• Practical complexities.
• Analysis of iterative algorithms.
• Analysis of recursive algorithms.

1.1 INTRODUCTION

Two important ways to characterize the effectiveness of an algorithm are its space complexity and time complexity. Time complexity of an algorithm concerns determining an expression of the number of steps needed as a function of the problem size. Since the step count measure is somewhat coarse, one does not aim at obtaining an exact step count. Instead, one attempts only to get asymptotic bounds on the step count. Asymptotic analysis makes use of the O (Big Oh) notation. Two other notational constructs used by computer scientists in the analysis of algorithms are Θ (Big Theta) notation and Ω (Big Omega) notation.

1.2 Space complexity

The space complexity of a program is the amount of memory that may be required to run the program.

• 1. The primary memory of a computer is an important resource for the proper execution of a program. Without sufficient memory, either the program works slowly or may not work at all. Therefore, the exact memory requirement of a program is to be known in advance.

• 2. When we design a program we must see that memory requirement is kept to the minimum so that even computers with less memory can execute the program.

• 3. Now-a-days the operating systems take care of the efficient usage of memory based upon the virtual memory concept or dynamic linking and loading.

1.2.1 Analysis of space complexity

The following components are important in calculating the space requirements:

• Instruction space : This is the space required to store the machine code generated by the compiler. Generally the object code will be placed in the code segment.
• Data space : The space needed for constants, static variables, intermediate variables, dynamic variables etc. This is nothing but the data segment space.
• Stack space : To store return addresses, return values, etc. To store these details, a stack segment will be used.

1.2.2 How to calculate space complexity?

Before we proceed to any specific example, we must understand the importance of the size of the input, that is, n. Generally every problem will be associated with n. It may refer to:

• Number of cities – in the travelling salesman problem.
• Number of elements – in the sorting and searching problems.
• Number of cities – in the map colouring problem.
• Number of objects – in the knapsack problem.

When a problem is independent of n, then the data space occupied by the algorithm/program may be considered as zero. Let us start with a few simple problems which are of iterative type.

EXAMPLE 1: Finding the average of three numbers.

#include <stdio.h>

int main(void)
{
    int a, b, c, avg;
    scanf("%d %d %d", &a, &b, &c);
    avg = (a + b + c) / 3;
    printf("average is=%d", avg);
    return 0;
}

Program to illustrate the space complexity.

As a, b, c and avg are all integer variables, the space occupied by them is

=4*sizeof(int)

=4*2bytes

=8bytes

Space occupied by the constant 3 is = 1*2 bytes. Hence, total space is = 8 + 2 = 10 bytes.

• 1.3 Time complexity

It is the amount of time a program or algorithm takes for execution, that is, how fast an algorithm runs. Note that the time taken by a program for compilation is not included in the calculation. Normally researchers give more attention to time efficiency than to space efficiency, because memory problems are easier to handle than time problems.

• 1.4 ASYMPTOTIC NOTATIONS

The performance evaluation of an algorithm is obtained by totaling the number of occurrences of each operation when running the algorithm. The performance of an algorithm is evaluated as a function of the input size n and is to be considered modulo a multiplicative constant.

The following notations are commonly used in performance analysis to characterize the complexity of an algorithm.

Θ-Notation (Same order)

This notation bounds a function to within constant factors. We say f(n) = Θ(g(n)) if there exist positive constants n₀, c₁ and c₂ such that to the right of n₀ the value of f(n) always lies between c₁·g(n) and c₂·g(n) inclusive.

In set notation, we write as follows:

Θ(g(n)) = { f(n) : there exist positive constants c₁, c₂, and n₀ such that 0 ≤ c₁·g(n) ≤ f(n) ≤ c₂·g(n) for all n ≥ n₀ }

We say that g(n) is an asymptotically tight bound for f(n).

Graphically, for all values of n to the right of n₀, the value of f(n) lies at or above c₁·g(n) and at or below c₂·g(n). In other words, for all n ≥ n₀, the function f(n) is equal to g(n) to within a constant factor.

In set terminology, f(n) is said to be a member of the set Θ(g(n)) of functions. Because Θ(g(n)) is a set, we could write

f(n) ∈ Θ(g(n))

to indicate that f(n) is a member of Θ(g(n)). Instead, we write

f(n) = Θ(g(n))

to express the same notion.

Historically, this notation is written "f(n) = Θ(g(n))", although the idea that f(n) is equal to something called Θ(g(n)) is misleading.

Example: n²/2 − 2n = Θ(n²), with c₁ = 1/4, c₂ = 1/2, and n₀ = 8.

Ο-Notation (Upper Bound)

This notation gives an upper bound for a function to within a constant factor. We write f(n) = O(g(n)) if there are positive constants n₀ and c such that to the right of n₀, the value of f(n) always lies on or below c·g(n).

In set notation, we write as follows: for a given function g(n), the set of functions

Ο(g(n)) = { f(n) : there exist positive constants c and n₀ such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n₀ }

We say that the function g(n) is an asymptotic upper bound for the function f(n). We use Ο-notation to give an upper bound on a function, to within a constant factor.

Graphically, for all values of n to the right of n₀, the value of the function f(n) is on or below c·g(n). We write f(n) = O(g(n)) to indicate that a function f(n) is a member of the set Ο(g(n)), i.e.

f(n) ∈ Ο(g(n))

Note that f(n) = Θ(g(n)) implies f(n) = Ο(g(n)), since Θ-notation is a stronger notation than Ο-notation.

Example: 2n² = Ο(n³), with c = 1 and n₀ = 2.

Equivalently, we may also define f is of order g as follows:

If f(n) and g(n) are functions defined on the positive integers, then f(n) is Ο(g(n)) if and only if there is a c > 0 and an n₀ > 0 such that

| f(n) | ≤ c | g(n) | for all n ≥ n₀

Historical Note: The notation was introduced in 1892 by the German mathematician Paul Bachmann.

Ω-Notation (Lower Bound)

This notation gives a lower bound for a function to within a constant factor. We write f(n) = Ω(g(n)) if there are positive constants n₀ and c such that to the right of n₀, the value of f(n) always lies on or above c·g(n).

In the set notation, we write as follows: For a given function g(n), the set of functions

Ω(g(n)) = { f(n) : there exist positive constants c and n₀ such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n₀ }

We say that the function g(n) is an asymptotic lower bound for the function f(n).


Example: √n = Ω(lg n), with c = 1 and n₀ = 16.

• 1.4.1 Algorithm Analysis

The complexity of an algorithm is a function g(n) that gives an upper bound on the number of operations (or running time) performed by the algorithm when the input size is n.

There are two interpretations of upper bound.

Worst-case Complexity : The running time for any input of a given size will be lower than the upper bound, except possibly for some inputs where the maximum is reached.

Average-case Complexity : The running time for an input of a given size is the average number of operations over all problem instances of that size.

Because it is quite difficult to estimate the statistical behavior of the input, most of the time we content ourselves with the worst-case behavior. Most of the time, the complexity g(n) is approximated by its family O(f(n)), where f(n) is one of the following functions: n (linear complexity), log n (logarithmic complexity), nᵃ where a ≥ 2 (polynomial complexity), aⁿ (exponential complexity).

1.4.2 Optimality

Once the complexity of an algorithm has been estimated, the question arises whether this algorithm is optimal. An algorithm for a given problem is optimal if its complexity reaches the lower bound over all the algorithms solving this problem. For example, any algorithm solving the "intersection of n segments" problem will execute at least n² operations in the worst case, even if it does nothing but print the output. This is abbreviated by saying that the problem has Ω(n²) complexity. If one finds an O(n²) algorithm that solves this problem, it will be optimal and of complexity Θ(n²).

1.5 Practical Complexities

Computational complexity theory is a branch of the theory of computation in theoretical computer science and mathematics that focuses on classifying computational problems according to their inherent difficulty. In this context, a computational problem is understood to be a task that is in principle amenable to being solved by a computer (which basically means that the problem can be stated by a set of mathematical instructions). Informally, a computational problem consists of problem instances and solutions to these problem instances. For example, primality testing is the problem of determining whether a given number is prime or not. The instances of this problem are natural numbers, and the solution to an instance is yes or no based on whether the number is prime or not.

A problem is regarded as inherently difficult if its solution requires significant resources, whatever the algorithm used. The theory formalizes this intuition, by introducing mathematical models of computation to study these problems and quantifying the amount of resources needed to solve them, such as time and storage. Other complexity measures are also used, such as the amount of communication (used in communication complexity), the number of gates in a circuit (used in circuit complexity) and the number of processors (used in parallel computing). One of the roles of computational complexity theory is to determine the practical limits on what computers can and cannot do.

Closely related fields in theoretical computer science are analysis of algorithms and computability theory. A key distinction between analysis of algorithms and computational complexity theory is that the former is devoted to analyzing the amount of resources needed by a particular algorithm to solve a problem, whereas the latter asks a more general question about all possible algorithms that could be used to solve the same problem. More precisely, it tries to classify problems that can or cannot be solved with appropriately restricted resources. In turn, imposing restrictions on the available resources is what distinguishes computational complexity from computability theory: the latter theory asks what kind of problems can, in principle, be solved algorithmically.

• 1.5.1 Function problems

A function problem is a computational problem where a single output (of a total function) is expected for every input, but the output is more complex than that of a decision problem, that is, it isn't just yes or no. Notable examples include the traveling salesman problem and the integer factorization problem.

It is tempting to think that the notion of function problems is much richer than the notion of decision problems. However, this is not really the case, since function problems can be recast as decision problems. For example, the multiplication of two integers can be expressed as the set of triples (a, b, c) such that the relation a × b = c holds. Deciding whether a given triple is member of this set corresponds to solving the problem of multiplying two numbers.

• 1.5.2 Measuring the size of an instance

To measure the difficulty of solving a computational problem, one may wish to see how much time the best algorithm requires to solve the problem. However, the running time may, in general, depend on the instance. In particular, larger instances will require more time to solve. Thus the time required to solve a problem (or the space required, or any measure of complexity) is calculated as function of the size of the instance. This is usually taken to be the size of the input in bits. Complexity theory is interested in how algorithms scale with an increase in the input size. For instance, in the problem of finding whether a graph is connected, how much more time does it take to solve a problem for a graph with 2n vertices compared to the time taken for a graph with n vertices?

If the input size is n, the time taken can be expressed as a function of n. Since the time taken on different inputs of the same size can be different, the worst-case time complexity T(n) is defined to be the maximum time taken over all inputs of size n. If T(n) is a polynomial in n, then the algorithm is said to be a polynomial time algorithm. Cobham's thesis says that a problem can be solved with a feasible amount of resources if it admits a polynomial time algorithm.

• 1.6 Performance measurement of simple algorithms

1. Find the time complexity of the following algorithm.

a) Algorithm : simple

for (i = 0; i <= n*n; i++)
    for (j = 0; j < i; j++)
        sum++;

Solution:

for (i = 0; i <= n*n; i++)   — executed n² times
    for (j = 0; j < i; j++)  — executed at most n² times
        sum++;               — O(1)

Running time: O(n⁴)

2. Algorithm for matrix multiplication

Algorithm matmul(a[0..n−1, 0..n−1], b[0..n−1, 0..n−1])
//Input : two n-by-n matrices a and b
//Output : matrix c = ab

for i ← 0 to n−1 do
    for j ← 0 to n−1 do
        c[i, j] ← 0
        for k ← 0 to n−1 do
            c[i, j] ← c[i, j] + a[i, k] * b[k, j]
return c

The time complexity of this algorithm is given by

M(n) = Σ_{i=0}^{n−1} Σ_{j=0}^{n−1} Σ_{k=0}^{n−1} 1 = n³

4. Algorithm for element uniqueness

Algorithm : uniqueelement(a[], n)
//Input : n – the number of elements and a – an array consisting of n elements
//Output : 1 if all elements are distinct, 0 otherwise

for i ← 0 to n−2 do
    for j ← i+1 to n−1 do
        if (a[i] = a[j])
            return 0
        end if
return 1

Worst-case efficiency:

T(n) = Σ_{i=0}^{n−2} Σ_{j=i+1}^{n−1} 1 = n(n−1)/2 ≤ n², so T(n) ∈ O(n²)

Best-case efficiency: if a[0] = a[1], then the basic operation will be executed only once. Therefore T(n) ∈ Ω(1).

Note: The same problem-independent rule helps when finding the time efficiency of non-recursive algorithms: in each summation, the number of terms = upper bound − lower bound + 1.

• 1. Explain the concept space complexity

• 2. What is meant by a time complexity? Why it is required?

• 3. Write a note on asymptotic notations.

• 4. Find the time complexity of matrix multiplication algorithm.

SUMMARY:

Space complexity : The space complexity of a program is the amount of memory that may be required to run the program.

Time complexity : It is the time required to execute a program.

Asymptotic notations : representation of time complexity in any of the notations Big-Oh, Big-Omega, Big-Theta.

1.5 KEYWORDS

Basic operation – the operation that is executed the greatest number of times in the program (the logic part). It is usually present in the innermost loop of the algorithm/program.

• 1. 1.2

• 2. 1.3

• 3. 1.4

• 4. 1.6

• 1.7 UNIT-END EXERCISES AND ANSWERS

• 1. Find the time complexity of the algorithm for the transpose of a matrix.

• 2. Write a note on best case, average case and worst case in a program, with an example.

• 1. 1.4

• 2. 1.3

References:

• 1. Anany Levitin, "Introduction to the Design and Analysis of Algorithms".
• 2. Prof. Nandagopalan, "Analysis and Design of Algorithms with C/C++", 3rd edition.
• 3. A. M. Padma Reddy, "Analysis and Design of Algorithms".
• 4. Even, Shimon, "Graph Algorithms", Computer Science Press.

MODULE-1, UNIT 3: ALGORITHM ANALYSIS AND SOLVING RECURRENCES

Structure

• 1.0 Objectives

• 1.1 Analyzing control structures

• 1.2 Using a barometer

• 1.3 Average case analysis

• 1.4 Amortized analysis

• 1.5 Solving recurrences

• 1.6 Summary

• 1.7 Key words

• 1.9 Unit-end exercises and answers

1.0 OBJECTIVES

At the end of this unit you will be able to:

• Solve container loading and knapsack problems.
• Find minimum spanning trees using Prim's and Kruskal's algorithms.
• Identify the difference between a graph, a tree and a minimum spanning tree.

1.1 INTRODUCTION

An essential tool for designing an efficient and suitable algorithm is the "analysis of algorithms". There is no magic formula; it is simply a matter of judgment, intuition and experience. Nevertheless, there are some basic techniques that are often useful, such as knowing how to deal with control structures and recursive equations.

1.2 Analyzing control structures

Control Structures Analysis: The analysis of algorithms usually proceeds from the inside
out. Determine first the time required by individual instructions, then combine
these times according to the control structures that combine the instructions in the
program.

• 1.2.1. Sequencing:

Let P1 and P2 be two fragments of an algorithm. They may be single instructions or
complicated sub-algorithms. Let t1 and t2 be the times taken by P1 and P2; t1 and t2
may depend on various parameters, such as the instance size. The sequencing rule says
that the time required to compute "P1; P2" is simply t1 + t2. By the maximum rule, this
time is in θ(max(t1, t2)). Despite its simplicity, applying this rule is sometimes less
obvious than it may appear: it could happen that one of the parameters that control t2
depends on the result of the computation performed by P1.

• 1.2.2. “For” Loops:

They are the easiest loops to analyze. Consider

    for i ← 1 to m do P(i)

By a convention we'll adopt, m = 0 means P(i) is not executed at all (not an error). The
time taken by P(i) could depend on i and on the instance size; of course, the easiest
case is when it doesn't. Let t be the time required to compute P(i); then the total time
required is l = mt. Usually this approach is adequate, but there is a potential
pitfall: we didn't consider the time for the "loop control". After all, our for loop is
shorthand for something like the following while loop.

    i ← 1
    while i ≤ m do
        P(i)
        i ← i + 1

In most situations it is reasonable to count the test i ≤ m at unit cost, and the same for the instruction i ← i + 1 and the sequencing operations ("go to") implicit in the while loop. Let c be an upper bound on the time required by each of these operations. Then

    l ≤ c                 for i ← 1
      + (m+1)c            tests i ≤ m
      + mt                executions of P(i)
      + mc                executions of i ← i + 1
      + mc                sequencing operations

    l ≤ (t + 3c)m + 2c

This time is clearly bounded below by mt. If c is negligible compared to t, our previous
estimate that l is roughly equal to mt is justified. The analysis of for loops is more
interesting when the time t(i) required for P(i) varies as a function of i and possibly
also of the instance size n. In that case,

    for i ← 1 to m do P(i)

takes a time given by Σ (i = 1 to m) t(i), ignoring the time taken by the loop control.

Example : Algorithm for matrix multiplication

Algorithm matmul(a[0..n-1, 0..n-1], b[0..n-1, 0..n-1])
//Input : two n-by-n matrices a and b
//Output : matrix c = ab
for i ← 0 to n-1 do
    for j ← 0 to n-1 do
        c[i,j] ← 0
        for k ← 0 to n-1 do
            c[i,j] ← c[i,j] + a[i,k] * b[k,j]
return c

The time complexity of this algorithm is given by

    M(n) = Σ (i = 0 to n-1) Σ (j = 0 to n-1) Σ (k = 0 to n-1) 1 ∈ θ(n³)
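The θ(n³) count can be checked by translating the pseudocode directly; the sketch below is in Python rather than the book's pseudocode, with an added counter (our own addition) for the basic operation in the innermost loop.

```python
def matmul(a, b):
    """Multiply two n-by-n matrices using the three nested loops above.

    Returns the product and the number of times the basic operation
    (the multiply-accumulate in the innermost loop) was executed.
    """
    n = len(a)
    c = [[0] * n for _ in range(n)]
    ops = 0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
                ops += 1          # basic operation runs exactly n^3 times
    return c, ops

c, ops = matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

For n = 2 the basic operation executes 2³ = 8 times, in agreement with M(n) = n³.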

• 1.2.3 Recursive Calls:

The analysis of recursive algorithms is straightforward up to a certain point. Simple
inspection of the algorithm often gives rise to a recurrence equation that "mimics" the
flow of control in the algorithm. General techniques on how to solve such equations, or
how to transform them into simpler non-recursive equations, will be seen later.

• 1.2.4 “While” and “Repeat” loops:

These two types of loops are usually harder to analyze than for loops because there is
no obvious a priori way to know how many times we shall have to go around the loop.
The standard technique for analyzing these loops is to find a function of the variables
involved whose value decreases each time around. To determine how many times the
loop is repeated, however, we need to understand better how the value of this function
decreases. An alternative approach to the analysis of while loops consists of treating
them like recursive algorithms. We illustrate both techniques with the same example; the
analysis of repeat loops is carried out similarly.

• 1.3 Using a barometer

• 1.4 Supplementary examples

a. Algorithm to find the sum of array elements

Algorithm sum(a, n)
{
    s = 0.0;
    for i = 1 to n do
        s = s + a[i];
    return s;
}

The problem instances for this algorithm are characterized by n, the
number of elements to be summed. The space needed by n is one
word, since it is of type integer.

∑ The space needed by a is the space needed by variables of type array of floating point numbers.
∑ This is at least n words, since a must be large enough to hold the n elements to be summed.
∑ So we obtain S_sum(n) ≥ n + 3 (n words for a[], one word each for n, i and s).

Time Complexity:

The time T(P) taken by a program P is the sum of the compile time
and the run time (execution time).

The compile time does not depend on the instance characteristics. Also, we
may assume that a compiled program will be run several times without
recompilation. The run time is denoted by tP(instance characteristics).

The number of steps any program statement is assigned depends on the kind
of statement.

For example, comments count as 0 steps; an assignment statement which does not
involve any calls to other algorithms counts as 1 step; for iterative statements such as
for, while and repeat-until, we count only the control part of the statement.

• 1. We introduce a variable, count, into the program, with initial value 0. Statements to increment count by the appropriate amount are introduced into the program.

This is done so that each time a statement in the original program is
executed, count is incremented by the step count of that statement.

Algorithm:

Algorithm sum(a, n)
{
    s = 0.0;
    count = count + 1;        // for the assignment
    for i = 1 to n do
    {
        count = count + 1;    // for the for statement
        s = s + a[i];
        count = count + 1;    // for the assignment
    }
    count = count + 1;        // for the last test of the for loop
    count = count + 1;        // for the return
    return s;
}

If the count is zero to start with, then it will be 2n+3 on termination. So each
invocation of sum executes a total of 2n+3 steps.
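The instrumented algorithm can be mirrored directly in code; this Python sketch (ours, not from the text) places the increments at the same points as the pseudocode and confirms the 2n+3 count.

```python
def sum_with_count(a):
    """Instrumented version of Algorithm sum: count mirrors the step counter."""
    count = 0
    s = 0.0
    count += 1              # assignment s = 0.0
    for x in a:
        count += 1          # one successful test of the for loop
        s += x
        count += 1          # assignment inside the loop
    count += 1              # final (failing) loop test
    count += 1              # return statement
    return s, count

s, count = sum_with_count([1, 2, 3, 4])   # n = 4, so count should be 2*4 + 3 = 11
```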

The second method to determine the step count of an algorithm is to build a

table in which we list the total number of steps contributes by each statement.

First determine the number of steps per execution (s/e) of the statement and

the total number of times (ie., frequency) each statement is executed.

By combining these two quantities, the total contribution of all statements, the

step count for the entire algorithm is obtained.

Statement                      s/e    Frequency    Total
1. Algorithm Sum(a,n)           0        -           0
2. {                            0        -           0
3.    s = 0.0;                  1        1           1
4.    for i = 1 to n do         1       n+1         n+1
5.       s = s + a[i];          1        n           n
6.    return s;                 1        1           1
7. }                            0        -           0
                                        Total       2n+3

1.5 AVERAGE-CASE ANALYSIS

∑ Most of the time, average-case analyses are performed under the more or less realistic assumption that all instances of any given size are equally likely.
∑ For sorting problems, it is simple to assume also that all the elements to be sorted are distinct.
∑ Suppose we have n distinct elements to sort by insertion and all n! permutations of these elements are equally likely.
∑ To determine the time taken on average by the algorithm, we could add the times required to sort each of the possible permutations, and then divide by n! the answer thus obtained.
∑ An alternative approach, easier in this case, is to analyze directly the time required by the algorithm, reasoning probabilistically as we proceed.
∑ For any i, 2 ≤ i ≤ n, consider the subarray T[1..i].
∑ The partial rank of T[i] is defined as the position it would occupy if the subarray were sorted.
∑ For example, the partial rank of T[4] in [3,6,2,5,1,7,4] is 3, because T[1..4] once sorted is [2,3,5,6].
∑ Clearly the partial rank of T[i] does not depend on the order of the elements in the subarray T[1..i-1].
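The partial-rank definition above can be sketched as a small helper (our own illustration; the function name is hypothetical, and it assumes distinct elements as the text does).

```python
def partial_rank(T, i):
    """1-based position T[i] would occupy if the subarray T[1..i] were sorted.

    T is 1-indexed in the text, so T[i] here is T[i-1] in Python terms.
    Assumes all elements are distinct.
    """
    sub = sorted(T[:i])
    return sub.index(T[i - 1]) + 1

# From the text: partial rank of T[4] in [3, 6, 2, 5, 1, 7, 4] is 3,
# because [3, 6, 2, 5] once sorted is [2, 3, 5, 6] and 5 sits in position 3.
r = partial_rank([3, 6, 2, 5, 1, 7, 4], 4)
```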

Analysis

Best case: This analysis places constraints on the input other than its size, resulting in
the fastest possible running time.

Worst case: This analysis places constraints on the input other than its size, resulting in
the slowest possible running time.

Average case: This type of analysis results in the average running time over every type of
input.

Complexity: Complexity refers to the rate at which the storage or time grows as a
function of the problem size.

Asymptotic analysis:

Expressing the complexity in terms of its relationship to some known
function. This type of analysis is called asymptotic analysis.

Asymptotic notation:

Big 'oh': the function f(n) = O(g(n)) iff there exist positive constants c and n0 such that
f(n) ≤ c*g(n) for all n, n ≥ n0.

Omega: the function f(n) = Ω(g(n)) iff there exist positive constants c and n0 such that
f(n) ≥ c*g(n) for all n, n ≥ n0.

Theta: the function f(n) = θ(g(n)) iff there exist positive constants c1, c2 and n0 such that
c1*g(n) ≤ f(n) ≤ c2*g(n) for all n, n ≥ n0.
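The big-oh definition can be exercised numerically; this sketch (our own, with a hypothetical helper name) checks a candidate pair (c, n0) over a finite range, which is only a sanity check, not a proof.

```python
def is_big_oh_witness(f, g, c, n0, n_max=10_000):
    """Check f(n) <= c*g(n) for all n0 <= n <= n_max (a finite sanity check)."""
    return all(f(n) <= c * g(n) for n in range(n0, n_max + 1))

# Example: f(n) = 3n + 2 is O(n), witnessed by c = 4 and n0 = 2,
# since 3n + 2 <= 4n exactly when n >= 2.
ok = is_big_oh_witness(lambda n: 3 * n + 2, lambda n: n, c=4, n0=2)
```

A failing pair is just as informative: n² is not O(n), and the check rejects any fixed c once n grows past it.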

1.6 Amortized analysis

In computer science, amortized analysis is a method of analyzing algorithms that

considers the entire sequence of operations of the program. It allows for the establishment

of a worst-case bound for the performance of an algorithm irrespective of the inputs by

looking at all of the operations. At the heart of the method is the idea that while certain

operations may be extremely costly in resources, they cannot occur at a high-enough

frequency to weigh down the entire program because the number of less costly operations

will far outnumber the costly ones in the long run, "paying back" the program over a

number of iterations. It is particularly useful because it guarantees worst-case

performance while accounting for the entire set of operations in an algorithm.

There are generally three methods for performing amortized analysis: the aggregate

method, the accounting method, and the potential method. All of these give the same

answers, and their usage difference is primarily circumstantial and due to individual

preference.

Aggregate analysis determines the upper bound T(n) on the total cost of a
sequence of n operations, then calculates the average cost to be T(n) / n.

The accounting method determines the individual cost of each operation,

combining its immediate execution time and its influence on the running time of

future operations. Usually, many short-running operations accumulate a "debt" of

unfavorable state in small increments, while rare long-running operations decrease it

drastically.

The potential method is like the accounting method, but overcharges operations

early to compensate for undercharges later.

As a simple example, in a specific implementation of the dynamic array, we double the size of the array each time it fills up. Because of this, array reallocation may be required, and in the worst case a single insertion may require O(n) time. However, a sequence of n insertions can always be done in O(n) total time, because the remaining insertions are done in constant time. The amortized time per operation is therefore O(n) / n = O(1).

Another way to see this is to think of a sequence of n operations. There are 2 possible operations: a regular insertion which requires a constant time c to perform (assume c = 1), and an array doubling which requires O(j) time (where j < n and is the size of the array at the time of the doubling). Clearly the time to perform these operations is less than the time needed to perform n regular insertions plus the time for the array doublings that take place in the sequence. There are only as many array doublings in the sequence as there are powers of 2 between 1 and n, i.e., lg(n) of them. Therefore the cost of a sequence of n operations is strictly less than n + Σ (i = 0 to lg n) 2^i < 3n.


The amortized time per operation is the worst-case time bound on a sequence of n operations divided by n. The amortized time per operation is therefore

O(3n) / n = O(n) / n = O(1).
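The "total cost below 3n" argument can be simulated directly; this Python sketch (our own) counts element writes for a doubling array, charging one unit per append plus one unit per element copied during a reallocation.

```python
def doubling_append_cost(n):
    """Total element writes for n appends into an array that doubles when full."""
    capacity, size, cost = 1, 0, 0
    for _ in range(n):
        if size == capacity:       # full: copy every element into a doubled array
            cost += size
            capacity *= 2
        size += 1
        cost += 1                  # the append itself
    return cost

# For any n, the total cost stays below 3n, so each append is O(1) amortized.
total = doubling_append_cost(1000)
```

For n = 1000 the copies sum to 1023 (1 + 2 + 4 + … + 512) and the appends to 1000, comfortably under 3n = 3000.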

1.7 Recursion:

Recursion may have the following definitions:

-The nested repetition of identical algorithm is recursion.

-It is a technique of defining an object/process by itself.

-Recursion is a process by which a function calls itself repeatedly until some specified

condition has been satisfied.


1.7.1 When to use recursion:

Recursion can be used for repetitive computations in which each action is stated

in terms of previous result. There are two conditions that must be satisfied by any

recursive procedure.

• 1. Each time a function calls itself it should get nearer to the solution.

• 2. There must be a decision criterion for stopping the process.

In making the decision about whether to write an algorithm in recursive or non-recursive

form, it is always advisable to consider a tree structure for the problem. If the structure is

simple then use non-recursive form. If the tree appears quite bushy, with little duplication

of tasks, then recursion is suitable.

The recursion algorithm for finding the factorial of a number is given below,

Algorithm : factorial-recursion

Input : n, the number whose factorial is to be found.

Output : f, the factorial of n

Method : if(n=0)

f=1

else

f=factorial(n-1) * n

if end

algorithm ends.

The general procedure for any recursive algorithm is as follows,

• 1. Save the parameters, local variables and return addresses.

• 2. If the termination criterion is reached, perform the final computation and go to step 3; otherwise, perform the partial computations and go to step 1 (initiate a recursive call).

• 3. Restore the most recently saved parameters, local variable and return address and goto the latest return address.

1.7.2 Iteration v/s Recursion:

Demerits of recursive algorithms:

• 1. Many programming languages do not support recursion; hence, recursive mathematical function is implemented using iterative methods.

• 2. Even though mathematical functions can be easily implemented using recursion it is always at the cost of execution time and memory space. For example, the recursion tree for generating 6 numbers in a Fibonacci series generation is given in fig 2.5. A Fibonacci series is of the form 0,1,1,2,3,5,8,13,…etc, where the third number is the sum of preceding two numbers and so on. It can be noticed from the fig 2.5 that, f(n-2) is computed twice, f(n-3) is computed thrice, f(n-4) is computed 5 times.

• 3. A recursive procedure can be called from within or outside itself and to ensure its proper functioning it has to save in some order the return addresses so that, a return to the proper location will result when the return to a calling statement is made.

• 4. The recursive programs needs considerably more storage and will take more time.

• 1.7.3 Demerits of iterative methods :

∑ Mathematical functions such as factorial and Fibonacci series generation can be implemented more easily using recursion than iteration.
∑ In iterative techniques, looping of statements is very much necessary.

Recursion is a top down approach to problem solving. It divides the problem into pieces

or selects out one key step, postponing the rest.

Iteration is more of a bottom-up approach. It begins with what is known and from this
constructs the solution step by step. The iterative function obviously uses time that is
O(n), whereas the recursive function has an exponential time complexity.

It is always true that recursion can be replaced by iteration and stacks. It is also true that

stack can be replaced by a recursive program with no stack.
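The contrast between O(n) iteration and exponential recursion for Fibonacci can be made concrete; in this sketch (ours, for illustration) a counter records how many calls the naive recursion makes.

```python
def fib_recursive(n, counter):
    """Naive recursion straight from the definition; counter[0] tracks calls made."""
    counter[0] += 1
    if n < 2:
        return n
    return fib_recursive(n - 1, counter) + fib_recursive(n - 2, counter)

def fib_iterative(n):
    """Bottom-up version: O(n) time, each value computed once."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

calls = [0]
value = fib_recursive(20, calls)   # same answer as the iterative version,
                                   # but the number of calls grows like fib(n) itself
```

For n = 20 the recursion makes 2·fib(21) − 1 = 21891 calls to produce a value the loop finds in 20 steps.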

• 1.7.4 SOLVING RECURRENCES (to recur: happen again, or repeatedly)

∑ The indispensable last step when analyzing an algorithm is often to solve a recurrence equation.
∑ With a little experience and intuition, most recurrences can be solved by intelligent guesswork.
∑ However, there exists a powerful technique that can be used to solve certain classes of recurrences almost automatically.
∑ This is the main topic of this section: the technique of the characteristic equation.
• 1. Intelligent guess work:

This approach generally proceeds in 4 stages.

• 1. Calculate the first few values of the recurrence

• 2. Look for regularity.

• 3. Guess a suitable general form.

• 4. And finally prove by mathematical induction(perhaps constructive induction).

1) (Fibonacci) Consider the recurrence,

    f(n) = n                        if n = 0 or n = 1
    f(n) = f(n-1) + f(n-2)          otherwise

We rewrite the recurrence as,

    f(n) – f(n-1) – f(n-2) = 0.

The characteristic polynomial is,

    x^2 – x – 1 = 0.

The roots are,

    x = (-(-1) ± √((-1)^2 + 4)) / 2
      = (1 ± √5) / 2

    r1 = (1 + √5) / 2   and   r2 = (1 - √5) / 2

The general solution is,

    f(n) = C1*r1^n + C2*r2^n

When n = 0,    f(0) = C1 + C2 = 0            → (1)
When n = 1,    f(1) = C1*r1 + C2*r2 = 1      → (2)

From equation (1), C1 = -C2. Substituting C1 in equation (2),

    -C2*r1 + C2*r2 = 1
    C2*(r2 – r1) = 1

Substituting the values of r1 and r2,

    C2 * ((1 - √5)/2 – (1 + √5)/2) = 1
    -C2*√5 = 1

    C2 = -1/√5,   C1 = 1/√5

Thus,

    f(n) = (1/√5) * ((1 + √5)/2)^n – (1/√5) * ((1 - √5)/2)^n

         = (1/√5) * [ ((1 + √5)/2)^n – ((1 - √5)/2)^n ]
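The closed form just derived (Binet's formula) can be checked against the recurrence itself; this sketch (ours) compares the two for the first thirty values, where floating-point rounding is still safe.

```python
def fib_closed_form(n):
    """Closed form derived above: f(n) = (r1**n - r2**n) / sqrt(5)."""
    r1 = (1 + 5 ** 0.5) / 2
    r2 = (1 - 5 ** 0.5) / 2
    return round((r1 ** n - r2 ** n) / 5 ** 0.5)

def fib_recurrence(n):
    """Direct bottom-up evaluation of the recurrence."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

ok = all(fib_closed_form(n) == fib_recurrence(n) for n in range(30))
```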

3. Inhomogeneous recurrence :

* The solution of a linear recurrence with constant coefficients becomes more difficult
when the recurrence is not homogeneous, that is, when the linear combination is not
equal to zero.

* Consider the following recurrence,

    a0*t(n) + a1*t(n-1) + … + ak*t(n-k) = b^n * p(n)

* The left-hand side is the same as before (homogeneous), but on the right-hand side
we have b^n * p(n), where

    b is a constant, and
    p(n) is a polynomial in n of degree d.

Example (1):

Consider the recurrence,

    t(n) – 2t(n-1) = 3^n        → (A)

In this case, b = 3, p(n) = 1, degree = 0.

The characteristic polynomial is,

    (x – 2)(x – 3) = 0

The roots are r1 = 2, r2 = 3.

The general solution,

    t(n) = C1*r1^n + C2*r2^n
    t(n) = C1*2^n + C2*3^n      → (1)

When n = 0,    C1 + C2 = t0            → (2)
When n = 1,    2C1 + 3C2 = t1          → (3)

Substituting n = 1 in equation (A),

    t1 – 2t0 = 3
    t1 = 3 + 2t0

Substituting t1 in equation (3), and multiplying equation (2) by 2,

    2C1 + 2C2 = 2t0
    2C1 + 3C2 = 3 + 2t0
    -------------------
          -C2 = -3
           C2 = 3

Substituting C2 = 3 in equation (2),

    C1 + 3 = t0
    C1 = t0 – 3

Therefore,

    t(n) = (t0 – 3)*2^n + 3*3^n
         = Max[O((t0 – 3)*2^n), O(3*3^n)]
         = Max[O(2^n), O(3^n)]
         = O(3^n)
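The closed form for this inhomogeneous recurrence can be verified numerically; this sketch (ours) evaluates the recurrence directly and compares it with (t0 − 3)·2ⁿ + 3·3ⁿ for an arbitrary starting value.

```python
def t_recurrence(n, t0):
    """Compute t(n) directly from t(n) = 2*t(n-1) + 3**n with t(0) = t0."""
    t = t0
    for i in range(1, n + 1):
        t = 2 * t + 3 ** i
    return t

def t_closed_form(n, t0):
    """Closed form found above: t(n) = (t0 - 3)*2**n + 3*3**n."""
    return (t0 - 3) * 2 ** n + 3 * 3 ** n

# Agreement for the first fifteen terms with t0 = 7 (any t0 works).
ok = all(t_recurrence(n, 7) == t_closed_form(n, 7) for n in range(15))
```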

Example 2: Solve the following recurrence relation

    x(n) = x(n-1) + 5    for n > 1, x(1) = 0

Sol: The above recurrence relation can be written as shown below,

    x(n) = x(n-1) + 5    if n > 1
    x(n) = 0             if n = 1

Consider the relation when n > 1,

    x(n) = x(n-1) + 5            → (a)

Replacing n by n-1 in equation (a),

    x(n) = x(n-2) + 5 + 5

Replacing n by n-2 in equation (a),

    x(n) = x(n-3) + 5 + 5 + 5
    x(n) = x(n-3) + 3*5
    ………………….

Finally,

    x(n) = x[n-(n-1)] + (n-1)*5
         = x(1) + (n-1)*5
         = 0 + (n-1)*5

    x(n) = 5(n-1)
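The backward-substitution result above is easy to confirm; this short sketch (ours) unrolls the recurrence and compares it with 5(n − 1).

```python
def x_recurrence(n):
    """x(n) = x(n-1) + 5 with x(1) = 0, computed by iterating the recurrence."""
    x = 0                       # x(1)
    for _ in range(2, n + 1):   # apply the +5 step n-1 times
        x += 5
    return x

ok = all(x_recurrence(n) == 5 * (n - 1) for n in range(1, 50))
```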

• 4. Change of variables:

* It is sometimes possible to solve more complicated recurrences by making a
change of variable.

* In the following example, we write T(n) for the term of a general recurrence,
and ti for the term of a new recurrence obtained from the first by a change of
variable.

Example: (1)

Consider the recurrence,

    T(n) = 1                 if n = 1
    T(n) = 3T(n/2) + n       if n is a power of 2, n > 1

Reconsider the recurrence we solved by intelligent guesswork in the previous
section, but only for the case when n is a power of 2.

* We replace n by 2^i.
* This is achieved by introducing a new recurrence ti, defined by ti = T(2^i).
* This transformation is useful because n/2 becomes (2^i)/2 = 2^(i-1).
* In other words, our original recurrence, in which T(n) is defined as a function of
T(n/2), gives way to one in which ti is defined as a function of t(i-1), precisely
the type of recurrence we have learned to solve.

    ti = T(2^i) = 3T(2^(i-1)) + 2^i
    ti = 3t(i-1) + 2^i
    ti – 3t(i-1) = 2^i          → (A)

In this case, b = 2, p(n) = 1, degree = 0.

So, the characteristic equation is,

    (x – 3)(x – 2) = 0

The roots are r1 = 3, r2 = 2.

The general equation,

    ti = C1*r1^i + C2*r2^i

Substituting r1 and r2,

    ti = C1*3^i + C2*2^i

We use the fact that T(2^i) = ti, and thus T(n) = t(log2 n) when n = 2^i, to obtain,

    T(n) = C1*3^(log2 n) + C2*2^(log2 n)
    T(n) = C1*n^(log2 3) + C2*n          [since i = log2 n]

when n is a power of 2, which is sufficient to conclude that,

    T(n) = O(n^(log2 3))    when n is a power of 2.
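Fixing the constants with T(1) = 1 gives C1 = 3 and C2 = −2 (from C1 + C2 = 1 and T(2) = 5), so T(2^i) = 3·3^i − 2·2^i; this sketch (ours) checks that against the recurrence itself.

```python
def T(n):
    """T(1) = 1, T(n) = 3*T(n // 2) + n for n a power of 2."""
    if n == 1:
        return 1
    return 3 * T(n // 2) + n

# With T(1) = 1 the constants are C1 = 3, C2 = -2, so for n = 2**i:
#     T(n) = 3*n**log2(3) - 2*n = 3*3**i - 2*2**i
ok = all(T(2 ** i) == 3 * 3 ** i - 2 * 2 ** i for i in range(12))
```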

1. Explain how to analyze different control structures of algorithms.

2. Write a note on best case, average case and worst case analysis of a program, with an example.

3. Write a recursive algorithm for generating Fibonacci series and construct the

recurrence relation and solve.

4. Solve the following recurrence relation x(n)=x(n-1)+5 for n>1 , x(1)=0.

SUMMARY:

Control Structures Analysis: The analysis of algorithms usually proceeds from the inside
out. Determine first the time required by individual instructions, then combine
these times according to the control structures that combine the instructions in the
program.

Recursion can be used for repetitive computations in which each action is stated in terms

of previous result.

Solving recurrences involves the following steps: calculate the first few values of the
recurrence; look for regularity; guess a suitable general form; and finally prove it by
mathematical induction (perhaps constructive induction).

1.7 KEYWORDS

• 1. Big ‘oh’ , Omega , and Theta symbols of asymptotic notation.

• 1. 1.1

• 2. 1.5

• 3. 1.7

• 4. 1.7

• 1.7 UNIT-END EXERCISES AND ANSWERS

• 3. Write a note on amortized analysis.

• 4. Solve the recurrence relation x(n) = x(n-1)*n if n > 0, where x(0) = 1.

• 1. 1.6

• 2. 1.7

• 1. Introduction to The Design and Analysis of Algorithms by Anany Levitin

• 2. Analysis and Design of Algorithms with C/C++ - 3rd edition by Prof. Nandagopalan

• 3. Analysis and Design of Algorithms by Padma Reddy

• 4. Even, Shimon, "Graph Algorithms", Computer Science Press.

MODULE-1,UNIT 4 SEARCHING AND SORTING

Structure

• 1.0 Objectives

• 1.1 Searching algorithms

∑ Linear search
∑ Binary search

• 1.2 Sorting

∑ Selection sort
∑ Insertion sort
∑ Bubble sort

• 1.3 Summary

• 1.4 Keywords

• 1.6 Unit-end exercises and answers

3.0 OBJECTIVES

At the end of this unit you will be able to

∑ Know how to search in different ways.
∑ Identify which searching technique is better.
∑ Perform sorting in different ways, for example insertion sort and selection sort.
∑ Measure the performance of searching and sorting techniques.
• 3.1 SEARCHING ALGORITHMS

Let us assume that we have a sequential file and we wish to retrieve an element matching

with key ‘k’, then, we have to search the entire file from the beginning till the end to

check whether the element matching k is present in the file or not.

There are a number of complex searching algorithms to serve the purpose of searching.

The linear search and binary search methods are relatively straight forward methods of

searching.

• 1.1.1 Sequential search: (Linear search)

In this method, we start to search from the beginning of the list and examine each

element till the end of the list. If the desired element is found we stop the search and

return the index of that element. If the item is not found and the list is exhausted the

search returns a zero value.

In the worst case the item is not found, or the search item is the last (nth) element.
In both situations we must examine all n elements of the array, so the order of magnitude
or complexity of the sequential search is n, i.e., O(n). The execution time for this
algorithm is proportional to n; that is, the algorithm executes in linear time.

The algorithm for sequential search is as follows,

Algorithm : sequential search
Input : A, vector of n elements; K, search element
Output : i, index of K

i = 1
while (i <= n)
{
    if (A[i] = K)
    {
        write("search successful")
        write("K is at location i")
        exit()
    }
    else
        i = i + 1
    if end
}
while end
write("search unsuccessful")
algorithm ends.
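The pseudocode translates almost line for line; this Python sketch (ours) returns the 1-based index on success and 0 on failure, matching the convention stated earlier in the section.

```python
def sequential_search(A, k):
    """Scan from the start; return the 1-based index of k, or 0 if absent."""
    for i, x in enumerate(A, start=1):
        if x == k:
            return i
    return 0

pos = sequential_search([7, 3, 9, 5], 9)   # found at the 3rd position
```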

1.1.2 Binary search:

Binary search method is also relatively simple method. For this method it is

necessary to have the vector in an alphabetical or numerically increasing order. A search

for a particular item with X resembles the search for a word in the dictionary. The

approximate mid entry is located and its key value is examined. If the mid value is

greater than X, then the list is chopped off at the (mid-1) th location. Now the list gets

reduced to half the original list. The middle entry of the left-reduced list is examined in a

similar manner. This procedure is repeated until the item is found or the list has no more

elements. On the other hand, if the mid value is lesser than X, then the list is chopped off

at (mid+1) th location. The middle entry of the right-reduced list is examined and the

procedure is continued until desired key is found or the search interval is exhausted.

The algorithm for binary search is as follows,

Algorithm : binary search
Input : A, sorted vector of n elements; K, search element
Output : mid, index of K

low = 1, high = n
while (low <= high)
{
    mid = (low + high) / 2
    if (K = A[mid])
    {
        write("search successful")
        write("K is at location mid")
        exit()
    }
    else if (K < A[mid])
        high = mid - 1
    else
        low = mid + 1
    if end
}
while end
write("search unsuccessful")
algorithm ends.
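The halving logic can be sketched directly in Python (our translation, 1-based result to match the pseudocode's convention); the array must already be sorted in ascending order.

```python
def binary_search(A, k):
    """A must be sorted ascending; return the 1-based index of k, or 0 if absent."""
    low, high = 1, len(A)
    while low <= high:
        mid = (low + high) // 2
        if A[mid - 1] == k:          # A is 0-indexed in Python, hence mid - 1
            return mid
        elif k < A[mid - 1]:
            high = mid - 1           # chop off the right half
        else:
            low = mid + 1            # chop off the left half
    return 0

pos = binary_search([2, 5, 8, 12, 16, 23], 12)   # found at the 4th position
```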

1.2 Sorting

Several sorting algorithms are presented, including selection sort, insertion sort and
bubble sort. Sorting by insertion is the simplest method, and doesn't require any
additional storage.

1.2.1 SELECTION_SORT

Selection sort is among the simplest of sorting techniques and it works very well for small
files. Furthermore, despite its evidently "naïve approach", selection sort has a quite
important application: because each item is actually moved at most once, selection sort is
a method of choice for sorting files with very large objects (records) and small keys.

Here's a step-by-step example to illustrate the selection sort algorithm using numbers:

Original array:

6 3 5 4 9 2 7

1st pass -> 2 3 5 4 9 6 7 (2 and 6 were swapped)

2nd pass -> 2 3 5 4 9 6 7 (no swap)

3rd pass -> 2 3 4 5 9 6 7 (4 and 5 were swapped)

4th pass -> 2 3 4 5 6 9 7 (6 and 9 were swapped)

5th pass -> 2 3 4 5 6 7 9 (7 and 9 were swapped)

6th pass -> 2 3 4 5 6 7 9 (no swap)

Note: There were 7 keys in the list and thus 6 passes were required. However, only 4
swaps took place.

Algorithm : Selection sort

for i ← 1 to n-1 do
    min_j ← i
    min_x ← A[i]
    for j ← i + 1 to n do
        if A[j] < min_x then
            min_j ← j
            min_x ← A[j]
    A[min_j] ← A[i]
    A[i] ← min_x

The worst case occurs if the array is already sorted in descending order. Nonetheless,
the time required by the selection sort algorithm is not very sensitive to the original order
of the array to be sorted: the test "if A[j] < min_x" is executed exactly the same number of
times in every case. The variation in time is only due to the number of times the "then"
part (i.e., min_j ← j; min_x ← A[j]) of this test is executed.

The Selection sort spends most of its time trying to find the minimum element in the

"unsorted" part of the array. It clearly shows the similarity between Selection sort and

Bubble sort. Bubble sort "selects" the maximum remaining elements at each stage, but

wastes some effort imparting some order to "unsorted" part of the array. Selection sort is

quadratic in both the worst and the average case, and requires no extra memory.

For each i from 1 to n - 1, there is one exchange and n - i comparisons, so there is a total
of n - 1 exchanges and (n - 1) + (n - 2) + ... + 2 + 1 = n(n - 1)/2 comparisons. These
observations hold no matter what the input data is. The number of times the minimum is
updated, however, does depend on the input: in the worst case it can be quadratic, but in
the average case this quantity is O(n log n).
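The counts above can be observed directly; this Python sketch (ours) runs selection sort on the example array from earlier in the section and tallies comparisons and swaps.

```python
def selection_sort(A):
    """In-place selection sort; returns (A, comparisons, swaps)."""
    n = len(A)
    comparisons = swaps = 0
    for i in range(n - 1):
        min_j = i
        for j in range(i + 1, n):
            comparisons += 1          # the test "if A[j] < min_x"
            if A[j] < A[min_j]:
                min_j = j
        if min_j != i:                # move the item at most once per pass
            A[i], A[min_j] = A[min_j], A[i]
            swaps += 1
    return A, comparisons, swaps

arr, comps, swaps = selection_sort([6, 3, 5, 4, 9, 2, 7])
```

For n = 7 the comparison count is n(n − 1)/2 = 21 regardless of input, and this particular input needs exactly the 4 swaps noted above.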

1.2.2 Insertion Sort

If the first few objects are already sorted, an unsorted object can be inserted into the sorted
set in its proper place. This is called insertion sort. The algorithm considers the elements one
at a time, inserting each in its suitable place among those already considered (keeping
them sorted). Insertion sort is an example of an incremental algorithm; it builds the
sorted sequence one number at a time. This is perhaps the simplest example of the
incremental insertion technique, where we build up a complicated structure on n items by
first building it on n − 1 items and then making the necessary changes to add the last
item. The given sequences are typically stored in arrays. We also refer to the
numbers as keys. Along with each key may be additional information, known as satellite
data. [Note that "satellite data" does not necessarily come from a satellite!]

Algorithm: Insertion Sort

It works the way you might sort a hand of playing cards:

• 1. We start with an empty left hand [sorted array] and the cards face down on the table [unsorted array].

• 2. Then remove one card [key] at a time from the table [unsorted array], and insert it into the correct position in the left hand [sorted array].

• 3. To find the correct position for the card, we compare it with each of the cards already in the hand, from right to left.

Note that at all times, the cards held in the left hand are sorted, and these cards were

originally the top cards of the pile on the table.

Pseudo code

We use a procedure INSERTION_SORT. It takes as parameters an array A[1 .. n] and the
length n of the array. The array A is sorted in place: the numbers are rearranged within
the array, with at most a constant number outside the array at any time.

INSERTION_SORT (A)
1. FOR j ← 2 TO length[A] DO
2.     key ← A[j]
3.     {Put A[j] into the sorted sequence A[1 .. j − 1]}
4.     i ← j − 1
5.     WHILE i > 0 and A[i] > key DO
6.         A[i + 1] ← A[i]
7.         i ← i − 1
8.     A[i + 1] ← key

Example: The following figure (from CLRS) shows the operation of INSERTION-SORT on the array A = (5, 2, 4, 6, 1, 3). Each part shows what happens for a particular iteration, with the value of j indicated; j indexes the "current card" being inserted into the hand. Read the figure row by row. Elements to the left of A[j] that are greater than A[j] move one position to the right, and A[j] moves into the vacated position.
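Since the figure itself is not reproduced here, the same row-by-row trace can be generated programmatically. This small Python helper (a sketch added for illustration, not from the original text) records the array after each key is inserted:

```python
def insertion_sort_steps(a):
    """Return a snapshot of the array after each key is inserted."""
    steps = []
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]               # greater elements move one right
            i -= 1
        a[i + 1] = key
        steps.append(list(a))             # state after this iteration of j
    return steps

for row in insertion_sort_steps([5, 2, 4, 6, 1, 3]):
    print(row)
```

This prints the five rows of the figure: [2, 5, 4, 6, 1, 3], [2, 4, 5, 6, 1, 3], [2, 4, 5, 6, 1, 3], [1, 2, 4, 5, 6, 3], and finally [1, 2, 3, 4, 5, 6].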

Analysis

Since the running time of an algorithm on a particular input is the number of steps executed, we must define "step" independently of the machine. We say that a statement that takes c_i steps to execute and is executed n times contributes c_i·n to the total running time of the algorithm. To compute the running time, T(n), we sum the products of the cost and times columns [see CLRS page 26]. That is, the running time of the algorithm is the sum of the running times of each statement executed. So, we have

T(n) = c1·n + c2(n − 1) + 0·(n − 1) + c4(n − 1) + c5 Σ_{j=2}^{n} t_j + c6 Σ_{j=2}^{n} (t_j − 1) + c7 Σ_{j=2}^{n} (t_j − 1) + c8(n − 1)

(the third term is 0·(n − 1) because the comment in line 3 takes no time).

In the above equation we let t_j be the number of times the while-loop test (in line 5) is executed for that value of j. Note that the value of j runs from 2 to n. Dropping the zero-cost term, we have

T(n) = c1·n + c2(n − 1) + c4(n − 1) + c5 Σ_{j=2}^{n} t_j + c6 Σ_{j=2}^{n} (t_j − 1) + c7 Σ_{j=2}^{n} (t_j − 1) + c8(n − 1)     Equation (1)
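The quantities t_j can also be measured empirically. The instrumented sketch below (a hypothetical helper added for illustration, not from the text) counts every evaluation of the while-loop test, including the final failing one:

```python
def while_test_counts(a):
    """Return [t_2, ..., t_n]: how many times the while-loop test
    in line 5 runs for each value of j (the failing test included)."""
    t = []
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        tests = 0
        while True:
            tests += 1                    # one evaluation of the loop test
            if i >= 0 and a[i] > key:
                a[i + 1] = a[i]
                i -= 1
            else:
                break
        a[i + 1] = key
        t.append(tests)
    return t
```

On an already sorted array every t_j is 1, while on a reverse-sorted array t_j = j; for example while_test_counts([4, 3, 2, 1]) returns [2, 3, 4], matching the best- and worst-case claims that follow.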

Best-Case

The best case occurs if the array is already sorted. For each j = 2, 3, ..., n, we find that A[i] is less than or equal to the key when i has its initial value of (j − 1). In other words, when i = j − 1, the test fails the first time the WHILE loop is reached. Therefore, t_j = 1 for j = 2, 3, ..., n, and the best-case running time can be computed using equation (1) as follows:

T(n) = c1·n + c2(n − 1) + c4(n − 1) + c5 Σ_{j=2}^{n} 1 + c6 Σ_{j=2}^{n} (1 − 1) + c7 Σ_{j=2}^{n} (1 − 1) + c8(n − 1)

T(n) = c1·n + c2(n − 1) + c4(n − 1) + c5(n − 1) + c8(n − 1)

T(n) = (c1 + c2 + c4 + c5 + c8)·n − (c2 + c4 + c5 + c8)

This running time can be expressed as a·n + b for constants a and b that depend on the statement costs c_i. Therefore, T(n) is a linear function of n.

The punch line here is that the while-loop in line 5 is executed only once for each j. This happens when the given array A is already sorted. The best-case running time is thus a linear function of n:

T(n) = a·n + b = O(n)
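This linear behaviour is easy to check empirically. The sketch below (an added illustration, not part of the original text) totals the while-loop test evaluations on an already sorted array; each j contributes exactly one test, so the total is n − 1:

```python
def total_while_tests(a):
    """Total number of while-loop test evaluations over the whole sort."""
    total = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while True:
            total += 1                    # count every test, pass or fail
            if i >= 0 and a[i] > key:
                a[i + 1] = a[i]
                i -= 1
            else:
                break
        a[i + 1] = key
    return total

# On sorted input each j contributes exactly one test:
print(total_while_tests(list(range(10))))    # n = 10 gives 9 tests
```

Doubling n roughly doubles the work, exactly the a·n + b shape derived above.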

Worst-Case

The worst case occurs if the array is sorted in reverse order, i.e., in decreasing order. In reverse order, we always find that A[i] is greater than the key in the while-loop test. So, we must compare each element A[j] with every element in the entire sorted subarray A[1 .. j − 1], and so t_j = j for j = 2, 3, ..., n. Equivalently, we can say that since the while-loop exits because i reaches 0, there is one additional test after (