
CSE 830:

Design and Theory of Algorithms

Dr. Eric Torng


TA: Carl Bussema
Many slides adapted from those used by Dr. Charles Ofria

Outline
Definitions
Algorithms
Problems

Course Objectives
Administrative stuff
Analysis of Algorithms

What is an Algorithm?
Algorithms are the ideas behind computer programs.
An algorithm is the thing that stays the same whether the
program is in C++ running on a Cray in New York or is in
BASIC running on a Macintosh in Alaska!
To be interesting, an algorithm has to solve a general, specified
problem.

What is a problem?
Definition
A mapping/relation between a set of input instances
(domain) and an output set (range)

Problem Specification
Specify what a typical input instance is
Specify what the output should be in terms of the input
instance

Example: Sorting
Input: A sequence of N numbers $a_1, \ldots, a_n$
Output: the permutation (reordering) of the input
sequence such that $a_1 \le a_2 \le \cdots \le a_n$.
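
A problem specification can be read as a test that any proposed output must pass. The sketch below checks both halves of the sorting specification: the output is a reordering of the input, and it is in nondecreasing order. The function name is an illustrative choice, not part of the course material.

```python
from collections import Counter

def satisfies_sorting_spec(inp, out):
    """Return True if `out` is a valid sorting output for the input `inp`."""
    # Condition 1: `out` is a permutation (reordering) of `inp`.
    is_permutation = Counter(inp) == Counter(out)
    # Condition 2: consecutive elements are in nondecreasing order.
    is_ordered = all(out[k] <= out[k + 1] for k in range(len(out) - 1))
    return is_permutation and is_ordered

print(satisfies_sorting_spec([3, 1, 2], [1, 2, 3]))  # True
print(satisfies_sorting_spec([3, 1, 2], [1, 2, 2]))  # False: not a permutation
print(satisfies_sorting_spec([3, 1, 2], [2, 1, 3]))  # False: not ordered
```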

Types of Problems
Search: find X in the input satisfying property Y
Structuring: Transform input X to satisfy property Y
Construction: Build X satisfying Y
Optimization: Find the best X satisfying property Y
Decision: Does X satisfy Y?
Adaptive: Maintain property Y over time.

Two desired properties of algorithms


Correctness
Always provides correct output when presented
with legal input

Efficiency
Computes correct output quickly given input

Correctness
Example: Traveling Salesperson Problem (TSP)
Input: A sequence of N cities with the distances $d_{ij}$
between each pair of cities
Output: a permutation (ordering) of the cities $\langle c_1, \ldots, c_n \rangle$
that minimizes the expression
$\sum_{j=1}^{n-1} d_{j,j+1} + d_{n,1}$

Which of the following algorithms is correct?


Nearest neighbor: Initialize tour to city 1. Extend tour by visiting
nearest unvisited city. Finally return to city 1.
All tours: Try all possible orderings of the points, selecting the
ordering that minimizes the total length.
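
One way to explore the question is to run both algorithms on a small instance. The sketch below uses a hypothetical instance of five cities on a line (the coordinates in `xs` are made up for illustration) and numbers cities from 0, so "city 1" on the slide is index 0 here. The exhaustive search fixes the starting city, which does not change the best tour length because a tour is a cycle.

```python
from itertools import permutations

def tour_length(tour, d):
    """Total length of the closed tour, including the return to the start."""
    n = len(tour)
    return sum(d[tour[k]][tour[(k + 1) % n]] for k in range(n))

def nearest_neighbor(d):
    """Start at city 0 and repeatedly visit the nearest unvisited city."""
    tour = [0]
    unvisited = set(range(1, len(d)))
    while unvisited:
        here = tour[-1]
        nxt = min(unvisited, key=lambda c: d[here][c])
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

def all_tours(d):
    """Try every ordering that starts at city 0 and keep the shortest."""
    rest = min(permutations(range(1, len(d))),
               key=lambda r: tour_length((0,) + r, d))
    return [0] + list(rest)

# Hypothetical instance: cities on a line, distance = |x_i - x_j|.
xs = [0, 2, -3, 8, -12]
d = [[abs(a - b) for b in xs] for a in xs]

nn = nearest_neighbor(d)
opt = all_tours(d)
print(tour_length(nn, d))   # 44
print(tour_length(opt, d))  # 40
```

On this instance nearest neighbor returns a longer tour than the exhaustive search, so it cannot be a correct algorithm for TSP, while the exhaustive search pays for its correctness by examining (n-1)! orderings.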

Efficiency

Example: Odd Number Problem


Input: A number n
Output: Yes if n is odd, no if n is even
Which of the following algorithms is most efficient?
Count up to that number from one and alternate naming each
number as odd or even.
Factor the number and see if there are any twos in the
factorization.
Keep a lookup table of all numbers from 0 to the maximum
integer.
Look at the last bit (or digit) of the number.
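
As a rough illustration, here are two of the candidate algorithms side by side, assuming n is a nonnegative integer. Both give the same answers, but the first takes about n steps while the second takes one step regardless of n. The function names are illustrative.

```python
def is_odd_by_counting(n):
    """Count from 1 up to n, alternating odd/even labels: about n steps."""
    odd = False            # label for 0 is "even"
    for _ in range(n):
        odd = not odd      # each successive number flips the label
    return odd

def is_odd_by_last_bit(n):
    """Inspect only the lowest bit: a single step for any n."""
    return (n & 1) == 1

print(is_odd_by_counting(7), is_odd_by_last_bit(7))    # True True
print(is_odd_by_counting(10), is_odd_by_last_bit(10))  # False False
```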

Course Objectives
1. Details of classic algorithms
2. Methods for designing algorithms
3. Validate/verify algorithm correctness
4. Analyze algorithm efficiency
5. Prove (or at least indicate) no correct,
efficient algorithm exists for solving a
given problem
6. Writing clear algorithms and proofs

Classic Algorithms

Lots of wonderful algorithms have already
been developed
I expect you to learn most of this from
reading, though we will reinforce in
lecture

Algorithm design methods

Something of an art form
Cannot be fully automated
We will describe some general techniques
and try to illustrate when each is
appropriate

Algorithm correctness

Proving an algorithm generates correct
output for all inputs
One technique covered in textbook
Loop invariants
We will do some of this in the course, but
it is not emphasized as much as other
objectives
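
To give a feel for the loop-invariant technique, the sketch below states an invariant for a simple maximum-finding loop and checks it with an assertion on every iteration. The example is illustrative and not taken from the textbook. The assertion rescans the prefix, so it is for demonstration only and would be removed in real code.

```python
def maximum(a):
    """Return the largest element of a non-empty list."""
    best = a[0]
    for i in range(1, len(a)):
        # Invariant: at the start of each iteration,
        # best is the maximum of a[0..i-1].
        assert best == max(a[:i])
        if a[i] > best:
            best = a[i]
    # When the loop ends the invariant covers the whole list,
    # so best is the maximum of a.
    return best

print(maximum([3, 7, 2, 5]))  # 7
```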

Analyzing algorithms

The process of determining how many
resources (time, space) are used by a given
algorithm
We want to be able to make quantitative
assessments about the value (goodness) of one
algorithm compared to another
We want to do this WITHOUT implementing
and running an executable version of an
algorithm

Proving hardness results

We believe that no correct and efficient
algorithm exists that solves many
problems such as TSP
We define a formal notion of a problem
being hard
We develop techniques for proving
hardness results

Clear Writing
Methods for Expressing Algorithms
Implementations
Pseudo-code
English

Writing clear and understandable proofs


My main concern is not the specific language used
but the clarity of your algorithm/proof

Algorithm Analysis Overview

RAM model of computation


Concept of input size
Measuring complexity

Best-case, average-case, worst-case

Asymptotic analysis

Asymptotic notation

The RAM Model


RAM model represents a generic
implementation of the algorithm
Each simple operation (+, -, =, if, call) takes
exactly 1 step.
Loops and subroutine calls are not simple
operations, but depend upon the size of the
data and the contents of a subroutine. We do
not want sort to be a single step operation.
Each memory access takes exactly 1 step.

Input Size

In general, larger input instances require
more resources to process correctly
We standardize by defining a notion of
size for an input instance
Examples
What is the size of a sorting input instance?
What is the size of an Odd number input
instance?
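
One common convention, offered here as an assumption rather than as the course's official answer, is to measure a sorting instance by the number of items and a numeric instance by the number of bits needed to write the number down. A minimal sketch:

```python
def sorting_instance_size(seq):
    """Size of a sorting instance: the number of items to sort."""
    return len(seq)

def odd_number_instance_size(n):
    """Size of an Odd Number instance: bits needed to write n (n >= 0)."""
    return max(1, n.bit_length())

print(sorting_instance_size([5, 3, 8]))   # 3
print(odd_number_instance_size(1000))     # 10
```

Under this convention the value 1000 has size 10, so an algorithm that counts up to n takes a number of steps exponential in the input size.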

Measuring Complexity

The running time of an algorithm is the function
defined by the number of steps (or amount of
memory) required to solve input instances of size
n

F(1) = 3
F(2) = 5
F(3) = 7
...
F(n) = 2n+1

Problem: Inputs of the same size may require
different numbers of steps to solve

3 different analyses
The worst case running time of an algorithm is the function
defined by the maximum number of steps taken on any
instance of size n.
The best case running time of an algorithm is the function
defined by the minimum number of steps taken on any
instance of size n.
The average-case running time of an algorithm is the function
defined by an average number of steps taken on any instance
of size n.
Which of these is the best to use?
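
The three definitions can be compared on a small example. The sketch below uses linear search, which is not part of the slides and is chosen only because its step counts are easy to enumerate. It counts comparisons for every possible target position in a list of size 8. The average assumes each position is equally likely, which is exactly the kind of distributional assumption average-case analysis depends on.

```python
def linear_search_steps(a, target):
    """Return the number of comparisons linear search makes to find target."""
    steps = 0
    for x in a:
        steps += 1
        if x == target:
            break
    return steps

n = 8
a = list(range(n))
# One instance per possible position of the target.
counts = [linear_search_steps(a, t) for t in a]

best = min(counts)
worst = max(counts)
average = sum(counts) / len(counts)  # assumes a uniform distribution
print(best, worst, average)  # 1 8 4.5
```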

Average case analysis

Drawbacks
Based on a probability distribution of input instances
The distribution may not be appropriate
Provides little consolation if we have a worst-case input
More complicated to compute than worst case
running time
Worst case running time is often comparable to
average case running time (see next graph)
Counterexamples to above point:
Quicksort
Simplex method for linear programming

Best, Worst, and Average Case

Worst case analysis

Typically much simpler to compute as we do not
need to average performance on many inputs
Instead, we need to find and understand an input that
causes worst case performance
Provides guarantee that is independent of any
assumptions about the input
Often reasonably close to average case running
time
The standard analysis performed

Motivation for Asymptotic Analysis


An exact computation of worst-case
running time can be difficult
Function may have many terms:
$4n^2 - 3n \log n + 17.5n - 43n + 75$

An exact computation of worst-case
running time is unnecessary
Remember that we are already approximating
running time by using RAM model

Simplifications
Ignore constants
$4n^2 - 3n \log n + 17.5n - 43n + 75$ becomes
$n^2 - n \log n + n - n + 1$

Asymptotic Efficiency
$n^2 - n \log n + n - n + 1$ becomes $n^2$

End Result: $\Theta(n^2)$
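
A quick numerical check suggests why only the leading term matters. The sketch below evaluates the example function with its terms as they appear in the text above (the base-2 logarithm is an assumption, since the text does not give a base) and prints its ratio to n^2 for growing n. The ratio settles near the leading constant 4, and the lower-order terms contribute less and less.

```python
import math

def f(n):
    """Example running-time function, terms as given in the text."""
    return 4 * n**2 - 3 * n * math.log2(n) + 17.5 * n - 43 * n + 75

for n in (10, 1_000, 100_000, 10_000_000):
    # As n grows, f(n) / n^2 approaches the leading constant 4.
    print(n, f(n) / n**2)
```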

Why ignore constants?


RAM model introduces errors in constants
Do all instructions take equal time?
Specific implementation (hardware, code
optimizations) can speed up an algorithm by constant
factors
We want to understand how effective an algorithm is
independent of these factors

Simplification of analysis
Much easier to analyze if we focus only on $n^2$ rather
than worrying about $3.7 n^2$ or $3.9 n^2$

Asymptotic Analysis
We focus on the infinite set of large n
ignoring small values of n
Usually, an algorithm that is asymptotically
more efficient will be the best choice for all
but very small inputs.

Big Oh Notation
O(g(n)) =
{ f(n) : there exist positive constants c and n0 such
that $0 \le f(n) \le c\,g(n)$ for all $n \ge n_0$ }
What are the roles of the two constants?
n0:
c:

[Figure: $f(n)$ and $c\,g(n)$ plotted against n, with the threshold $n_0$ marked]

Set Notation Comment


O(g(n)) is a set of functions.
However, we will use one-way equalities like
$n = O(n^2)$
This really means that function n belongs to the
set of functions $O(n^2)$
Incorrect notation: $O(n^2) = n$
Analogy
A dog is an animal, but an animal is not necessarily a dog

Three Common Sets


$f(n) = O(g(n))$ means $c\,g(n)$ is an Upper Bound on $f(n)$
$f(n) = \Omega(g(n))$ means $c\,g(n)$ is a Lower Bound on $f(n)$
$f(n) = \Theta(g(n))$ means $c_1 g(n)$ is an Upper Bound on $f(n)$
and $c_2 g(n)$ is a Lower Bound on $f(n)$
These bounds hold for all inputs beyond some threshold $n_0$.

[Figure: three panels illustrating $O(g(n))$, $\Omega(g(n))$, and $\Theta(g(n))$]

O(f(n)) and $\Omega(g(n))$

[Figure: regions labeled $O(f(n))$ and $\Omega(g(n))$; surviving labels: 1/100, 1/25, $n^2$]

Example Function

$f(n) = 3n^2 - 100n + 6$

Quick Questions                              c        n0
$3n^2 - 100n + 6 = O(n^2)$
$3n^2 - 100n + 6 = O(n^3)$
$3n^2 - 100n + 6 \ne O(n)$
$3n^2 - 100n + 6 = \Omega(n^2)$
$3n^2 - 100n + 6 \ne \Omega(n^3)$
$3n^2 - 100n + 6 = \Omega(n)$
$3n^2 - 100n + 6 = \Theta(n^2)$?
$3n^2 - 100n + 6 = \Theta(n^3)$?
$3n^2 - 100n + 6 = \Theta(n)$?
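
Candidate values of c and n0 can be spot-checked numerically. The sketch below tests the conditions 0 <= f(n) <= c g(n) (upper bound) and 0 <= c g(n) <= f(n) (lower bound) over a finite range. The particular constants and the cutoff `n_max` are illustrative choices. A finite check can refute a bad choice of constants but cannot prove a bound, which still needs an algebraic argument.

```python
def f(n):
    return 3 * n**2 - 100 * n + 6

def square(n):
    return n**2

def linear(n):
    return n

def holds_upper(f, g, c, n0, n_max=10_000):
    """Spot-check 0 <= f(n) <= c*g(n) for n0 <= n <= n_max."""
    return all(0 <= f(n) <= c * g(n) for n in range(n0, n_max + 1))

def holds_lower(f, g, c, n0, n_max=10_000):
    """Spot-check 0 <= c*g(n) <= f(n) for n0 <= n <= n_max."""
    return all(0 <= c * g(n) <= f(n) for n in range(n0, n_max + 1))

print(holds_upper(f, square, c=3, n0=34))     # True: consistent with O(n^2)
print(holds_lower(f, square, c=2, n0=100))    # True: consistent with Omega(n^2)
print(holds_upper(f, linear, c=1000, n0=34))  # False: c = 1000 fails for O(n)
```

The threshold 34 appears because f(n) is negative for n between 1 and 33, and the definition requires f(n) to be nonnegative from n0 onward.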

Common Complexity Functions


Complexity    n = 10         n = 20         n = 30         n = 40           n = 50           n = 60
$\log_2 n$    3×10^-6 sec    4×10^-6 sec    5×10^-6 sec    5×10^-6 sec      6×10^-6 sec      6×10^-6 sec
$n$           1×10^-5 sec    2×10^-5 sec    3×10^-5 sec    4×10^-5 sec      5×10^-5 sec      6×10^-5 sec
$n \log_2 n$  3×10^-5 sec    9×10^-5 sec    0.0001 sec     0.0002 sec       0.0003 sec       0.0004 sec
$n^2$         0.0001 sec     0.0004 sec     0.0009 sec     0.0016 sec       0.0025 sec       0.0036 sec
$n^3$         0.001 sec      0.008 sec      0.027 sec      0.064 sec        0.125 sec        0.216 sec
$n^5$         0.1 sec        3.2 sec        24.3 sec       1.7 min          5.2 min          13.0 min
$2^n$         0.001 sec      1.0 sec        17.9 min       12.7 days        35.7 years       366 centuries
$3^n$         0.059 sec      58 min         6.5 years      3855 centuries   2×10^8 centuries 1.3×10^13 centuries
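
The entries in the table above can be recomputed directly. The sketch below assumes one step takes 10^-6 seconds, which is inferred from the n row (10 steps in 1×10^-5 sec) rather than stated on the slide, and prints the running time in seconds for each growth function.

```python
import math

STEP_SECONDS = 1e-6  # assumption inferred from the n row of the table

growth = {
    "log2 n": lambda n: math.log2(n),
    "n": lambda n: n,
    "n log2 n": lambda n: n * math.log2(n),
    "n^2": lambda n: n**2,
    "n^3": lambda n: n**3,
    "n^5": lambda n: n**5,
    "2^n": lambda n: 2**n,
    "3^n": lambda n: 3**n,
}

def seconds(name, n):
    """Running time in seconds for the named growth function at size n."""
    return growth[name](n) * STEP_SECONDS

for name in growth:
    # One row per growth function, columns n = 10, 20, ..., 60.
    print(name, [f"{seconds(name, n):.3g}" for n in (10, 20, 30, 40, 50, 60)])
```

Dividing by 60, 86400, or the number of seconds in a year converts the larger entries to the minutes, days, and years shown in the table.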

Example Problems
1. What does it mean if:
f(n) O(g(n)) and g(n) O(f(n)) ?
2. Is $2^{n+1} = O(2^n)$ ?
Is $2^{2n} = O(2^n)$ ?
3. Does f(n) = O(f(n)) ?
4. If f(n) = O(g(n)) and g(n) = O(h(n)),
can we say f(n) = O(h(n)) ?

Extra Slides
Slides illustrating TSP algorithms
Case study with Insertion sort for Best,
Average, Worst case analysis

Possible Algorithm:
Nearest neighbor

Not Correct!

A Correct Algorithm
Try all possible orderings of the points, selecting the
ordering that minimizes the total length:
d = ∞
For each of the n! permutations Pi of the n points
    if cost(Pi) < d then
        d = cost(Pi)
        Pmin = Pi
return Pmin

Case study: Insertion Sort


Count the number of times each line will be executed:
                                        Num Exec.
for i = 2 to n
    key = A[i]
    j = i - 1
    while j > 0 AND A[j] > key
        A[j+1] = A[j]
        j = j - 1
    A[j+1] = key
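
To see how the counts depend on the input, the sketch below is a Python rendering of the pseudocode above, using 0-based indexing instead of the 1-based indexing on the slide. It counts how many times the while condition is evaluated, the line whose count varies most between inputs. The function name and the choice of which line to count are illustrative.

```python
def insertion_sort_steps(a):
    """Sort a copy of a; return (sorted list, number of while-condition tests)."""
    a = list(a)
    tests = 0
    for i in range(1, len(a)):      # pseudocode: for i = 2 to n
        key = a[i]
        j = i - 1
        while True:
            tests += 1              # one evaluation of the while condition
            if not (j >= 0 and a[j] > key):
                break
            a[j + 1] = a[j]         # shift the larger element right
            j -= 1
        a[j + 1] = key
    return a, tests

n = 6
print(insertion_sort_steps(list(range(n)))[1])         # 5  (already sorted)
print(insertion_sort_steps(list(range(n, 0, -1)))[1])  # 20 (reverse sorted)
```

An already sorted input needs n-1 tests, one per outer iteration, while a reverse-sorted input needs (n-1)(n+2)/2, which grows quadratically in n.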