
Space-time tradeoffs

For many problems some extra space really pays off
(extra space in tables - breathing room)

input enhancement
    non comparison-based sorting
    auxiliary tables (shift tables for pattern matching)
prestructuring
    hashing
    indexing schemes (e.g., B-trees)
tables of information that do all the work
    dynamic programming
Design and Analysis of Algorithms - Chapter 7

Sorting by Counting
Algorithm ComparisonCountingSort(A[0..n-1])
    for i ← 0 to n-1 do Count[i] ← 0
    for i ← 0 to n-2 do
        for j ← i+1 to n-1 do
            if A[i] < A[j] then Count[j] ← Count[j]+1
            else Count[i] ← Count[i]+1
    for i ← 0 to n-1 do S[Count[i]] ← A[i]
Example: 62 31 84 96 19 47
Efficiency: Θ(n²) key comparisons, plus linear extra space for Count and S
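
The pseudocode above can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation; the function name `comparison_counting_sort` is an assumption introduced here.

```python
def comparison_counting_sort(a):
    """Sort a list by counting, for each element, how many elements precede it in sorted order."""
    n = len(a)
    count = [0] * n
    # Compare every pair once; the "winner" of each comparison gets its count bumped.
    for i in range(n - 1):
        for j in range(i + 1, n):
            if a[i] < a[j]:
                count[j] += 1
            else:
                count[i] += 1
    # count[i] is now the final position of a[i] in the sorted output.
    s = [None] * n
    for i in range(n):
        s[count[i]] = a[i]
    return s


result = comparison_counting_sort([62, 31, 84, 96, 19, 47])
```

Every pair of positions is compared exactly once, so the counts form a permutation of 0..n-1 even when the input contains equal values.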

Sorting by Counting (2)


Algorithm DistributionCountingSort(A[0..n-1], l, u)
    (the values in A are integers between l and u)
    for j ← 0 to u-l do D[j] ← 0
    for i ← 0 to n-1 do D[A[i]-l] ← D[A[i]-l] + 1
    for j ← 1 to u-l do D[j] ← D[j-1] + D[j]
    for i ← n-1 downto 0 do
        j ← A[i]-l; S[D[j]-1] ← A[i]; D[j] ← D[j]-1
Example: 13 11 12 13 12 12
Efficiency: Θ(n + (u-l)) time, i.e., linear when the range of values is small relative to n
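
A possible Python rendering of distribution counting is shown below. It assumes the bounds `l` and `u` are passed in explicitly and that every value in the input lies between them; the function name is illustrative only.

```python
def distribution_counting_sort(a, l, u):
    """Sort integers known to lie in the range l..u (inclusive)."""
    d = [0] * (u - l + 1)
    # Frequencies of each value.
    for x in a:
        d[x - l] += 1
    # Turn frequencies into cumulative counts (the distribution).
    for j in range(1, u - l + 1):
        d[j] += d[j - 1]
    # Place elements right to left so that equal values keep their relative order.
    s = [None] * len(a)
    for x in reversed(a):
        j = x - l
        s[d[j] - 1] = x
        d[j] -= 1
    return s


result = distribution_counting_sort([13, 11, 12, 13, 12, 12], 11, 13)
```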

String matching

pattern: a string of m characters to search for
text: a (long) string of n characters to search in

Brute force algorithm:
1. Align the pattern at the beginning of the text.
2. Moving from left to right, compare each character of the pattern to the corresponding character in the text until all characters are found to match (successful search) or a mismatch is detected.
3. While the pattern is not found and the text is not yet exhausted, realign the pattern one position to the right and repeat step 2.
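
The three steps above might look like this in Python. It is a sketch under the assumption that the search returns the index of the first match, or -1 if there is none; the function name is hypothetical.

```python
def brute_force_match(pattern, text):
    """Return the index of the first occurrence of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    # Try every alignment of the pattern against the text, left to right.
    for i in range(n - m + 1):
        j = 0
        # Compare characters left to right until a mismatch or a full match.
        while j < m and pattern[j] == text[i + j]:
            j += 1
        if j == m:
            return i
    return -1


position = brute_force_match("NOT", "NOBODY_NOTICED_HIM")
```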


String searching - History

1970: Cook shows (using finite-state machines) that the problem can be solved in time proportional to n+m.
1976: Knuth and Pratt find an algorithm based on Cook's idea; Morris independently discovers the same algorithm in an attempt to avoid backing up over the text.
At about the same time, Boyer and Moore find an algorithm that examines only a fraction of the text in most cases (by comparing characters in pattern and text from right to left, instead of left to right).
1980: Another algorithm, proposed by Rabin and Karp, virtually always runs in time proportional to n+m and has the advantage of extending easily to two-dimensional pattern matching and being almost as simple as the brute-force method.


Horspool's Algorithm

A simplified version of the Boyer-Moore algorithm that retains its key insights:
    compare pattern characters to text from right to left
    given a pattern, create a shift table that determines how much to shift the pattern when a mismatch occurs (input enhancement)

How far to shift?


Look at the first (rightmost) character in the text that was compared. Three cases:

The character is not in the pattern
    .....c...................... (c not in pattern)
    BAOBAB
The character is in the pattern (but not at the rightmost position)
    .....O...................... (O occurs once in pattern)
    BAOBAB
    .....A...................... (A occurs twice in pattern)
    BAOBAB
The rightmost characters produced a match
    .....B......................
    BAOBAB

Shift table: stores the number of characters to shift by, depending on the first character compared


Shift table

Constructed by scanning the pattern before the search begins
All entries are initialized to the length of the pattern, m.
For each character c occurring among the first m-1 characters of the pattern, update the table entry to the distance from the rightmost such occurrence of c to the end of the pattern

Algorithm ShiftTable(P[0..m-1])
    for i ← 0 to size-1 do Table[i] ← m
    for j ← 0 to m-2 do Table[P[j]] ← m-1-j
    return Table
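
As a rough Python counterpart to ShiftTable, the sketch below stores the table in a dictionary keyed by character. The `alphabet` parameter and the function name are assumptions made for illustration; the pseudocode indexes an array of `size` entries instead.

```python
import string


def shift_table(pattern, alphabet):
    """Build the shift table for a pattern over the given alphabet."""
    m = len(pattern)
    # Every character starts with the full pattern length as its shift.
    table = {c: m for c in alphabet}
    # Scan the first m-1 characters left to right; later (more rightward)
    # occurrences overwrite earlier ones, leaving the rightmost distance.
    for j in range(m - 1):
        table[pattern[j]] = m - 1 - j
    return table


table = shift_table("BAOBAB", string.ascii_uppercase)
```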

Shift table

Example for pattern BAOBAB:

Initially:
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6

Then:
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
1 2 6 6 6 6 6 6 6 6 6 6 6 6 3 6 6 6 6 6 6 6 6 6 6 6


The Algorithm

Algorithm HorspoolMatching(P[0..m-1], T[0..n-1])
    ShiftTable(P[0..m-1])
    i ← m-1
    while i <= n-1 do
        k ← 0
        while k <= m-1 and P[m-1-k] = T[i-k] do
            k ← k+1
        if k = m return i-m+1
        else i ← i+Table[T[i]]
    return -1
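
One way to express the algorithm above in Python is sketched here. Two choices are assumptions of this sketch rather than part of the slides: the shift table is a dictionary holding only the pattern's characters (any other character defaults to m), and an empty pattern is treated as matching at index 0.

```python
def horspool_matching(pattern, text):
    """Return the index of the first occurrence of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    # Shift table restricted to characters of the pattern; others shift by m.
    table = {}
    for j in range(m - 1):
        table[pattern[j]] = m - 1 - j
    i = m - 1  # index in text aligned with the last pattern character
    while i <= n - 1:
        k = 0  # number of characters matched so far, right to left
        while k <= m - 1 and pattern[m - 1 - k] == text[i - k]:
            k += 1
        if k == m:
            return i - m + 1
        # Shift according to the text character aligned with the pattern's end.
        i += table.get(text[i], m)
    return -1


position = horspool_matching("BARBER", "JIM_SAW_ME_IN_A_BARBERSHOP")
```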

Boyer-Moore algorithm

Based on the same two ideas:
    compare pattern characters to text from right to left
    given a pattern, create a shift table that determines how much to shift the pattern when a mismatch occurs (input enhancement)
Uses an additional shift table with the same idea applied to the number of matched characters


The bad-symbol shift

Based on the Horspool idea of using the extra table
However, the shift is computed differently
If c, the text character aligned with the last pattern character, is not in the pattern, then shift by m characters, the same way as in Horspool's algorithm (here c is the bad symbol)
If the mismatching character (the bad symbol) does not appear in the pattern, then shift the pattern just past it
If the mismatching character (the bad symbol) appears in the pattern, then shift to align an occurrence of that character in the pattern (lying to the left of the mismatching position) with the bad symbol in the text

The bad-symbol shift - example


The bad symbol IS NOT in the pattern
    ...SER......................
    BARBER
        BARBER     shift 4 positions
The bad symbol IS in the pattern
    ...AER......................
    BARBER
      BARBER       shift 2 positions
This shift is given by d1 = max{t1(c) - k, 1}, where t1 is the Horspool table, c is the bad symbol, and k is the number of characters matched before the mismatch
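
The two BARBER cases can be checked with a small Python sketch of the bad-symbol formula. The helper name `bad_symbol_shift` is made up for this example, and the Horspool table is built inline as a dictionary.

```python
def bad_symbol_shift(pattern, c, k):
    """Bad-symbol shift max(t1(c) - k, 1) for bad symbol c after k matched characters."""
    m = len(pattern)
    # Horspool table over the first m-1 pattern characters; the rightmost occurrence wins.
    t1 = {ch: m - 1 - j for j, ch in enumerate(pattern[:-1])}
    return max(t1.get(c, m) - k, 1)


shift_not_in_pattern = bad_symbol_shift("BARBER", "S", 2)  # S does not occur in BARBER
shift_in_pattern = bad_symbol_shift("BARBER", "A", 2)      # A occurs in BARBER
```

The `max(..., 1)` guard matters when the rightmost occurrence of the bad symbol lies to the right of the mismatch, where `t1(c) - k` would be zero or negative.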

The good-suffix shift - example


What happens if a matched suffix of size k appears again in the pattern (e.g., ABRACADABRA)?
Find the rightmost other occurrence of the suffix that is preceded by a different character than the suffix itself. Calculate the shift d2 as the distance between the two occurrences of the suffix.
If there is no such occurrence, find the longest prefix of size l<k that matches the suffix of the same size l. Calculate the shift d2 as the distance between the suffix and the prefix.


The good-suffix shift example (2)


k   pattern   d2        k   pattern   d2
1   ABCBAB    2         1   BAOBAB    2
2   ABCBAB    4         2   BAOBAB    5
3   ABCBAB    4         3   BAOBAB    5
4   ABCBAB    4         4   BAOBAB    5
5   ABCBAB    4         5   BAOBAB    5


Final rule for Boyer-Moore

Calculate the shift as
    d = d1              if k = 0
    d = max(d1, d2)     if k > 0
where d1 = max{t1(c) - k, 1} and k is the number of matched characters

Example
BESS_KNEW_ABOUT_BAOBABS
BAOBAB
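
To trace the example, the final rule can be combined with both tables in a short Python sketch. It is illustrative only: the good-suffix table is computed by a direct, unoptimized search (not the efficient preprocessing used in practice), d2 falls back to m when neither a suitable occurrence nor a matching prefix exists, and all function names are assumptions.

```python
def good_suffix_table(pattern):
    """d2[k] for k = 1..m-1, computed by direct search."""
    m = len(pattern)
    d2 = {}
    for k in range(1, m):
        suffix = pattern[m - k:]
        shift = None
        # Rightmost other occurrence not preceded by the same character as the suffix.
        for start in range(m - k - 1, -1, -1):
            if pattern[start:start + k] == suffix and (
                start == 0 or pattern[start - 1] != pattern[m - k - 1]
            ):
                shift = (m - k) - start
                break
        if shift is None:
            # Otherwise: longest prefix of size l < k equal to the suffix of size l.
            shift = m
            for l in range(k - 1, 0, -1):
                if pattern[:l] == pattern[m - l:]:
                    shift = m - l
                    break
        d2[k] = shift
    return d2


def boyer_moore(pattern, text):
    """Return the index of the first occurrence of pattern in text, or -1."""
    m, n = len(pattern), len(text)
    if m == 0:
        return 0
    t1 = {ch: m - 1 - j for j, ch in enumerate(pattern[:-1])}  # bad-symbol table
    d2 = good_suffix_table(pattern)
    i = m - 1
    while i < n:
        k = 0
        while k < m and pattern[m - 1 - k] == text[i - k]:
            k += 1
        if k == m:
            return i - m + 1
        d1 = max(t1.get(text[i - k], m) - k, 1)
        i += d1 if k == 0 else max(d1, d2[k])
    return -1


position = boyer_moore("BAOBAB", "BESS_KNEW_ABOUT_BAOBABS")
```

On this text the sketch shifts by 6, then 5, then 5, and finds the match at index 16.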

Hashing

A very efficient method for implementing a dictionary, i.e., a set with the operations:
    insert
    find
    delete
Applications:
    databases
    symbol tables

Hash tables and hash functions

Hash table: an array with indices that correspond to buckets
Hash function: determines the bucket for each record
Example: student records, key = SSN. Hash function:
    h(k) = k mod m
    (k is a key and m is the number of buckets)
    If m = 1000, where is the record with SSN = 315-17-4251 stored?
Hash function must:
    be easy to compute
    distribute keys evenly throughout the table
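
The SSN question can be worked out with a few lines of Python. The sketch assumes the SSN is read as a single integer with the dashes removed; the helper name `h` simply mirrors the slide's notation.

```python
def h(k, m):
    """Hash function h(k) = k mod m."""
    return k % m


# Treat the SSN as one integer: 315-17-4251 -> 315174251.
ssn = int("315-17-4251".replace("-", ""))
bucket = h(ssn, 1000)
```

With m = 1000 the bucket is just the last three digits of the key, here 251.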


Collisions

If h(k1) = h(k2) for two distinct keys k1 and k2, then there is a collision.
Good hash functions result in fewer collisions.
Collisions can never be completely eliminated.
Two types of hashing handle collisions differently:
    Open hashing - each bucket points to a linked list of all keys hashing to it.
    Closed hashing - one key per bucket; in case of collision, find another bucket for one of the keys
        linear probing: use the next bucket
        double hashing: use a second hash function to compute the increment
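
Closed hashing with linear probing could be sketched as below. The class name, the restriction to integer keys with h(k) = k mod m, and the choice to return the slot index are all assumptions made for illustration; deletion is left out on purpose.

```python
class LinearProbingTable:
    """Closed hashing: one key per bucket, collisions resolved by linear probing."""

    def __init__(self, m):
        self.slots = [None] * m  # None marks an empty bucket

    def insert(self, key):
        """Store key and return the index of the bucket it ends up in."""
        m = len(self.slots)
        for step in range(m):
            idx = (key % m + step) % m  # next bucket, wrapping around
            if self.slots[idx] is None or self.slots[idx] == key:
                self.slots[idx] = key
                return idx
        raise OverflowError("table is full")

    def find(self, key):
        """Return the bucket index holding key, or -1 if it is absent."""
        m = len(self.slots)
        for step in range(m):
            idx = (key % m + step) % m
            if self.slots[idx] is None:
                return -1  # an empty bucket ends the probe sequence
            if self.slots[idx] == key:
                return idx
        return -1


table = LinearProbingTable(7)
placed = [table.insert(k) for k in (10, 17, 24)]  # all three keys hash to bucket 3
```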


Open hashing
If the hash function distributes keys uniformly, the average length of a linked list will be α = n/m (the load factor)
Average number of probes: successful search S ≈ 1 + α/2, unsuccessful search U = α
Worst case is still linear!
Open hashing still works if n > m.
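
A minimal open-hashing sketch follows. It uses Python lists in place of linked lists and integer keys with h(k) = k mod m; the class and method names are hypothetical.

```python
class ChainedHashTable:
    """Open hashing: each bucket holds a chain of all keys hashing to it."""

    def __init__(self, m):
        # One chain per bucket (a Python list stands in for a linked list).
        self.buckets = [[] for _ in range(m)]
        self.n = 0

    def _chain(self, key):
        return self.buckets[key % len(self.buckets)]

    def insert(self, key):
        chain = self._chain(key)
        if key not in chain:
            chain.append(key)
            self.n += 1

    def find(self, key):
        return key in self._chain(key)

    def delete(self, key):
        chain = self._chain(key)
        if key in chain:
            chain.remove(key)
            self.n -= 1

    def load_factor(self):
        """Average chain length n/m."""
        return self.n / len(self.buckets)


table = ChainedHashTable(3)
for key in range(10):  # n = 10 keys in m = 3 buckets, so n > m
    table.insert(key)
```

Because each chain can grow without bound, the table keeps working after n exceeds m; only the average chain length increases.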


Closed hashing

Does not work if n > m.
Avoids pointers.
Deletions are not straightforward.
Number of probes to insert/find/delete a key depends on the load factor α = n/m (hash table density). For linear probing, approximately:
    successful search: (½) (1 + 1/(1 - α))
    unsuccessful search: (½) (1 + 1/(1 - α)²)
As the table gets filled (α approaches 1), the number of probes increases dramatically.
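
To get a feel for how quickly the probe counts grow, the standard linear-probing approximations ½(1 + 1/(1 - α)) and ½(1 + 1/(1 - α)²) can be evaluated for a few load factors. The function name and the sample load factors below are chosen for illustration.

```python
def expected_probes(alpha):
    """Approximate probe counts for linear probing at load factor alpha (0 <= alpha < 1)."""
    successful = 0.5 * (1 + 1 / (1 - alpha))
    unsuccessful = 0.5 * (1 + 1 / (1 - alpha) ** 2)
    return successful, unsuccessful


# Evaluate the approximations for a half-full, three-quarters-full and 90%-full table.
estimates = {alpha: expected_probes(alpha) for alpha in (0.5, 0.75, 0.9)}
```

Under these approximations a half-full table needs about 1.5 and 2.5 probes, a 75% full table about 2.5 and 8.5, and a 90% full table about 5.5 and 50.5.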

