
Dr. Surender Baswana
FR 6.2 Informatik
Universität des Saarlandes
WS 2004/5

Exercises for Randomized Algorithms


6. Assignment

Due: 10 February.

Be very rigorous in your arguments. Specify any probability Lemma/Theorem explicitly before you use it.
Exercise 1 (1,1,2,2,2,2)

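a) In this problem we shall use a different fingerprinting technique to solve the pattern matching problem.
The idea is to map any bit string $s$ to a $2 \times 2$ matrix $M(s)$, as follows.

For the empty string $\epsilon$, $M(\epsilon) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$.

$M(0) = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$, $M(1) = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$.

For non-empty strings $x$ and $y$, $M(xy) = M(x) \times M(y)$.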
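As a sanity check on the definition, the following Python sketch computes $M(x)$ by multiplying the generator matrices from left to right. It assumes M(0) = [[1, 0], [1, 1]] and M(1) = [[1, 1], [0, 1]]; the names `mat_mul` and `fingerprint` are illustrative and not part of the assignment.

```python
from typing import List

Matrix = List[List[int]]

# Assumed generator matrices: identity for the empty string,
# M(0) = [[1, 0], [1, 1]] and M(1) = [[1, 1], [0, 1]].
IDENTITY: Matrix = [[1, 0], [0, 1]]
M0: Matrix = [[1, 0], [1, 1]]
M1: Matrix = [[1, 1], [0, 1]]


def mat_mul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two 2 x 2 integer matrices."""
    return [
        [a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]],
        [a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]],
    ]


def fingerprint(bits: str) -> Matrix:
    """Return M(bits), using M(xy) = M(x) * M(y) one symbol at a time."""
    result = IDENTITY
    for ch in bits:
        if ch not in "01":
            raise ValueError("expected a bit string over {0, 1}")
        result = mat_mul(result, M1 if ch == "1" else M0)
    return result
```

Enumerating all short bit strings with this function is a quick way to observe, though of course not to prove, that distinct strings receive distinct matrices.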

Show that this fingerprint function has the following properties.


(a) $M(x)$ is well defined for all $x \in \{0,1\}^*$. Moreover, $M(x) = M(y) \implies x = y$.

(b) For $x \in \{0,1\}^n$, the entries in $M(x)$ are bounded by the Fibonacci number $F_n$.

(c) By considering the matrices $M(x)$ modulo a suitable prime $p$, show how you would perform efficient fingerprint matching. If $n$ is the length of the two bit strings $x, y$ whose equality we want to
test, how large should the domain be from which we select the random prime $p$ to ensure
that our algorithm gives the correct result with probability at least $1 - 1/n$?
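The modular test in part (c) can be sketched in Python as follows. This is only an illustration under stated assumptions: the generator matrices are taken to be M(0) = [[1, 0], [1, 1]] and M(1) = [[1, 1], [0, 1]], the helper names are hypothetical, and the bound `upper` on the prime domain is left to the caller, since choosing it is exactly what the exercise asks for.

```python
import random


def is_prime(n: int) -> bool:
    """Trial division; adequate only for the small bounds used in a sketch."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def random_prime(upper: int, rng: random.Random) -> int:
    """Sample integers uniformly from [2, upper] until one is prime (upper >= 2)."""
    while True:
        candidate = rng.randint(2, upper)
        if is_prime(candidate):
            return candidate


def fingerprint_mod(bits: str, p: int) -> list:
    """Return M(bits) with every entry reduced modulo p."""
    a, b, c, d = 1 % p, 0, 0, 1 % p  # identity matrix [[a, b], [c, d]]
    for ch in bits:
        if ch == "0":      # right-multiply by the assumed M(0) = [[1, 0], [1, 1]]
            a, c = (a + b) % p, (c + d) % p
        elif ch == "1":    # right-multiply by the assumed M(1) = [[1, 1], [0, 1]]
            b, d = (a + b) % p, (c + d) % p
        else:
            raise ValueError("expected a bit string over {0, 1}")
    return [[a, b], [c, d]]


def probably_equal(x: str, y: str, upper: int, rng: random.Random) -> bool:
    """Compare x and y through their fingerprints modulo a random prime.

    Equal strings always yield True; unequal strings may yield a false
    positive whose probability depends on the caller-chosen bound `upper`.
    """
    p = random_prime(upper, rng)
    return fingerprint_mod(x, p) == fingerprint_mod(y, p)
```

Because the entries are reduced after every step, each update works with numbers smaller than $p$, which is what makes the comparison cheap.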
b) Consider the two-dimensional version of the pattern matching problem. The text is an $n \times n$ matrix $X$ and
the pattern is an $m \times m$ matrix $Y$. A pattern match occurs if $Y$ appears as a contiguous sub-matrix of $X$.
There is a naive deterministic algorithm that finds all matches in $O((n-m+1)^2 m^2)$ time. Our
objective is to design an efficient randomized algorithm based on the fingerprinting technique mentioned
above.
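For reference, the naive deterministic algorithm mentioned above can be sketched in Python as below. The function name `naive_matches` and the list-of-lists representation of the matrices are assumptions made for illustration.

```python
from typing import List, Tuple


def naive_matches(X: List[List[int]], Y: List[List[int]]) -> List[Tuple[int, int]]:
    """Return the top-left corners (row, col) of all occurrences of Y in X.

    X is n x n and Y is m x m. Each of the (n - m + 1)^2 candidate positions
    is checked with up to m^2 comparisons.
    """
    n, m = len(X), len(Y)
    matches = []
    for r in range(n - m + 1):
        for c in range(n - m + 1):
            if all(X[r + i][c + k] == Y[i][k] for i in range(m) for k in range(m)):
                matches.append((r, c))
    return matches
```

A randomized algorithm should return the same list with high probability, so this routine is also useful as a reference when testing a fingerprint-based implementation on small inputs.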
Convert the matrix $Y$ into an $m^2$-bit vector using the row-major format. The possible occurrences of $Y$
in $X$ are the $m^2$-bit vectors $X(j)$ obtained by taking all $(n-m+1)^2$ sub-matrices of $X$ in row-major
form. It is clear that we can apply the above-mentioned fingerprinting technique to this scenario.
(a) Computing the fingerprint of each $X(j)$ from scratch would take $\Theta(m^2)$ time, so we don't get
an efficient randomized algorithm. To make the algorithm efficient, explain how the fingerprint of
each $X(j)$ can be computed at a small incremental cost of $O(m)$. Give all details.

(b) Since we are computing the fingerprint of each $X(j)$ incrementally, there is a likelihood of error
propagation (think it over). How large should the domain be from which we select a prime number to
ensure that the probability of the event "$X(j) \neq Y$ but the fingerprint of $X(j)$ is the same as
that of $Y$" is very small for arbitrarily large $j \le n - m + 1$?
(c) What is the overall complexity of the entire randomized algorithm that finds all matches of pattern
$Y$ in $X$ with probability $1 - 1/n$?
Exercise 2 (1,4,5)
a) Ponder over it (you need not submit this part, but it will help you understand and appreciate
part b) better): In the lecture held on 31/1/2005, we described a Monte Carlo algorithm for finding
a solution of 2-SAT. If you try to apply the same strategy to 3-SAT, what problem do you face?
b) Let $G$ be a 3-colorable graph, that is, we can color all vertices using three colors such that no edge is
monochromatic. Consider the following algorithm for coloring the vertices of $G$ with 2 colors so that
no triangle of $G$ is monochromatic.
While there is a monochromatic triangle, the algorithm chooses one such triangle and changes the color of a randomly selected vertex of that triangle.
Our aim is to derive an upper bound of $n^2/6$ on the expected number of such re-coloring steps before the
algorithm finds a 2-coloring with the desired property.
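The recoloring procedure can be simulated with a short Python sketch. Several details here are assumptions for illustration only: vertices are numbered 0 to n-1, the initial coloring assigns color 0 to every vertex, the first monochromatic triangle found is the one chosen, and `max_steps` is a safety cap that is not part of the algorithm.

```python
import itertools
import random
from typing import List, Tuple


def triangles_of(n: int, edges: List[Tuple[int, int]]) -> List[Tuple[int, int, int]]:
    """List all triangles (a, b, c) with a < b < c in the given graph."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return [
        (a, b, c)
        for a, b, c in itertools.combinations(range(n), 3)
        if b in adj[a] and c in adj[a] and c in adj[b]
    ]


def recolor_until_valid(n: int, edges: List[Tuple[int, int]], rng: random.Random,
                        max_steps: int = 10**6) -> Tuple[List[int], int]:
    """Run the recoloring loop; return the final 2-coloring and the step count."""
    tris = triangles_of(n, edges)
    color = [0] * n  # assumed starting point: every vertex has color 0
    steps = 0
    while steps < max_steps:
        # Pick some monochromatic triangle, if any remains.
        bad = next((t for t in tris if color[t[0]] == color[t[1]] == color[t[2]]), None)
        if bad is None:
            return color, steps
        v = rng.choice(bad)   # uniformly random vertex of that triangle
        color[v] ^= 1         # flip its color
        steps += 1
    raise RuntimeError("step cap reached before a valid coloring was found")
```

Averaging the returned step count over many seeds on small 3-colorable graphs gives an empirical value to compare against the bound one derives analytically.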
We shall analyse this algorithm by first suitably defining the state of the algorithm and then showing a
mapping between the state changes of the algorithm and a random walk on a line. A simple idea that
might come to mind is the following: since the graph is 3-colorable, it has a valid 2-coloring with no
monochromatic triangle. So let $f_0$ be such a 2-coloring of the graph with no monochromatic triangle. Let
us define the state of the algorithm to be the number of vertices that are colored with the same color as in the
coloring $f_0$. Here again, you will face a difficulty similar (but not identical) in nature to part a) if you attempt
to show a polynomial bound on the expected number of re-colorings required.
So we need to define the state of the algorithm in a slightly different manner. We introduce a new coloring
terminology. A fair partial 2-coloring of the graph is a pair of disjoint color sets $V_1$ and $V_2$ of vertices
(with color 1 and 2 respectively) such that each triangle in the graph has exactly one vertex from set $V_1$
and exactly one vertex from set $V_2$ (hence a fair coloring). Note that $V_1 \cap V_2 = \emptyset$ and $V_1 \cup V_2 \subseteq V$, that is,
not all vertices of the graph are necessarily colored (hence a partial coloring).
(a) Show that $G$ has a fair partial 2-coloring.
(b) Let $f_0$ be a fair partial 2-coloring of $G$. Let $f$ be the 2-coloring that the algorithm has currently
arrived at. Then the state of the algorithm, $N(f)$, is the number of vertices with exactly the same color as
in the fair partial 2-coloring $f_0$. More precisely, each vertex contributes to $N(f)$ as follows: if it
is colored in $f$ with the same color as in $f_0$, it contributes 1 to $N(f)$. Otherwise it is colored with a different
color in $f_0$ or is not colored at all in $f_0$ (this is possible since $f_0$ is a partial coloring), and then it contributes
0 to $N(f)$.
Let $f$ be a coloring the algorithm has reached, and let $f'$ be the coloring after one recoloring performed on $f$ by the algorithm as described above (recoloring a vertex picked at random from
a monochromatic triangle). What are the possible values of $N(f')$? What is the corresponding
probability of each state change? Give appropriate arguments in support of your answer.
(c) Analyse the number of re-colorings required by the algorithm with the new state defined above and
show that their expected number is bounded by $O(n^2/6)$. Show all calculations in support of your
answer.
