Suff Automata
. .
: . .
-
2007
1.
1.1.
1.2.
1.3.
1.4.
1.5.
1.6.
1.7.
2.
2.1.
2.2.
2.3.
2.4.
2.5.
3.
3.1.
3.1.1.
3.1.2.
3.1.3.
3.1.4.
3.2.
[1]. [1]
.
, [1]. ,
[2], - [3] [4]
[1] .
,
.
. ,
( [5],
- [5]),
.
, ,
, .
,
. ,
.
1.
1.1.
1. [5, . 3] , .
. .
$\Sigma^+ = \Sigma^* \setminus \{\varepsilon\}$.
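2. [5, p. 3] The length of a word w, denoted |w|, is the number of letters in w.
3. [5, p. 4] A word w is a factor of a word u if there exist words x and y such that u = xwy.
4. [5, p. 4] A word w is a prefix of a word u if there exists a word y such that u = wy.
5. [5, p. 4] A word w is a suffix of a word u if there exists a word x such that u = xw.
6. The set of all factors of a word x is denoted Fact(x).
7. The set of all suffixes of a word x is denoted Suff(x).
8. [5, p. 4] The prefix of length k of a word w is denoted w[1 . . . k].
9. The k-th letter of a word w is denoted w[k].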
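The notions of suffix and factor can be made concrete with a brute-force enumeration. The Java sketch below is only an illustration: the class name `WordSets` and its method names are hypothetical, and it assumes that the empty word counts as both a suffix and a factor.

```java
import java.util.Set;
import java.util.TreeSet;

final class WordSets {
    /** Suff(x): every w with x = u + w for some word u (empty word included). */
    static Set<String> suffixes(String x) {
        Set<String> result = new TreeSet<>();
        for (int i = 0; i <= x.length(); i++) {
            result.add(x.substring(i));
        }
        return result;
    }

    /** Fact(x): every w with x = u + w + v for some words u and v. */
    static Set<String> factors(String x) {
        Set<String> result = new TreeSet<>();
        for (int i = 0; i <= x.length(); i++) {
            for (int j = i; j <= x.length(); j++) {
                result.add(x.substring(i, j));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(suffixes("aabb"));
        System.out.println(factors("aabb"));
    }
}
```

Both methods take quadratic time or worse, so they are useful only for checking small examples such as aabb, not as building blocks for the linear-time structures discussed later.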
1.2.
14.
C y Repry (C), .
15.
C y Repry (C), .
4.
,
. :
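Let $w_{max}$ and $w_{min}$ be the longest and the shortest representative of a class C of the word y. Then $s \in Repr_y(C)$
if and only if $s \in Suff(w_{max})$ and $|s| \ge |w_{min}|$.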
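Let $s \in Repr_y(C)$. We show that $s \in Suff(w_{max})$ and $|s| \ge |w_{min}|$.
By definition 14, $w_{max} \in Repr_y(C)$ and $|w_{max}| \ge |s|$. By statement 1,
it follows that $s \in Suff(w_{max})$. Moreover, by definition 15,
$|s| \ge |w_{min}|$, as required.
Conversely, let $s \in Suff(w_{max})$ and $|s| \ge |w_{min}|$. We show that $s \in Repr_y(C)$.
By definition 15 and statement 1, $w_{min} \in Suff(s)$. By definitions 14
and 15, $w_{max} \equiv_y w_{min}$. Hence, by statement 2, $w_{max} \equiv_y s \equiv_y w_{min}$,
as required.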
.
5.
.
Ry (x). wmax
, lenmin . ,
4 $Repr_y(R_y(x)) = \{z \mid z \in Suff(w_{max}),\ |z| \ge len_{min}\}$.
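The description of a class by its longest representative and a minimum length can be checked by brute force on small words. The Java sketch below assumes that the right context is defined as $R_y(x) = \{z \mid xz \in Suff(y)\}$; the class `RightContexts` and its method names are hypothetical and serve only this illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class RightContexts {
    /** R_y(x) = { z : xz is a suffix of y } (definition assumed by this sketch). */
    static Set<String> rightContext(String y, String x) {
        Set<String> result = new TreeSet<>();
        for (int i = 0; i + x.length() <= y.length(); i++) {
            if (y.startsWith(x, i)) {
                result.add(y.substring(i + x.length()));
            }
        }
        return result;
    }

    /** Groups all factors of y by right context: class -> its representatives. */
    static Map<Set<String>, Set<String>> classes(String y) {
        Map<Set<String>, Set<String>> groups = new HashMap<>();
        for (int i = 0; i <= y.length(); i++) {
            for (int j = i; j <= y.length(); j++) {
                String x = y.substring(i, j);
                groups.computeIfAbsent(rightContext(y, x), k -> new TreeSet<>()).add(x);
            }
        }
        return groups;
    }

    /** Checks that every class equals { z in Suff(wMax) : |z| >= lenMin }. */
    static boolean characterizationHolds(String y) {
        for (Set<String> reps : classes(y).values()) {
            String wMax = "";
            int lenMin = Integer.MAX_VALUE;
            for (String r : reps) {
                if (r.length() > wMax.length()) wMax = r;
                lenMin = Math.min(lenMin, r.length());
            }
            Set<String> expected = new TreeSet<>();
            for (int k = 0; wMax.length() - k >= lenMin; k++) {
                expected.add(wMax.substring(k));
            }
            if (!reps.equals(expected)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(classes("aabb").values());
        System.out.println(characterizationHolds("aabb"));
    }
}
```

For aabb the factors fall into six classes, for example {bb, abb, aabb}, whose members are exactly the suffixes of aabb of length at least 2.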
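6. For every class C and every length len there is at most one word y of length len such that $R_w(y) = C$.
Suppose that words $y_1$ and $y_2$ of length len satisfy
$C = R_w(y_1) = R_w(y_2)$. We show that $y_1 = y_2$. By statement 1, $y_1 \in Suff(y_2)$
or $y_2 \in Suff(y_1)$. Since $y_1$ and $y_2$ have the same length,
it follows that $y_1 = y_2$, as required.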
1.3.
1.4.
17. T w . ( ). ,
, . |w|+1
, , . , ()
() w.
18. [1, . 122]
,
.
.
19. [1, . 122] v v v.
20. v .
21. u
RT (u) , u , u.
22. , v
Rw (x), RT (v) = Rw (x).
23. v u ,
RT (v) = RT (u).
. 1 aabb. .
Fig. 1. aabb
9. v v.
v x. , RT (v) =
Rw (x).
, y RT (v) , y Rw (x). y
RT (v) , xy w, 10
, y Rw (x).
10. w , , .
Rw (x) . Rw (x) ,
x w. , x w, ,
v x.
9.
1.5.
() ([5,
. 108], [1, . 121]). .
24. [5, . 108]
w, , ,
, .
. 2 aabb.
.
25. [1, . 122] ,
(u, v), (u, v) .
u (u, v)
, .
. 10 ,
w
( ).
Fig. 2. aabb
1.6.
Fig. 3. aabb
27. , s Rw (x), RA (s) = Rw (x).
11. [5]
w.
, , , ,
.
12. , s, .
, , 26, 13 9.
28. s , x.
s , s , s(x).
suffix(s).
29. s
reprmax(s).
. , ([5, . 126]) ( - ), , ,
, .
13. Let s be a non-initial state. The shortest representative of s has length
1 + reprmax(suffix(s)).
28 29,
7.
14. Let x be a representative of a state s1, and let s2 be the state
whose representatives include x[2 . . . |x|].
Then:
if |x| = reprmax(suffix(s1)) + 1, then s2 = suffix(s1);
if |x| > reprmax(suffix(s1)) + 1, then s2 = s1.
, |x| > reprmax(suffix(s1)) + 1. 28
16 , x[2 . . . |x|] Rw (x). , , x[2 . . . |x|],
s1 .
, |x| = reprmax(suffix(s1)) + 1. ,
x[2 . . . |x|] Rw (sw (x)), 16 , , x[2 . . . |x|],
suffix(s1).
30. [5, . 132] , s , :
s
s
1.7.
Fig. 4. aabb
15. A w.
A x s1 s2.
$x = w[|w| - |y| - |x| + 1 \ldots |w| - |y|]$, $y \in R_A(s_2)$.
, A x
s1 s2 y RA (s2), , xy RA (s1). ,
, , xy Suf f (w).
xy w,
$xy = w[|w| - |xy| + 1 \ldots |w|]$.
Hence $x = w[|w| - |xy| + 1 \ldots |w| - |y|] = w[|w| - |x| - |y| + 1 \ldots |w| - |y|]$, as required.
32. $longest(s) = \max_{x \in R_A(s)} |x|$, that is, the length of the longest word in $R_A(s)$.
33. fork, :
s , fork(s) = s
s , fork(s) = t, t , s
16. w.
c s1 s2 . :
w s1 fork(s2) y,
c.
$y = w[|w| - longest(s_2) \ldots |w| - longest(fork(s_2))]$.
, s2
fork(s2), , , , .
31.
, $1 + longest(s_2) = |y| + longest(fork(s_2))$, 15.
, w fork longest,
O(|w|).
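To make the functions longest and fork concrete, the following Java sketch builds a suffix automaton with the standard online construction and then fills both tables in one pass over the states. Several points are assumptions of the sketch rather than statements of the text: the alphabet is a to z, longest(s) is taken to be the length of the longest path from s to a terminal state, and fork(s) is taken to be s itself when s is terminal or does not have exactly one outgoing transition, and otherwise the fork of its single successor. The class name `ForkLongest` and its members are hypothetical.

```java
import java.util.Arrays;

final class ForkLongest {
    final String w;
    final int[][] next;          // next[s][c - 'a'] is the target state, or -1
    final int[] link, len;       // suffix link; length of the longest word reaching s
    final boolean[] terminal;
    final int[] longest, fork;
    int size = 1, last = 0;      // state 0 is the initial state

    ForkLongest(String w) {
        this.w = w;
        int cap = 2 * w.length() + 2;
        next = new int[cap][26];
        for (int[] row : next) Arrays.fill(row, -1);
        link = new int[cap];
        len = new int[cap];
        terminal = new boolean[cap];
        longest = new int[cap];
        fork = new int[cap];
        link[0] = -1;
        for (char ch : w.toCharArray()) extend(ch - 'a');
        for (int p = last; p != -1; p = link[p]) terminal[p] = true;
        computeLongestAndFork();
    }

    /** One step of the standard online suffix automaton construction. */
    private void extend(int c) {
        int cur = size++;
        len[cur] = len[last] + 1;
        int p = last;
        for (; p != -1 && next[p][c] == -1; p = link[p]) next[p][c] = cur;
        if (p == -1) {
            link[cur] = 0;
        } else {
            int q = next[p][c];
            if (len[p] + 1 == len[q]) {
                link[cur] = q;
            } else {
                int clone = size++;
                next[clone] = next[q].clone();
                len[clone] = len[p] + 1;
                link[clone] = link[q];
                for (; p != -1 && next[p][c] == q; p = link[p]) next[p][c] = clone;
                link[q] = link[cur] = clone;
            }
        }
        last = cur;
    }

    /** Transitions strictly increase len, so decreasing len visits targets first. */
    private void computeLongestAndFork() {
        Integer[] order = new Integer[size];
        for (int s = 0; s < size; s++) order[s] = s;
        Arrays.sort(order, (a, b) -> len[b] - len[a]);
        for (int s : order) {
            int out = 0, only = -1;
            for (int t : next[s]) {
                if (t == -1) continue;
                out++;
                only = t;
                longest[s] = Math.max(longest[s], 1 + longest[t]);
            }
            fork[s] = (out == 1 && !terminal[s]) ? fork[only] : s;
        }
    }

    /** Label y of the tree edge entered through a transition into state s2 (s2 != 0). */
    String edgeLabel(int s2) {
        int n = w.length();
        return w.substring(n - longest[s2] - 1, n - longest[fork[s2]]);
    }

    /** Every label must start with the transition letter and lead to fork(s2). */
    boolean labelsConsistent() {
        for (int s1 = 0; s1 < size; s1++) {
            for (int c = 0; c < 26; c++) {
                int s2 = next[s1][c];
                if (s2 == -1) continue;
                String y = edgeLabel(s2);
                int s = s1;
                for (int i = 0; i < y.length() && s != -1; i++) s = next[s][y.charAt(i) - 'a'];
                if (y.charAt(0) != 'a' + c || s != fork[s2]) return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        ForkLongest a = new ForkLongest("aabb");
        System.out.println(a.edgeLabel(2) + " " + a.labelsConsistent());
    }
}
```

The sort makes this sketch run in O(|w| log |w|); replacing it with a counting sort on len would keep the whole computation linear. `edgeLabel` uses 0-based `substring` indices, which is why the 1-based bounds of the formula for y are shifted by one at the left end.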
2.
2.1.
34. , u
s , RT (u) = RA (s).
. ,
, .
17. w
w.
26.
s
, .
.
.
,
.
6, . , ,
, s,
.
, s
( ).
.
18. $\langle s, |r| \rangle$, s
, r s.
2.2.
, , ,
$\langle s, |r| \rangle$, s
, r s.
2.3.
.
.
1
return trie ;
7
8
SuffixAutomaton auto ,
10
11
int state )
{
12
13
14
15
16
17
18
19
20
21
return node ;
22
}
, buildTrie(A)
A. , walk(T , A, s)
T , s A,
.
x, x RA (s) , x
walk .
.
$x = \varepsilon$. RA (s) , s . 13-15 , node
, RA (s).
.
, $x \ne \varepsilon$. RA (s) , A,
s.
, $x \in R_A(s)$, T x.
x RA (s) , A, s, x. , , , A
s x[1]. ,
16-20. , newNode x[2 . . . |x|]. ,
x.
, $x \notin R_A(s)$, T x.
. , x. , node
x[1]. 19. , newNode
x[2 . . . |x|], ,
A x[2 . . . |x|], s2 , , x, s,
$x \notin R_A(s)$. , ,
node x .
. , O(n), n .
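The idea behind buildTrie and walk, unfolding the automaton into a tree by creating a fresh node for every path, can be sketched in a few lines of Java. This is not the original listing: the transition table for aabb is hand-derived, `walk` here takes only the state, and the helper methods `countNodes` and `acceptedWords` are hypothetical additions used to inspect the result.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

final class TrieFromAutomaton {
    /** Trie node: children by letter, plus a flag meaning "a suffix ends here". */
    static final class Node {
        final Map<Character, Node> children = new TreeMap<>();
        boolean terminal;
    }

    // Hand-derived suffix automaton of "aabb" (an assumption of this sketch).
    // NEXT[s][0] is the target on 'a', NEXT[s][1] the target on 'b', -1 if absent.
    static final int[][] NEXT = {
        {1, 5}, {2, 3}, {-1, 3}, {-1, 4}, {-1, -1}, {-1, 4}};
    static final boolean[] TERMINAL = {true, false, false, false, true, true};

    /** Unfolds everything reachable from the given state into a fresh subtree. */
    static Node walk(int state) {
        Node node = new Node();
        node.terminal = TERMINAL[state];
        for (int c = 0; c < NEXT[state].length; c++) {
            if (NEXT[state][c] != -1) {
                node.children.put((char) ('a' + c), walk(NEXT[state][c]));
            }
        }
        return node;
    }

    static Node buildTrie() {
        return walk(0);   // state 0 is the initial state
    }

    static int countNodes(Node node) {
        int total = 1;
        for (Node child : node.children.values()) total += countNodes(child);
        return total;
    }

    /** All words spelled by paths from the given node to terminal nodes. */
    static Set<String> acceptedWords(Node node, String prefix) {
        Set<String> words = new TreeSet<>();
        if (node.terminal) words.add(prefix);
        for (Map.Entry<Character, Node> e : node.children.entrySet()) {
            words.addAll(acceptedWords(e.getValue(), prefix + e.getKey()));
        }
        return words;
    }

    public static void main(String[] args) {
        Node trie = buildTrie();
        System.out.println(countNodes(trie) + " " + acceptedWords(trie, ""));
    }
}
```

Each call to `walk` creates exactly one node and does constant work per outgoing transition, so over a fixed alphabet the running time is proportional to the number of trie nodes, which matches the O(n) bound stated above. The recursion depth equals the length of the word, so long inputs would need an explicit stack.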
2.4.
.
.
1
return tree ;
7
8
10
11
int state )
{
12
13
14
15
16
17
18
19
edge . getBegin () ,
20
edge . getEnd () );
21
22
return node ;
23
}
, ,
, , . ,
. ,
, , , .
.
22. ,
walk, , .
, node, walk,
, , A
state .
, . ,
, , 13-15 node
, .
. O(n),
n .
, ,
.
, , , ,
,
w.
15, w .
.
fork longest.
2.
.
1
return tree ;
7
8
SuffixAutomaton auto ,
10
11
int state )
{
12
13
14
15
16
17
18
19
20
21
22
23
return node ;
24
}
, .
2.5.
,
w :
w O(|w|);
, s
fork(s) longest(s);
, .
, w
:
;
,
.
, , .
, ,
O(|w|) fork
longest, , , .
. 1 , .
,
.
1.
{a, b}
. .
100000     0.177    0.125
200000     0.388    0.266
300000     0.596    0.428
400000     0.819    0.580
500000     1.035    0.730
600000     1.264    0.884
700000     1.470    1.047
800000     1.694    1.216
900000     1.923    1.342
1000000    2.153    1.502
.
, -
. ,
,
.
, ,
,
.
3.
. , . , .
3.1.
, :
;
, ;
;
;
.
.
,
h , i.
, h
, i, .
, O(|w|) . ,
h , i.
. .
, -
. . v1 v2
, :
v1 v2 s1 s2 ,
s1 s2 ;
v1 v2 ,
v1 v2.
h , i .
, num , ,
lenmin ,
len
num + len lenmin .
, O(|w|) O(1).
.
3.1.1.
, v s , $R_T(v) = Suff(w) = R_A(s)$.
, , h , 0 i, .
3.1.2.
, .
v
, RT (v).
s ,
v. v ,
RA (s). , , , s .
, v , s
.
,
O(1).
3.1.3.
.
v <s, len>.
14 , :
len v
s, ,
$\langle s, len - 1 \rangle$;
s, , $\langle suffix(s), len - 1 \rangle$.
, O(1).
3.1.4.
v
, c.
x v.
v s .
s
c, xc w, , ,
, v ,
c.
,
s t c. f = fork(t). 31 33 ,
s f.
, 21 16 , s f y, c. , $y = w[|w| - longest(s_2) \ldots |w| - longest(fork(s_2))]$.
, fork longest,
v , c, , O(1)
c,
.
3.2.
(LCP ) .
1 s , .
si =
s[i . . . |s|] sj = s[j . . . |s|].
,
O(|s|) , O(log |s|) O(1).
,
:
;
;
k ;
;
.
LCP
.
s. ,
.
.
s,
.
, ,
.
, ,
,
3.1.
[6],
O(log |s|).
:
s;
.
, ;
.
:
t1 t2
;
, , t1 t2 ;
.
, ,
. , :
s
;
.
2.
:
1
globalTime = 0;
7
8
9
10
int state ,
11
int depth )
12
13
14
15
16
17
18
19
20
21
22
23
24
}
}
, time2depth suffix2time. , walk
.
, . ,
walk node .
node , state . , node , , depth.
14 suffix2time , depth.
16 time2depth node.
17 - 23 node
. node, time2depth 22.
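The roles of time2depth and suffix2time can be illustrated with a small Java sketch that answers LCP queries through a depth-first tour of a tree of suffixes. It deliberately simplifies the setting and is not the original listing: it walks a naive, quadratic-size trie of the suffixes of s followed by an assumed sentinel `$` instead of the automaton, it uses the current size of `time2depth` as the global time, and it answers each query by a linear scan where the text relies on an LCA or range-minimum structure [6]. The class name `SuffixLcp` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class SuffixLcp {
    private static final class Node {
        final Map<Character, Node> children = new TreeMap<>();
        int suffix = -1;   // start index of the suffix that ends here, or -1
    }

    private final String s;
    final List<Integer> time2depth = new ArrayList<>();   // depth seen at each tour step
    final int[] suffix2time;                              // tour step of each suffix's leaf

    SuffixLcp(String s) {
        this.s = s;
        this.suffix2time = new int[s.length()];
        Node root = new Node();
        String text = s + "$";   // sentinel (assumed absent from s): suffixes end at leaves
        for (int i = 0; i < s.length(); i++) {
            Node node = root;
            for (int k = i; k < text.length(); k++) {
                node = node.children.computeIfAbsent(text.charAt(k), c -> new Node());
            }
            node.suffix = i;
        }
        walk(root, 0);
    }

    /** Euler tour: record the depth on entry and again after returning from each child. */
    private void walk(Node node, int depth) {
        if (node.suffix >= 0) suffix2time[node.suffix] = time2depth.size();
        time2depth.add(depth);
        for (Node child : node.children.values()) {
            walk(child, depth + 1);
            time2depth.add(depth);
        }
    }

    /** Longest common prefix of the suffixes starting at 0-based positions i and j. */
    int lcp(int i, int j) {
        if (i == j) return s.length() - i;
        int lo = Math.min(suffix2time[i], suffix2time[j]);
        int hi = Math.max(suffix2time[i], suffix2time[j]);
        int best = Integer.MAX_VALUE;
        for (int t = lo; t <= hi; t++) best = Math.min(best, time2depth.get(t));
        return best;   // depth of the lowest common ancestor of the two leaves
    }

    public static void main(String[] args) {
        SuffixLcp index = new SuffixLcp("aabb");
        System.out.println(index.lcp(0, 1) + " " + index.lcp(2, 3));
    }
}
```

The minimum depth between the two tour positions is the depth of the lowest common ancestor of the two leaves, which is why a range-minimum structure over `time2depth` turns the scan in `lcp` into an O(log |s|) or O(1) query.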
. 2 ,
, 1.3 - 1.9 .
2. LCP .
O(log |w|).
{a, b}
. .
100000     0.163    0.234
200000     0.323    0.495
300000     0.511    0.769
400000     0.683    1.017
500000     0.852    1.289
600000     1.039    1.552
700000     1.213    1.826
800000     1.381    2.103
900000     1.563    2.381
1000000    1.735    2.631
,
.
, ,
. .
.
.
( -) , . , . ,
, .
, . ,
.
,
,
.
. , .
,
.
, :
, ,
. , , ,
.
1. . , . . .: ; , 2003.
2. Weiner P. Linear pattern matching algorithms // Proc. of the 14th IEEE
Symp. on Switching and Automata Theory. 1973, pp. 1-11.
3. McCreight E. M. A space-economical suffix tree construction algorithm // J.
ACM. 1976. Vol. 23, pp. 262-272.
4. Ukkonen E. On-line construction of suffix trees // Algorithmica. 1995. Vol. 14,
pp. 249-260.
5. Lothaire M. Applied Combinatorics on Words // Encyclopedia of
Mathematics and its Applications, 2005. Vol. 90. Cambridge University
Press, Cambridge.
6. Bender M., Farach-Colton M. The LCA Problem Revisited // LATIN 2000,
pp. 88-94.
7. ., ., . , . .: , 2002.
8. ., ., . : .
.: , 2000.
9. Sahni S. Data Structures, Algorithms, & Applications in
Java. Suffix Trees. CISE Department, University of Florida.
http://www.cise.ufl.edu/~sahni/dsaaj/enrich/c16/suffix.htm
10. Edelkamp S. Suffix tree // Dictionary of Algorithms and Data
Structures. U.S. National Institute of Standards and Technology. 2007.
http://www.nist.gov/dads/HTML/suffixtree.html
11. Blumer A., Blumer J., Ehrenfeucht A., Haussler D., McConnell R. Linear
size finite automata for the set of all subwords of a word: an outline of results
// Bull. Eur. Assoc. Theoret. Comput. Sci. 1983. Vol. 21, pp. 12-20.
12. Blumer A., Blumer J., Ehrenfeucht A., Haussler D., McConnell R. The
smallest automaton recognizing the subwords of a text // Bull. Eur. Assoc.
Theoret. Comput. Sci. 1985. Vol. 40(1), pp. 31-55.
13. Eilenberg S. Automata, Languages, and Machines. Vol. A. Academic Press.
1974.
14. Kuich W., Salomaa A. Semirings, Automata, Languages. Springer-Verlag.
1986.
15. Alonso L., Remy J. L., Schott R. A linear-time algorithm for the generation
of trees // Algorithmica. 1997. Vol. 17(2), pp. 162-182.
16. Devroye L., Szpankowski W., Rais B. A note on the height of suffix trees
// SIAM J. Comput. 1992. Vol. 21, pp. 48-53.
17. Farach M. Optimal suffix tree construction with large alphabets // In 38th
Foundations of Computer Science (FOCS). 1997, pp. 137-143.