
Chapter 1 Basics of Formal Language Theory

1.1 Generalities, Motivations, Problems

In this part of the course we want to understand

• What is a language?
• How do we define a language?
• How do we manipulate languages and combine them?
• What is the complexity of a language?

Roughly, there are two dual views of languages:

(A) The recognition point of view.
(B) The generation point of view.


No matter how we view a language, we are typically considering two things:

(1) The syntax, i.e., what are the "legal" strings in that language (what are the "grammar rules"?).
(2) The semantics of strings in the language, i.e., what is the meaning (or interpretation) of a string.

The semantics is usually a lot more interesting than the syntax but unfortunately much more difficult to deal with! Therefore, sorry, we will only be dealing with syntax!

In (A), we typically assume some kind of "black box", $M$ (an automaton), that takes a string, $w$, as input and returns one of two possible answers:

Yes, the string $w$ is accepted, which means that $w$ belongs to the language, $L$, that we are trying to define.

No, the string $w$ is rejected, which means that $w$ does not belong to the language, $L$.

Usually, the black box $M$ gives a definite answer for every input after a finite number of steps, but not always. For example, a Turing machine may go on computing forever and not give any answer for certain strings not in the language. This is an example of undecidability.

The black box may compute deterministically or nondeterministically, which means roughly that on input $w$, the machine $M$ is allowed to try different computations and to ignore failing computations as long as there is some successful computation on input $w$. This greatly affects the complexity of recognition, i.e., how many steps it takes to process $w$.

Sometimes, a nondeterministic version of an automaton turns out to be equivalent to the deterministic version (although with different complexity). This tends to happen for very restrictive models, where nondeterminism does not help, or for very powerful models, where again nondeterminism does not help, but because the deterministic model is already very powerful!

We will investigate automata of increasing power of recognition:

(1) Deterministic and nondeterministic finite automata (DFA's and NFA's, their power is the same).
(2) Pushdown automata (PDA's) and deterministic pushdown automata (DPDA's); here PDA's are strictly more powerful than DPDA's.
(3) Deterministic and nondeterministic Turing machines (their power is the same).
(4) If time permits, we will also consider some restricted type of Turing machine known as LBA (linear bounded automaton).


In (B), we are interested in formalisms that specify a language in terms of rules that allow the generation of "legal" strings. The most common formalism is that of a formal grammar.

Remember:

• An automaton recognizes (or accepts) a language,
• a grammar generates a language.
• grammar is spelled with an "a" (not with an "e").
• The plural of automaton is automata (not automatons).

For "good" classes of grammars, it is possible to build an automaton, $M_G$, from the grammar, $G$, in the class, so that $M_G$ recognizes the language, $L(G)$, generated by the grammar $G$.


However, grammars are nondeterministic in nature. Thus, even if we try to avoid nondeterministic automata, we usually can't escape having to deal with them.

We will investigate the following types of grammars (the so-called Chomsky hierarchy) and the corresponding families of languages:

(1) Regular grammars (type 3-languages).
(2) Context-free grammars (type 2-languages).
(3) The recursively enumerable languages or r.e. sets (type 0-languages).
(4) If time permits, context-sensitive languages (type 1-languages).

Miracle: The grammars of type (1), (2), (3), (4) correspond exactly to the automata of the corresponding type!


Furthermore, there are algorithms for converting grammars to the corresponding automata (and backward), although some of these algorithms are not practical.

Building an automaton from a grammar is an important practical problem in language processing. A lot is known for the regular and the context-free grammars, but there is still room for improvements and innovations!

There are other ways of defining families of languages, for example inductive closures. In this style of definition, a collection of basic (atomic) languages is specified, some operations to combine languages are also specified, and the family of languages is defined as the smallest one containing the given atomic languages and closed under the operations.

Investigating closure properties (for example, union, intersection) is a way to assess how "robust" (or complex) a family of languages is.

Well, it is now time to be precise!

1.2 Alphabets, Strings, Languages

Our view of languages is that a language is a set of strings. In turn, a string is a finite sequence of letters from some alphabet. These concepts are defined rigorously as follows.

Definition 1.2.1 An alphabet $\Sigma$ is any finite set.

We often write $\Sigma = \{a_1, \ldots, a_k\}$. The $a_i$ are called the symbols of the alphabet.

Examples:
$\Sigma = \{a\}$
$\Sigma = \{a, b, c\}$
$\Sigma = \{0, 1\}$

A string is a finite sequence of symbols. Technically, it is convenient to define strings as functions. For any integer $n \geq 1$, let $[n] = \{1, 2, \ldots, n\}$, and for $n = 0$, let $[0] = \emptyset$.

Definition 1.2.2 Given an alphabet $\Sigma$, a string over $\Sigma$ (or simply a string) of length $n$ is any function $u\colon [n] \to \Sigma$. The integer $n$ is the length of the string $u$, and it is denoted as $|u|$. When $n = 0$, the special string $u\colon [0] \to \Sigma$ of length 0 is called the empty string, or null string, and is denoted as $\epsilon$.


Given a string $u\colon [n] \to \Sigma$ of length $n \geq 1$, $u(i)$ is the $i$-th letter in the string $u$. For simplicity of notation, we denote the string $u$ as

$$u = u_1 u_2 \cdots u_n,$$

with each $u_i \in \Sigma$.

For example, if $\Sigma = \{a, b\}$ and $u\colon [3] \to \Sigma$ is defined such that $u(1) = a$, $u(2) = b$, and $u(3) = a$, we write $u = aba$.

Strings of length 1 are functions $u\colon [1] \to \Sigma$ simply picking some element $u(1) = a_i$ in $\Sigma$. Thus, we will identify every symbol $a_i \in \Sigma$ with the corresponding string of length 1.

The set of all strings over an alphabet $\Sigma$, including the empty string, is denoted as $\Sigma^*$.


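Observe that when $\Sigma = \emptyset$, then $\emptyset^* = \{\epsilon\}$. When $\Sigma \neq \emptyset$, the set $\Sigma^*$ is countably infinite. Later on, we will see ways of ordering and enumerating strings.

Strings can be juxtaposed, or concatenated.

Definition 1.2.3 Given an alphabet $\Sigma$, given any two strings $u\colon [m] \to \Sigma$ and $v\colon [n] \to \Sigma$, the concatenation $u \cdot v$ (also written $uv$) of $u$ and $v$ is the string $uv\colon [m + n] \to \Sigma$, defined such that

$$uv(i) = \begin{cases} u(i) & \text{if } 1 \leq i \leq m, \\ v(i - m) & \text{if } m + 1 \leq i \leq m + n. \end{cases}$$

In particular, $u\epsilon = \epsilon u = u$.

It is immediately verified that $u(vw) = (uv)w$. Thus, concatenation is a binary operation on $\Sigma^*$ which is associative and has $\epsilon$ as an identity.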


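Note that generally, $uv \neq vu$, for example for $u = a$ and $v = b$.

Given a string $u \in \Sigma^*$ and $n \geq 0$, we define $u^n$ as follows:

$$u^n = \begin{cases} \epsilon & \text{if } n = 0, \\ u^{n-1}u & \text{if } n \geq 1. \end{cases}$$

Clearly, $u^1 = u$, and it is an easy exercise to show that

$$u^n u = u u^n,$$

for all $n \geq 0$.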


Definition 1.2.4 Given an alphabet $\Sigma$, given any two strings $u, v \in \Sigma^*$, we define the following notions:

$u$ is a prefix of $v$ iff there is some $y \in \Sigma^*$ such that $v = uy$.

$u$ is a suffix of $v$ iff there is some $x \in \Sigma^*$ such that $v = xu$.

$u$ is a substring of $v$ iff there are some $x, y \in \Sigma^*$ such that $v = xuy$.

We say that $u$ is a proper prefix (suffix, substring) of $v$ iff $u$ is a prefix (suffix, substring) of $v$ and $u \neq v$.
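The three relations can be checked mechanically. The following is a minimal sketch, assuming strings over $\Sigma$ are represented as Python `str` values and the empty string $\epsilon$ as `""`; the function names are illustrative and do not come from the notes.

```python
def is_prefix(u: str, v: str) -> bool:
    """u is a prefix of v iff v = u y for some string y."""
    return v[:len(u)] == u

def is_suffix(u: str, v: str) -> bool:
    """u is a suffix of v iff v = x u for some string x."""
    return len(u) <= len(v) and v[len(v) - len(u):] == u

def is_substring(u: str, v: str) -> bool:
    """u is a substring of v iff v = x u y for some strings x, y."""
    # Try every position where a copy of u could start inside v.
    return any(v[i:i + len(u)] == u for i in range(len(v) - len(u) + 1))

def is_proper(relation, u: str, v: str) -> bool:
    """Proper version of any of the three relations: additionally u != v."""
    return relation(u, v) and u != v
```

For instance, `is_prefix("ab", "abba")` and `is_suffix("ba", "abba")` both hold, while `is_proper(is_prefix, "ab", "ab")` does not.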


Recall that a partial ordering $\leq$ on a set $S$ is a binary relation $\leq\; \subseteq S \times S$ which is reflexive, transitive, and antisymmetric.

The concepts of prefix, suffix, and substring define binary relations on $\Sigma^*$ in the obvious way. It can be shown that these relations are partial orderings.

Another important ordering on strings is the lexicographic (or dictionary) ordering.


Definition 1.2.5 Given an alphabet $\Sigma = \{a_1, \ldots, a_k\}$ assumed totally ordered such that $a_1 < a_2 < \cdots < a_k$, given any two strings $u, v \in \Sigma^*$, we define the lexicographic ordering $\preceq$ as follows: $u \preceq v$

if $v = uy$, for some $y \in \Sigma^*$, or
if $u = x a_i y$, $v = x a_j z$, and $a_i < a_j$, for some $x, y, z \in \Sigma^*$.

It is fairly tedious to prove that the lexicographic ordering is in fact a partial ordering. In fact, it is a total ordering, which means that for any two strings $u, v \in \Sigma^*$, either $u \preceq v$, or $v \preceq u$.

The reversal $w^R$ of a string $w$ is defined inductively as follows:

$$\epsilon^R = \epsilon, \qquad (ua)^R = a u^R,$$

where $a \in \Sigma$ and $u \in \Sigma^*$.


It can be shown that

$$(uv)^R = v^R u^R.$$

Thus,

$$(u_1 \cdots u_n)^R = u_n^R \cdots u_1^R,$$

and when $u_i \in \Sigma$, we have

$$(u_1 \cdots u_n)^R = u_n \cdots u_1.$$
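As an illustration, here is a small sketch of the lexicographic ordering and of the inductive definition of reversal. It assumes strings are Python `str` values and that the total order on the alphabet is passed as a string `order` listing the symbols from smallest to largest; both names are assumptions made for this example only.

```python
def lex_leq(u: str, v: str, order: str) -> bool:
    """True iff u precedes or equals v in the lexicographic ordering.

    `order` lists the alphabet as a1 < a2 < ... < ak.
    """
    rank = {a: i for i, a in enumerate(order)}
    for a, b in zip(u, v):
        if a != b:
            # u = x a y and v = x b z share the prefix x; compare a and b.
            return rank[a] < rank[b]
    # No mismatch: one string is a prefix of the other.
    return len(u) <= len(v)

def reverse(w: str) -> str:
    """Reversal by the inductive definition: eps^R = eps, (ua)^R = a u^R."""
    if w == "":
        return ""
    return w[-1] + reverse(w[:-1])
```

With `order = "ab"`, `lex_leq("ab", "abb", "ab")` holds by the prefix clause and `lex_leq("ab", "b", "ab")` holds by the second clause. The identity $(uv)^R = v^R u^R$ can be spot-checked with `reverse(u + v) == reverse(v) + reverse(u)`.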

We can now define languages.

Definition 1.2.6 Given an alphabet $\Sigma$, a language over $\Sigma$ (or simply a language) is any subset $L$ of $\Sigma^*$.

If $\Sigma \neq \emptyset$, there are uncountably many languages.

We will try to single out countable "tractable" families of languages. We will begin with the family of regular languages, and then proceed to the context-free languages.

We now turn to operations on languages.


1.3 Operations on Languages

A way of building more complex languages from simpler ones is to combine them using various operations. First, we review the set-theoretic operations of union, intersection, and complementation.

Given some alphabet $\Sigma$, for any two languages $L_1, L_2$ over $\Sigma$, the union $L_1 \cup L_2$ of $L_1$ and $L_2$ is the language

$$L_1 \cup L_2 = \{w \in \Sigma^* \mid w \in L_1 \text{ or } w \in L_2\}.$$

The intersection $L_1 \cap L_2$ of $L_1$ and $L_2$ is the language

$$L_1 \cap L_2 = \{w \in \Sigma^* \mid w \in L_1 \text{ and } w \in L_2\}.$$

The difference $L_1 - L_2$ of $L_1$ and $L_2$ is the language

$$L_1 - L_2 = \{w \in \Sigma^* \mid w \in L_1 \text{ and } w \notin L_2\}.$$

The difference is also called the relative complement.


A special case of the difference is obtained when $L_1 = \Sigma^*$, in which case we define the complement $\overline{L}$ of a language $L$ as

$$\overline{L} = \{w \in \Sigma^* \mid w \notin L\}.$$

The above operations do not use the structure of strings. The following operations use concatenation.

Definition 1.3.1 Given an alphabet $\Sigma$, for any two languages $L_1, L_2$ over $\Sigma$, the concatenation $L_1 L_2$ of $L_1$ and $L_2$ is the language

$$L_1 L_2 = \{w \in \Sigma^* \mid \exists u \in L_1, \exists v \in L_2, w = uv\}.$$

For any language $L$, we define $L^n$ as follows:

$$L^0 = \{\epsilon\}, \qquad L^{n+1} = L^n L.$$


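The following properties are easily verified:

$$L\emptyset = \emptyset, \qquad \emptyset L = \emptyset,$$
$$L\{\epsilon\} = L, \qquad \{\epsilon\}L = L,$$
$$(L_1 \cup \{\epsilon\})L_2 = L_1 L_2 \cup L_2,$$
$$L_1(L_2 \cup \{\epsilon\}) = L_1 L_2 \cup L_1,$$
$$L^n L = L L^n.$$

In general, $L_1 L_2 \neq L_2 L_1$.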
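For finite languages, concatenation and powers can be computed directly, which makes it easy to spot-check the identities above. This is a sketch under the assumption that a finite language is a Python `set` of `str` values, with $\epsilon$ written `""`; the names `concat` and `power` are illustrative.

```python
def concat(L1: set, L2: set) -> set:
    """L1 L2 = { uv | u in L1, v in L2 }."""
    return {u + v for u in L1 for v in L2}

def power(L: set, n: int) -> set:
    """L^n, using L^0 = {eps} and L^(k+1) = L^k L."""
    result = {""}                      # L^0 = {eps}
    for _ in range(n):
        result = concat(result, L)     # L^(k+1) = L^k L
    return result
```

For example, with `L = {"a", "ab"}`, `power(L, 2)` is `{"aa", "aab", "aba", "abab"}`, and `concat({"a"}, {"b"})` differs from `concat({"b"}, {"a"})`, which illustrates that concatenation of languages is not commutative.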

So far, the operations that we have introduced, except complementation (since $\overline{L} = \Sigma^* - L$ is infinite if $L$ is finite and $\Sigma$ is nonempty), preserve the finiteness of languages. This is not the case for the next two operations.


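Definition 1.3.2 Given an alphabet $\Sigma$, for any language $L$ over $\Sigma$, the Kleene $*$-closure $L^*$ of $L$ is the language

$$L^* = \bigcup_{n \geq 0} L^n.$$

The Kleene $+$-closure $L^+$ of $L$ is the language

$$L^+ = \bigcup_{n \geq 1} L^n.$$

Thus, $L^*$ is the infinite union

$$L^* = L^0 \cup L^1 \cup L^2 \cup \ldots \cup L^n \cup \ldots,$$

and $L^+$ is the infinite union

$$L^+ = L^1 \cup L^2 \cup \ldots \cup L^n \cup \ldots.$$

Since $L^1 = L$, both $L^*$ and $L^+$ contain $L$.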


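In fact,

$$L^+ = \{w \in \Sigma^* \mid \exists n \geq 1, \exists u_1 \in L \cdots \exists u_n \in L, w = u_1 \cdots u_n\},$$

and since $L^0 = \{\epsilon\}$,

$$L^* = \{\epsilon\} \cup \{w \in \Sigma^* \mid \exists n \geq 1, \exists u_1 \in L \cdots \exists u_n \in L, w = u_1 \cdots u_n\}.$$

Thus, the language $L^*$ always contains $\epsilon$, and we have

$$L^* = L^+ \cup \{\epsilon\}.$$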
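Since $L^*$ is usually infinite, a program can only enumerate a finite part of it. The sketch below lists all strings of $L^*$ and $L^+$ up to a chosen length bound, assuming $L$ is a finite Python `set` of `str` values; the bound `max_len` and the function names are assumptions introduced for this example.

```python
def kleene_star_upto(L: set, max_len: int) -> set:
    """All strings of L* whose length is at most max_len."""
    result = {""}          # L^0 = {eps} is always included
    frontier = {""}
    while frontier:
        # Extend every newly found string by one more factor from L.
        extended = {u + v for u in frontier for v in L
                    if len(u + v) <= max_len}
        frontier = extended - result
        result |= frontier
    return result

def kleene_plus_upto(L: set, max_len: int) -> set:
    """All strings of L+ = L* L whose length is at most max_len."""
    star = kleene_star_upto(L, max_len)
    return {u + v for u in star for v in L if len(u + v) <= max_len}
```

With `L = {"ab"}` and `max_len = 6`, the star gives `{"", "ab", "abab", "ababab"}` and the plus gives the same set without `""`, in line with $L^* = L^+ \cup \{\epsilon\}$.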


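However, if $\epsilon \notin L$, then $\epsilon \notin L^+$. The following is easily shown:

$$\emptyset^* = \{\epsilon\}, \qquad L^+ = L^* L, \qquad L^{**} = L^*, \qquad L^* L^* = L^*.$$

The Kleene closures have many other interesting properties.

Homomorphisms are also very useful.

Given two alphabets $\Sigma$, $\Delta$, a homomorphism $h\colon \Sigma^* \to \Delta^*$ between $\Sigma^*$ and $\Delta^*$ is a function $h\colon \Sigma^* \to \Delta^*$ such that

$$h(uv) = h(u)h(v) \quad \text{for all } u, v \in \Sigma^*.$$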


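Letting $u = v = \epsilon$, we get

$$h(\epsilon) = h(\epsilon)h(\epsilon),$$

which implies that (why?)

$$h(\epsilon) = \epsilon.$$

If $\Sigma = \{a_1, \ldots, a_k\}$, it is easily seen that $h$ is completely determined by $h(a_1), \ldots, h(a_k)$ (why?)

Example: $\Sigma = \{a, b, c\}$, $\Delta = \{0, 1\}$, and

$$h(a) = 01, \quad h(b) = 011, \quad h(c) = 0111.$$

For example,

$$h(abbc) = 010110110111.$$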


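Given any language $L_1 \subseteq \Sigma^*$, we define the image $h(L_1)$ of $L_1$ as

$$h(L_1) = \{h(u) \in \Delta^* \mid u \in L_1\}.$$

Given any language $L_2 \subseteq \Delta^*$, we define the inverse image $h^{-1}(L_2)$ of $L_2$ as

$$h^{-1}(L_2) = \{u \in \Sigma^* \mid h(u) \in L_2\}.$$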
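Because a homomorphism is determined by its values on the symbols, it can be built from a table of symbol images. The sketch below reproduces the example $h(a) = 01$, $h(b) = 011$, $h(c) = 0111$; the helper names are hypothetical, and since an inverse image may be infinite it is exposed only as a membership test.

```python
def make_homomorphism(images: dict):
    """Extend a map on symbols to all strings: h(uv) = h(u) h(v)."""
    def h(w: str) -> str:
        # The empty join gives h(eps) = eps.
        return "".join(images[a] for a in w)
    return h

def image(h, L: set) -> set:
    """h(L) = { h(u) | u in L } for a finite language L."""
    return {h(u) for u in L}

def in_inverse_image(h, L2: set, u: str) -> bool:
    """Membership test for the inverse image: u is in it iff h(u) in L2."""
    return h(u) in L2

h = make_homomorphism({"a": "01", "b": "011", "c": "0111"})
```

Here `h("abbc")` evaluates to `"010110110111"`, matching the example, and `in_inverse_image(h, {"0101"}, "aa")` holds because `h("aa")` is `"0101"`.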

We now turn to the first formalism for defining languages, Deterministic Finite Automata (DFA's).

Chapter 2 Regular Languages

2.1 Deterministic Finite Automata (DFA's)

First we define what DFA's are, and then we explain how they are used to accept or reject strings. Roughly speaking, a DFA is a finite transition graph whose edges are labeled with letters from an alphabet $\Sigma$.

The graph also satisfies certain properties that make it deterministic. Basically, this means that given any string $w$, starting from any node, there is a unique path in the graph "parsing" the string $w$.


Example 1. A DFA for the language

$$L_1 = \{ab\}^+ = \{ab\}^*\{ab\},$$

i.e.,

$$L_1 = \{ab, abab, ababab, \ldots, (ab)^n, \ldots\}.$$

Input alphabet: $\Sigma = \{a, b\}$.

State set $Q_1 = \{0, 1, 2, 3\}$.

Start state: 0.
Set of accepting states: $F_1 = \{2\}$.
Transition table (function) $\delta_1$:

        a   b
    0   1   3
    1   3   2
    2   1   3
    3   3   3

Note that state 3 is a trap state or dead state.


Example 2. A DFA for the language

$$L_2 = \{ab\}^* = L_1 \cup \{\epsilon\},$$

i.e.,

$$L_2 = \{\epsilon, ab, abab, ababab, \ldots, (ab)^n, \ldots\}.$$

Input alphabet: $\Sigma = \{a, b\}$.

State set $Q_2 = \{0, 1, 2\}$.

Start state: 0.
Set of accepting states: $F_2 = \{0\}$.
Transition table (function) $\delta_2$:

        a   b
    0   1   2
    1   2   0
    2   2   2

State 2 is a trap state or dead state.


Example 3. A DFA for the language

$$L_3 = \{a, b\}^*\{abb\}.$$

Note that $L_3$ consists of all strings of $a$'s and $b$'s ending in $abb$.

Input alphabet: $\Sigma = \{a, b\}$.
State set $Q_3 = \{0, 1, 2, 3\}$.

Start state: 0.
Set of accepting states: $F_3 = \{3\}$.
Transition table (function) $\delta_3$:

        a   b
    0   1   0
    1   1   2
    2   1   3
    3   1   0

Is this a minimal DFA?

Figure 2.1: DFA for $\{ab\}^+$

Figure 2.2: DFA for $\{ab\}^*$

Figure 2.3: DFA for $\{a, b\}^*\{abb\}$


Definition 2.1.1 A deterministic finite automaton (or DFA) is a quintuple $D = (Q, \Sigma, \delta, q_0, F)$, where

• $\Sigma$ is a finite input alphabet;
• $Q$ is a finite set of states;
• $F$ is a subset of $Q$ of final (or accepting) states;
• $q_0 \in Q$ is the start state (or initial state);
• $\delta$ is the transition function, a function $\delta\colon Q \times \Sigma \to Q$.


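For any state $p \in Q$ and any input $a \in \Sigma$, the state $q = \delta(p, a)$ is uniquely determined. Thus, it is possible to define the state reached from a given state $p \in Q$ on input $w \in \Sigma^*$, following the path specified by $w$. Technically, this is done by defining the extended transition function $\delta^*\colon Q \times \Sigma^* \to Q$.

Definition 2.1.2 Given a DFA $D = (Q, \Sigma, \delta, q_0, F)$, the extended transition function $\delta^*\colon Q \times \Sigma^* \to Q$ is defined as follows:

$$\delta^*(p, \epsilon) = p,$$
$$\delta^*(p, ua) = \delta(\delta^*(p, u), a),$$

where $a \in \Sigma$ and $u \in \Sigma^*$.

It is immediate that $\delta^*(p, a) = \delta(p, a)$ for $a \in \Sigma$. The meaning of $\delta^*(p, w)$ is that it is the state reached from state $p$ following the path from $p$ specified by $w$.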


It is also easy to show that

$$\delta^*(p, uv) = \delta^*(\delta^*(p, u), v).$$

We can now define how a DFA accepts or rejects a string.

Definition 2.1.3 Given a DFA $D = (Q, \Sigma, \delta, q_0, F)$, the language $L(D)$ accepted (or recognized) by $D$ is the language

$$L(D) = \{w \in \Sigma^* \mid \delta^*(q_0, w) \in F\}.$$
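The extended transition function and the acceptance condition can be sketched in a few lines. This assumes the transition function is stored as a Python dictionary keyed by `(state, symbol)` pairs; the table `DELTA3` below is our reading of the DFA of Example 3 for $\{a, b\}^*\{abb\}$, and the names are illustrative.

```python
def delta_star(delta: dict, p, w: str):
    """Extended transition function: follow the path spelled by w from p."""
    for a in w:
        p = delta[(p, a)]      # delta*(p, ua) = delta(delta*(p, u), a)
    return p                   # delta*(p, eps) = p when w is empty

def accepts(delta: dict, q0, finals: set, w: str) -> bool:
    """w is accepted iff delta*(q0, w) is a final state."""
    return delta_star(delta, q0, w) in finals

# Assumed encoding of Example 3 (start state 0, accepting states {3}).
DELTA3 = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 3,
    (3, "a"): 1, (3, "b"): 0,
}
```

For instance, `accepts(DELTA3, 0, {3}, "aabb")` holds, while `accepts(DELTA3, 0, {3}, "abba")` does not, since the second string does not end in `abb`.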

Thus, a string $w \in \Sigma^*$ is accepted iff the path from $q_0$ on input $w$ ends in a final state.

We now come to the first of several equivalent definitions of the regular languages.


Regular Languages, Version 1

A language $L$ is a regular language if it is accepted by some DFA.

Note that a regular language may be accepted by many different DFA's. Later on, we will investigate how to find minimal DFA's. (For a given regular language, $L$, a minimal DFA for $L$ is a DFA with the smallest number of states among all DFA's accepting $L$. A minimal DFA for $L$ must exist since every nonempty subset of natural numbers has a smallest element.)

In order to understand how complex the regular languages are, we will investigate the closure properties of the regular languages under union, intersection, complementation, concatenation, and Kleene $*$.

It turns out that the family of regular languages is closed under all these operations. For union, intersection, and complementation, we can use the cross-product construction, which preserves determinism.

However, for concatenation and Kleene $*$, there does not appear to be any method involving DFA's only. The way to do it is to introduce nondeterministic finite automata (NFA's).

2.2 The "Cross-product" Construction

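Let $\Sigma = \{a_1, \ldots, a_m\}$ be an alphabet.

Given any two DFA's $D_1 = (Q_1, \Sigma, \delta_1, q_{0,1}, F_1)$ and $D_2 = (Q_2, \Sigma, \delta_2, q_{0,2}, F_2)$, there is a very useful construction for showing that the union, the intersection, or the relative complement of regular languages is a regular language.

Given any two languages $L_1, L_2$ over $\Sigma$, recall that

$$L_1 \cup L_2 = \{w \in \Sigma^* \mid w \in L_1 \text{ or } w \in L_2\},$$
$$L_1 \cap L_2 = \{w \in \Sigma^* \mid w \in L_1 \text{ and } w \in L_2\},$$
$$L_1 - L_2 = \{w \in \Sigma^* \mid w \in L_1 \text{ and } w \notin L_2\}.$$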


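Let us first explain how to construct a DFA accepting the intersection $L_1 \cap L_2$. Let $D_1$ and $D_2$ be DFA's such that $L_1 = L(D_1)$ and $L_2 = L(D_2)$.

The idea is to construct a DFA simulating $D_1$ and $D_2$ in parallel. This can be done by using states which are pairs $(p_1, p_2) \in Q_1 \times Q_2$. Thus, we define the DFA $D$ as follows:

$$D = (Q_1 \times Q_2, \Sigma, \delta, (q_{0,1}, q_{0,2}), F_1 \times F_2),$$

where the transition function $\delta\colon (Q_1 \times Q_2) \times \Sigma \to Q_1 \times Q_2$ is defined as follows:

$$\delta((p_1, p_2), a) = (\delta_1(p_1, a), \delta_2(p_2, a)),$$

for all $p_1 \in Q_1$, $p_2 \in Q_2$, and $a \in \Sigma$.

Clearly, $D$ is a DFA, since $D_1$ and $D_2$ are. Also, by the definition of $\delta$, we have

$$\delta^*((p_1, p_2), w) = (\delta_1^*(p_1, w), \delta_2^*(p_2, w)),$$

for all $p_1 \in Q_1$, $p_2 \in Q_2$, and $w \in \Sigma^*$.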


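Now, we have $w \in L(D_1) \cap L(D_2)$

iff $w \in L(D_1)$ and $w \in L(D_2)$,
iff $\delta_1^*(q_{0,1}, w) \in F_1$ and $\delta_2^*(q_{0,2}, w) \in F_2$,
iff $(\delta_1^*(q_{0,1}, w), \delta_2^*(q_{0,2}, w)) \in F_1 \times F_2$,
iff $\delta^*((q_{0,1}, q_{0,2}), w) \in F_1 \times F_2$,
iff $w \in L(D)$.

Thus, $L(D) = L(D_1) \cap L(D_2)$.

We can now modify $D$ very easily to accept $L(D_1) \cup L(D_2)$. We change the set of final states so that it becomes $(F_1 \times Q_2) \cup (Q_1 \times F_2)$. Indeed, $w \in L(D_1) \cup L(D_2)$

iff $w \in L(D_1)$ or $w \in L(D_2)$,
iff $\delta_1^*(q_{0,1}, w) \in F_1$ or $\delta_2^*(q_{0,2}, w) \in F_2$,
iff $(\delta_1^*(q_{0,1}, w), \delta_2^*(q_{0,2}, w)) \in (F_1 \times Q_2) \cup (Q_1 \times F_2)$,
iff $\delta^*((q_{0,1}, q_{0,2}), w) \in (F_1 \times Q_2) \cup (Q_1 \times F_2)$,
iff $w \in L(D)$.

Thus, $L(D) = L(D_1) \cup L(D_2)$.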


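We can also modify $D$ very easily to accept $L(D_1) - L(D_2)$. We change the set of final states so that it becomes $F_1 \times (Q_2 - F_2)$. Indeed, $w \in L(D_1) - L(D_2)$

iff $w \in L(D_1)$ and $w \notin L(D_2)$,
iff $\delta_1^*(q_{0,1}, w) \in F_1$ and $\delta_2^*(q_{0,2}, w) \notin F_2$,
iff $(\delta_1^*(q_{0,1}, w), \delta_2^*(q_{0,2}, w)) \in F_1 \times (Q_2 - F_2)$,
iff $\delta^*((q_{0,1}, q_{0,2}), w) \in F_1 \times (Q_2 - F_2)$,
iff $w \in L(D)$.

Thus, $L(D) = L(D_1) - L(D_2)$.

In all cases, if $D_1$ has $n_1$ states and $D_2$ has $n_2$ states, the DFA $D$ has $n_1 n_2$ states.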
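The three variants of the cross-product construction differ only in the set of final states, which the following sketch makes explicit. It assumes a DFA is a tuple `(states, alphabet, delta, start, finals)` with `delta` a dictionary on `(state, symbol)` pairs; the two small automata `EVEN_A` and `ENDS_B` are invented test inputs, not examples from the notes.

```python
from itertools import product

def product_dfa(d1, d2, mode="intersection"):
    """Cross-product of two DFAs over the same alphabet."""
    Q1, sigma, delta1, q01, F1 = d1
    Q2, _, delta2, q02, F2 = d2
    states = set(product(Q1, Q2))
    # delta((p1, p2), a) = (delta1(p1, a), delta2(p2, a))
    delta = {((p1, p2), a): (delta1[(p1, a)], delta2[(p2, a)])
             for (p1, p2) in states for a in sigma}
    if mode == "intersection":        # F1 x F2
        keep = lambda p1, p2: p1 in F1 and p2 in F2
    elif mode == "union":             # (F1 x Q2) U (Q1 x F2)
        keep = lambda p1, p2: p1 in F1 or p2 in F2
    elif mode == "difference":        # F1 x (Q2 - F2)
        keep = lambda p1, p2: p1 in F1 and p2 not in F2
    else:
        raise ValueError(f"unknown mode: {mode}")
    finals = {(p1, p2) for (p1, p2) in states if keep(p1, p2)}
    return states, sigma, delta, (q01, q02), finals

def accepts(dfa, w: str) -> bool:
    """Run the DFA on w and test whether it ends in a final state."""
    _, _, delta, q, finals = dfa
    for a in w:
        q = delta[(q, a)]
    return q in finals

# Hypothetical inputs: strings with an even number of a's, strings ending in b.
EVEN_A = ({0, 1}, "ab", {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}, 0, {0})
ENDS_B = ({0, 1}, "ab", {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1}, 0, {1})
```

The product of these two 2-state automata has $2 \cdot 2 = 4$ states, and for example `accepts(product_dfa(EVEN_A, ENDS_B, "difference"), "aa")` holds because `aa` has an even number of `a`'s and does not end in `b`.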


2.3 Morphisms, F-Maps, B-Maps and Homomorphisms of DFA's

A map between DFA's is a certain kind of graph homomorphism. The following definition is adapted from Eilenberg.

Definition 2.3.1 Given any two DFA's $D_1 = (Q_1, \Sigma, \delta_1, q_{0,1}, F_1)$ and $D_2 = (Q_2, \Sigma, \delta_2, q_{0,2}, F_2)$, a morphism $h\colon D_1 \to D_2$ of DFA's is a function $h\colon Q_1 \to Q_2$ satisfying the following conditions:

(1) $h(\delta_1(p, a)) = \delta_2(h(p), a)$, for all $p \in Q_1$ and all $a \in \Sigma$;
(2) $h(q_{0,1}) = q_{0,2}$.

An F-map of DFA's, for short, a map, is a morphism of DFA's $h\colon D_1 \to D_2$ that satisfies the condition

(3a) $h(F_1) \subseteq F_2$.

A B-map of DFA's is a morphism of DFA's $h\colon D_1 \to D_2$ that satisfies the condition

(3b) $h^{-1}(F_2) \subseteq F_1$.


A proper homomorphism of DFA's, for short, a homomorphism, is an F-map of DFA's that is also a B-map of DFA's.

Now, for any function $f\colon X \to Y$ and any two subsets $A \subseteq X$ and $B \subseteq Y$, recall that

$$f(A) = \{f(a) \in Y \mid a \in A\},$$
$$f^{-1}(B) = \{x \in X \mid f(x) \in B\},$$

and

$$f(A) \subseteq B \quad \text{iff} \quad A \subseteq f^{-1}(B).$$

Thus, (3a) $\wedge$ (3b) is equivalent to the condition

(3c) $h^{-1}(F_2) = F_1$.

Note that the condition for being a proper homomorphism of DFA's is not equivalent to

$$h(F_1) = F_2.$$

Condition (3c) forces $h(F_1) = F_2 \cap h(Q_1)$, and furthermore, for every $p \in Q_1$, whenever $h(p) \in F_2$, then $p \in F_1$.

The reader should check that if $f\colon D_1 \to D_2$ and $g\colon D_2 \to D_3$ are morphisms (resp. F-maps, resp. B-maps), then $g \circ f\colon D_1 \to D_3$ is also a morphism (resp. an F-map, resp. a B-map).


Remark: In previous versions of these notes, an F-map was called simply a map and a B-map was called an $F^{-1}$-map. Over the years, the old terminology proved to be confusing. We hope the new one is less confusing!

Note that an F-map or a B-map is a special case of the concept of simulation of automata. A proper homomorphism is a special case of a bisimulation. Bisimulations play an important role in real-time systems and in concurrency theory.

The main motivation behind these definitions is that when there is an F-map $h\colon D_1 \to D_2$, somehow, $D_2$ simulates $D_1$, and it turns out that $L(D_1) \subseteq L(D_2)$.

When there is a B-map $h\colon D_1 \to D_2$, somehow, $D_1$ simulates $D_2$, and it turns out that $L(D_2) \subseteq L(D_1)$.

When there is a proper homomorphism $h\colon D_1 \to D_2$, somehow, $D_1$ bisimulates $D_2$, and it turns out that $L(D_2) = L(D_1)$.


A DFA morphism, $f\colon D_1 \to D_2$, is an isomorphism iff there is a DFA morphism, $g\colon D_2 \to D_1$, so that

$$g \circ f = \mathrm{id}_{D_1} \quad \text{and} \quad f \circ g = \mathrm{id}_{D_2}.$$

Similarly, an F-map (respectively, a B-map), $f\colon D_1 \to D_2$, is an isomorphism iff there is an F-map (respectively, a B-map), $g\colon D_2 \to D_1$, so that

$$g \circ f = \mathrm{id}_{D_1} \quad \text{and} \quad f \circ g = \mathrm{id}_{D_2}.$$

The map $g$ is unique and it is denoted $f^{-1}$. The reader should prove that if a DFA F-map is an isomorphism, then it is also a proper homomorphism, and if a DFA B-map is an isomorphism, then it is also a proper homomorphism.

If $h\colon D_1 \to D_2$ is a morphism of DFA's, it is easily shown by induction on the length of $w$ that

$$h(\delta_1^*(p, w)) = \delta_2^*(h(p), w),$$

for all $p \in Q_1$ and all $w \in \Sigma^*$.

As a consequence, we have the following Lemma:


Lemma 2.3.2 If $h\colon D_1 \to D_2$ is an F-map of DFA's, then $L(D_1) \subseteq L(D_2)$. If $h\colon D_1 \to D_2$ is a B-map of DFA's, then $L(D_2) \subseteq L(D_1)$. Finally, if $h\colon D_1 \to D_2$ is a proper homomorphism of DFA's, then $L(D_1) = L(D_2)$.

A DFA is accessible, or trim, if every state is reachable from the start state.

A morphism (resp. F-map, B-map) $h\colon D_1 \to D_2$ is surjective if $h(Q_1) = Q_2$.

It can be shown that if $D_1$ is trim, then there is at most one morphism $h\colon D_1 \to D_2$ (resp. F-map, B-map). If $D_2$ is also trim and we have a morphism $h\colon D_1 \to D_2$, then $h$ is surjective.

It can also be shown that a minimal DFA $D_L$ for $L$ is characterized by the property that there is a unique surjective proper homomorphism $h\colon D \to D_L$ from any trim DFA $D$ accepting $L$ to $D_L$.


Another useful notion is the notion of a congruence on a DFA.

Definition 2.3.3 Given any DFA $D = (Q, \Sigma, \delta, q_0, F)$, a congruence $\equiv$ on $D$ is an equivalence relation $\equiv$ on $Q$ satisfying the following conditions: for all $p, q \in Q$ and all $a \in \Sigma$,

(1) If $p \equiv q$, then $\delta(p, a) \equiv \delta(q, a)$.
(2) If $p \equiv q$ and $p \in F$, then $q \in F$.

It can be shown that a proper homomorphism of DFA's $h\colon D_1 \to D_2$ induces a congruence $\equiv_h$ on $D_1$ defined as follows:

$$p \equiv_h q \quad \text{iff} \quad h(p) = h(q).$$

Given a congruence $\equiv$ on a DFA $D$, we can define the quotient DFA $D/\!\equiv$, and there is a surjective proper homomorphism $\pi\colon D \to D/\!\equiv$.

We will come back to this point when we study minimal DFA's.


2.4 Nondeterministic Finite Automata (NFA's)

NFA's are obtained from DFA's by allowing multiple transitions from a given state on a given input. This can be done by defining $\delta(p, a)$ as a subset of $Q$ rather than a single state. It will also be convenient to allow transitions on input $\epsilon$.

We let $2^Q$ denote the set of all subsets of $Q$, including the empty set. The set $2^Q$ is the power set of $Q$. We define NFA's as follows.


Example 4. An NFA for the language

$$L_3 = \{a, b\}^*\{abb\}.$$

Input alphabet: $\Sigma = \{a, b\}$.

State set $Q_4 = \{0, 1, 2, 3\}$.

Start state: 0.
Set of accepting states: $F_4 = \{3\}$.
Transition table $\delta_4$:

        a         b
    0   {0, 1}    {0}
    1   ∅         {2}
    2   ∅         {3}
    3   ∅         ∅

Figure 2.4: NFA for $\{a, b\}^*\{abb\}$


Example 5. Let $\Sigma = \{a_1, \ldots, a_n\}$, let

$$L_n^i = \{w \in \Sigma^* \mid w \text{ contains an odd number of } a_i\text{'s}\},$$

and let

$$L_n = L_n^1 \cup L_n^2 \cup \cdots \cup L_n^n.$$

The language $L_n$ consists of those strings in $\Sigma^*$ that contain an odd number of some letter $a_i \in \Sigma$. Equivalently, the complement $\Sigma^* - L_n$ consists of those strings in $\Sigma^*$ with an even number of every letter $a_i \in \Sigma$.

It can be shown that every DFA accepting $L_n$ has at least $2^n$ states.

However, there is an NFA with $2n + 1$ states accepting $L_n$ (and even with $2n$ states!).


Definition 2.4.1 A nondeterministic finite automaton (or NFA) is a quintuple $N = (Q, \Sigma, \delta, q_0, F)$, where

• $\Sigma$ is a finite input alphabet;
• $Q$ is a finite set of states;
• $F$ is a subset of $Q$ of final (or accepting) states;
• $q_0 \in Q$ is the start state (or initial state);
• $\delta$ is the transition function, a function $\delta\colon Q \times (\Sigma \cup \{\epsilon\}) \to 2^Q$.

For any state $p \in Q$ and any input $a \in \Sigma \cup \{\epsilon\}$, the set of states $\delta(p, a)$ is uniquely determined. We write $q \in \delta(p, a)$.

Given an NFA $N = (Q, \Sigma, \delta, q_0, F)$, we would like to define the language accepted by $N$, and for this, we need to extend the transition function $\delta\colon Q \times (\Sigma \cup \{\epsilon\}) \to 2^Q$ to a function

$$\delta^*\colon Q \times \Sigma^* \to 2^Q.$$


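The presence of $\epsilon$-transitions (i.e., when $q \in \delta(p, \epsilon)$) causes technical problems, and to overcome these problems, we introduce the notion of $\epsilon$-closure.

2.5 $\epsilon$-Closure

Definition 2.5.1 Given an NFA $N = (Q, \Sigma, \delta, q_0, F)$ (with $\epsilon$-transitions), for every state $p \in Q$, the $\epsilon$-closure of $p$ is the set $\epsilon\text{-closure}(p)$ consisting of all states $q$ such that there is a path from $p$ to $q$ whose spelling is $\epsilon$. This means that either $q = p$, or that all the edges on the path from $p$ to $q$ have the label $\epsilon$.

We can compute $\epsilon\text{-closure}(p)$ using a sequence of approximations as follows. Define the sequence of sets of states $(\epsilon\text{-clo}_i(p))_{i \geq 0}$ as follows:

$$\epsilon\text{-clo}_0(p) = \{p\},$$
$$\epsilon\text{-clo}_{i+1}(p) = \epsilon\text{-clo}_i(p) \cup \{q \in Q \mid \exists s \in \epsilon\text{-clo}_i(p),\ q \in \delta(s, \epsilon)\}.$$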


Since $\epsilon\text{-clo}_i(p) \subseteq \epsilon\text{-clo}_{i+1}(p)$, $\epsilon\text{-clo}_i(p) \subseteq Q$, for all $i \geq 0$, and $Q$ is finite, there is a smallest $i$, say $i_0$, such that

$$\epsilon\text{-clo}_{i_0}(p) = \epsilon\text{-clo}_{i_0+1}(p),$$

and it is immediately verified that

$$\epsilon\text{-closure}(p) = \epsilon\text{-clo}_{i_0}(p).$$

When $N$ has no $\epsilon$-transitions, i.e., when $\delta(p, \epsilon) = \emptyset$ for all $p \in Q$ (which means that $\delta$ can be viewed as a function $\delta\colon Q \times \Sigma \to 2^Q$), we have

$$\epsilon\text{-closure}(p) = \{p\}.$$

It should be noted that there are more efficient ways of computing $\epsilon\text{-closure}(p)$, for example, using a stack (basically, a kind of depth-first search). We present such an algorithm below. It is assumed that the types NFA and stack are defined. If $n$ is the number of states of an NFA $N$, we let

    eclotype = array[1..n] of boolean


    function eclosure[N: NFA, p: integer]: eclotype;
    begin
      var eclo: eclotype, q, s: integer, st: stack;
      for each q ∈ setstates(N) do
        eclo[q] := false;
      endfor
      eclo[p] := true; st := empty;
      trans := deltatable(N);
      st := push(st, p);
      while st ≠ emptystack do
        q := pop(st);
        for each s ∈ trans(q, ε) do
          if eclo[s] = false then
            eclo[s] := true; st := push(st, s)
          endif
        endfor
      endwhile;
      eclosure := eclo
    end

This algorithm can be easily adapted to compute the set of states reachable from a given state $p$ (in a DFA or an NFA).
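The stack-based procedure translates almost line for line into Python. The sketch below is one possible rendering, not the notes' own code: it assumes the transition function is a dictionary mapping `(state, symbol)` pairs to sets of states, with the empty string `""` standing for $\epsilon$, and it returns a set of states rather than a boolean array.

```python
def eclosure(delta: dict, p) -> set:
    """States reachable from p using only epsilon-transitions."""
    closure = {p}          # plays the role of eclo[p] := true
    stack = [p]
    while stack:
        q = stack.pop()
        # Missing table entries are treated as the empty set.
        for s in delta.get((q, ""), set()):
            if s not in closure:
                closure.add(s)
                stack.append(s)
    return closure
```

With the hypothetical table `{(0, ""): {1}, (1, ""): {2}, (2, ""): {0}, (2, "a"): {3}}`, the call `eclosure(delta, 0)` returns `{0, 1, 2}`: state 3 is excluded because it is only reachable through an `a`-transition.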


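Given a subset $S$ of $Q$, we define $\epsilon\text{-closure}(S)$ as

$$\epsilon\text{-closure}(S) = \bigcup_{p \in S} \epsilon\text{-closure}(p).$$

When $N$ has no $\epsilon$-transitions, we have

$$\epsilon\text{-closure}(S) = S.$$

We are now ready to define the extension $\delta^*\colon Q \times \Sigma^* \to 2^Q$ of the transition function $\delta\colon Q \times (\Sigma \cup \{\epsilon\}) \to 2^Q$.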


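2.6 Converting an NFA into a DFA

The intuition behind the definition of the extended transition function is that $\delta^*(p, w)$ is the set of all states reachable from $p$ by a path whose spelling is $w$.

Definition 2.6.1 Given an NFA $N = (Q, \Sigma, \delta, q_0, F)$ (with $\epsilon$-transitions), the extended transition function $\delta^*\colon Q \times \Sigma^* \to 2^Q$ is defined as follows: for every $p \in Q$, every $u \in \Sigma^*$, and every $a \in \Sigma$,

$$\delta^*(p, \epsilon) = \epsilon\text{-closure}(\{p\}),$$
$$\delta^*(p, ua) = \epsilon\text{-closure}\Bigl(\bigcup_{s \in \delta^*(p, u)} \delta(s, a)\Bigr).$$

The language $L(N)$ accepted by an NFA $N$ is the set

$$L(N) = \{w \in \Sigma^* \mid \delta^*(q_0, w) \cap F \neq \emptyset\}.$$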
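Definition 2.6.1 can be evaluated directly by tracking the current set of states. The following sketch assumes the same dictionary encoding as before (keys are `(state, symbol)` pairs, values are sets of states, `""` stands for $\epsilon$); `DELTA4` is our encoding of the NFA of Example 4, and all names are illustrative.

```python
def eclosure_set(delta: dict, states: set) -> set:
    """Epsilon-closure of a set of states (union of the closures)."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for s in delta.get((q, ""), set()):
            if s not in closure:
                closure.add(s)
                stack.append(s)
    return closure

def delta_star(delta: dict, p, w: str) -> set:
    """Set of states reachable from p by a path spelling w."""
    current = eclosure_set(delta, {p})           # delta*(p, eps)
    for a in w:
        moved = set()
        for s in current:
            moved |= delta.get((s, a), set())    # union of delta(s, a)
        current = eclosure_set(delta, moved)     # delta*(p, ua)
    return current

def nfa_accepts(delta: dict, q0, finals: set, w: str) -> bool:
    """w is accepted iff delta*(q0, w) meets the set of final states."""
    return bool(delta_star(delta, q0, w) & finals)

# Assumed encoding of Example 4 (start state 0, accepting states {3}).
DELTA4 = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}, (2, "b"): {3}}
```

For example, `delta_star(DELTA4, 0, "ab")` is `{0, 2}`, and `nfa_accepts(DELTA4, 0, {3}, "aabb")` holds.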


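We can also extend $\delta^*\colon Q \times \Sigma^* \to 2^Q$ to a function

$$\widehat{\delta}\colon 2^Q \times \Sigma^* \to 2^Q$$

defined as follows: for every subset $S$ of $Q$, for every $w \in \Sigma^*$,

$$\widehat{\delta}(S, w) = \bigcup_{p \in S} \delta^*(p, w).$$

Let $\mathcal{Q}$ be the subset of $2^Q$ consisting of those subsets $S$ of $Q$ that are $\epsilon$-closed, i.e., such that $S = \epsilon\text{-closure}(S)$. If we consider the restriction

$$\Delta\colon \mathcal{Q} \times \Sigma \to \mathcal{Q}$$

of $\widehat{\delta}\colon 2^Q \times \Sigma^* \to 2^Q$ to $\mathcal{Q}$ and $\Sigma$, we observe that $\Delta$ is the transition function of a DFA. Indeed, this is the transition function of a DFA accepting $L(N)$. It is easy to show that $\Delta$ is defined directly as follows (on subsets $S$ in $\mathcal{Q}$):

$$\Delta(S, a) = \epsilon\text{-closure}\Bigl(\bigcup_{s \in S} \delta(s, a)\Bigr).$$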


Then, the DFA $D$ is defined as follows:

$$D = (\mathcal{Q}, \Sigma, \Delta, \epsilon\text{-closure}(\{q_0\}), \mathcal{F}),$$

where $\mathcal{F} = \{S \in \mathcal{Q} \mid S \cap F \neq \emptyset\}$.

It is not difficult to show that $L(D) = L(N)$, that is, $D$ is a DFA accepting $L(N)$. Thus, we have converted the NFA $N$ into a DFA $D$ (and gotten rid of $\epsilon$-transitions).

Since DFA's are special NFA's, the subset construction shows that DFA's and NFA's accept the same family of languages, the regular languages, version 1 (although not with the same complexity).

The states of the DFA $D$ equivalent to $N$ are $\epsilon$-closed subsets of $Q$. For this reason, the above construction is often called the "subset construction". It is due to Rabin and Scott.

Although theoretically fine, the method may construct useless sets $S$ that are not reachable from the start state $\epsilon\text{-closure}(\{q_0\})$. A more economical construction is given next.


An Algorithm to convert an NFA into a DFA: The "subset construction"

Given an input NFA $N = (Q, \Sigma, \delta, q_0, F)$, a DFA $D = (K, \Sigma, \Delta, S_0, \mathcal{F})$ is constructed. It is assumed that $K$ is a linear array of sets of states $S \subseteq Q$, and $\Delta$ is a 2-dimensional array, where $\Delta[i, a]$ is the target state of the transition from $K[i] = S$ on input $a$, with $S \in K$, and $a \in \Sigma$.

    S0 := ε-closure({q0}); total := 1; K[1] := S0;
    marked := 0;
    while marked < total do;
      marked := marked + 1; S := K[marked];
      for each a ∈ Σ do
        U := ⋃_{s ∈ S} δ(s, a); T := ε-closure(U);
        if T ∉ K then
          total := total + 1; K[total] := T
        endif;
        Δ[marked, a] := T
      endfor
    endwhile;
    𝓕 := {S ∈ K | S ∩ F ≠ ∅}
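A possible Python rendering of this worklist algorithm is sketched below. It is an illustration under stated assumptions rather than the notes' own code: the NFA transition function is a dictionary from `(state, symbol)` pairs to sets of states with `""` for $\epsilon$, the list `K` is indexed from 0 instead of 1, and `Delta` stores the index of the target subset rather than the subset itself.

```python
def subset_construction(sigma: str, delta: dict, q0, finals: set):
    """Build only the subsets reachable from eclosure({q0})."""
    def eclose(states):
        closure, stack = set(states), list(states)
        while stack:
            q = stack.pop()
            for s in delta.get((q, ""), set()):
                if s not in closure:
                    closure.add(s)
                    stack.append(s)
        return frozenset(closure)

    start = eclose({q0})
    K = [start]                 # discovered subsets, in order of discovery
    index = {start: 0}          # subset -> position in K
    Delta = {}                  # (position, symbol) -> position
    marked = 0
    while marked < len(K):      # len(K) plays the role of total
        S = K[marked]
        for a in sigma:
            U = set()
            for s in S:
                U |= delta.get((s, a), set())
            T = eclose(U)
            if T not in index:  # a new DFA state has been found
                index[T] = len(K)
                K.append(T)
            Delta[(marked, a)] = index[T]
        marked += 1
    dfa_finals = {i for i, S in enumerate(K) if S & finals}
    return K, Delta, dfa_finals
```

Run on our encoding of the NFA of Example 4, `subset_construction("ab", {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}, (2, "b"): {3}}, 0, {3})` discovers the four subsets `{0}`, `{0, 1}`, `{0, 2}`, `{0, 3}` in that order, with the last one as the only final state.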


Let us illustrate the subset construction on the NFA of Example 4.

An NFA for the language

$$L_3 = \{a, b\}^*\{abb\}.$$

Transition table $\delta_4$:

        a         b
    0   {0, 1}    {0}
    1   ∅         {2}
    2   ∅         {3}
    3   ∅         ∅

Set of accepting states: $F_4 = \{3\}$.

Figure 2.5: NFA for $\{a, b\}^*\{abb\}$

The tables below show the transition table $\Delta$ at successive stages, together with the values of marked and total. The names A, B, C, D label the sets of states in the order in which they are discovered.

Initial transition table $\Delta$ (marked = 0, total = 1):

    names   states    a   b
    A       {0}

Just after entering the while loop (marked = 1, total = 1):

    names   states    a   b
    A       {0}

After the first round through the while loop (marked = 1, total = 2):

    names   states    a   b
    A       {0}       B   A
    B       {0, 1}

After just reentering the while loop (marked = 2, total = 2):

    names   states    a   b
    A       {0}       B   A
    B       {0, 1}

After the second round through the while loop (marked = 2, total = 3):

    names   states    a   b
    A       {0}       B   A
    B       {0, 1}    B   C
    C       {0, 2}

After the third round through the while loop (marked = 3, total = 4):

    names   states    a   b
    A       {0}       B   A
    B       {0, 1}    B   C
    C       {0, 2}    B   D
    D       {0, 3}

After the fourth round through the while loop (marked = 4, total = 4):

    names   states    a   b
    A       {0}       B   A
    B       {0, 1}    B   C
    C       {0, 2}    B   D
    D       {0, 3}    B   A

This is the DFA of Figure 2.3, except that in that example A, B, C, D are renamed 0, 1, 2, 3.

Figure 2.6: DFA for $\{a, b\}^*\{abb\}$