
Appendix A

Pascal Syntax Flow Graph


This Pascal syntax flow graph is copied from "Pascal User Manual and Report" by Kathleen Jensen and Niklaus
Wirth (Springer-Verlag, 1974) with the courtesy of the publisher. Refer to this book for further details of the Pascal grammar.

[Syntax flow graphs for: <program>, <block>, <statement>, <identifier>, <unsigned integer>, <unsigned number>, <unsigned constant>, <constant>, <type>, <simple type>, <field list>, <variable>, <term>, <factor>, <simple expression>, <expression> and <parameter list>.]
Appendix B
Converting a 2-way DFA to a 1-way DFA

[Figure: (a) a 2-way FA M with start state q0; (b) a 1-way FA M' with start state r0; both read the same input tape a ... b.]

Here we show that every language recognizable by a 2-way DFA is also recognizable by a 1-way DFA. In fact, we give an algorithm which, given an arbitrary 2-way DFA M, constructs a 1-way DFA M' recognizing the same language.
(This algorithm was first presented by J. C. Shepherdson in IBM J. Res. Develop. 3:2, 198-200.)
Recall that 1-way FA and 2-way FA have the same starting and accepting configurations: initially they read the leftmost input symbol, and they accept the input string by falling off to the right of the rightmost input symbol in an accepting state. (Recall also that the blank symbol does not belong to the input alphabet, and hence the machines are not allowed to read it during the computation.)

[Figure: (a) the starting configuration of M, reading the leftmost symbol in state q0; (b) the accepting configuration of M, falling off the right end of the input in an accepting state qf.]
The Algorithm
Before we present the conversion algorithm, let's examine how M works with the example shown in figure (a) below. Given the input string aba, the automaton M moves as shown in figure (b). We are interested in the sequences of states and directions as the machine moves back and forth across the cell boundaries. In the figure, each arrow shows the direction of a crossing together with the machine's state.

[Figure: (a) a 2-way DFA M with states 1-5; (b) the computing profile of M on input aba, showing the states and directions in which M crosses each cell boundary, ending with Accept.]
Let Q = {1, 2, . . ., n} be the set of states of M. If the machine crosses a cell boundary to the left in state i and later crosses the same boundary back to the right in a state qi ∈ Q, we call the state pair (i, qi) a crossing state pair. If there is no such state qi (i.e., the machine never crosses back), we denote it by 0. Note that on each cell boundary there are n crossing state pairs. We call this list of pairs the crossing information (CI for short).

[Figure: (a) a crossing state pair (i, qi) on a cell boundary; (b) the crossing information CI = <(1, q1), (2, q2), . . ., (n, qn)> on a boundary.]
The key idea of the algorithm is that, given the CI on a boundary ci and the input symbol in the cell to its right, it is possible to compute the CI on the next cell boundary ci+1. Figure (b) shows how this is done for M. (Recall that in a CI, the entry 0 means that M never crosses back over the boundary.)
[Figure: (a) the 2-way DFA M; (b) computing the CI on boundary ci+1 from the CI on ci and the symbol a in the cell between them.]
Now, with M in figure (a) below, we show how the input string aba ∈ L(M) can be accepted going one way to the right. Since it is illegal for M to cross the leftmost cell boundary c0 to the left, for every state i its crossing state pair on c0 is (i, 0): the machine can never cross that boundary back to the right.
[Figure: (a) the 2-way DFA M; (b) the starting CI on boundary c0 for input aba: every crossing state pair is (i, 0).]
Let q0 denote the state in which M crosses a cell boundary to the right for the first time. With the CI on c0 we first find the state q0 in which M crosses c1 to the right, and then compute the CI on c1 from the CI on c0. Since by convention M starts reading the leftmost input symbol in the start state 1, q0 on c1 is 2. With the CI on c1 computed, we find q0 (state 4) on c2, compute the CI on c2, and so on.
[Figure: (a) the 2-way DFA M; (b) q0 and the CI computed on boundaries c1 and c2 for input aba.]
The following figure shows how to compute the CI on c2 from the CI on c1, and then find q0 on c3 (i.e., state 3). The figure shows that M, starting in state 1, crosses the boundaries to the right in states 2, 4 and 3, in this order, and accepts the input string.
[Figure: (a) the 2-way DFA M; (b) the CI on c2 computed from the CI on c1, and q0 = 3 on c3, ending with Accept.]
The figure below shows q0 at each boundary, i.e., the state in which the machine first crosses that boundary to the right. The sequence of states 2, 4, 3 is identical to the one made by M.

[Figure: (a) the 2-way DFA M and its moves on input aba, ending with Accept; (b) q0 and the CI on each cell boundary.]
Computing the next CI
Let M be a 2-way DFA. Below is the algorithm which, given an input symbol a in a tape cell and the CI c on the left boundary of that cell, computes the CI c' on the right boundary. When M crosses the left boundary to the left in some state q, the algorithm calls the function computeNextq0(c, n, q, a) (see below), which traces c to find the state in which the machine first crosses the right boundary to the right.

Algorithm computeNextCI (c, c’, n, a) {
// input: the CI in an integer array c[n] with values c[i] ∈ {0, 1, 2, . . ., n}, which represents
// CI = <(1, c[1]), (2, c[2]), . . ., (n, c[n])>, and the transition function δ of the 2-DFA.
// output: the next CI in array c’[n].
  for ( i = 1; i <= n; i++ ) {                 // Note that q is a state variable.
    let (q, d) = δ(i, a);                      // q ∈ {1, 2, . . ., n}, d ∈ {+1, -1}
    if ( d == +1 ) c’[i] = q;                  // M moves right in state q.
    else c’[i] = computeNextq0(c, n, q, a);    // M moves left (i.e., d = -1) in state q.
  } // end for-loop
} // end algorithm

// M, reading a in state q, moves to the left.
// Array c[n] contains the CI on the left boundary of the cell where a is written.
// This algorithm traces c[n] and finds the state of M when it first crosses the right
// boundary to the right. The set L is used to check whether M keeps crossing
// the left boundary back and forth (i.e., enters a loop).
Algorithm computeNextq0 (c, n, q, a)          // q ∈ {1, 2, . . ., n}
{ let L = { };                                // empty set
  do {
    if (c[q] == 0) then return 0;             // M is in a dead state (never comes back).
    if (q ∈ L) then return 0;                 // M is in a looping state.
    else put q in set L;
    let (q, d) = δ(c[q], a);
  } while ( d == -1 );                        // keep tracing CI c while d = -1
  return q;                                   // M is crossing the right boundary in state q.
}


Computing q0
Here is the algorithm which, using computeNextCI and computeNextq0, finds the sequence of states (denoted by q0) in which M crosses the cell boundaries from left to right for the first time. We assume that the transition function δ and the input string are given as global data.

let integer array c[n] = {0};                 // initialize the leftmost CI
define an integer array c’[n];
let q0 = 1;                                   // 1 is the start state of the 2-DFA M
do {
  read next input into a;  print (q0);        // output the current value of q0
  let (p, d) = δ(q0, a);
  if (d == +1) q0 = p;
  else q0 = computeNextq0(c, n, p, a);        // trace the CI on the left boundary
  computeNextCI (c, c’, n, a);                // then advance the CI to the right boundary
  c = c’;
} forever;
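As a concrete sketch, the three routines above can be transcribed into Python (the dictionary encoding of δ and the example machine are our own; the slides keep δ and the input global, while here everything is passed explicitly). The example 2-way DFA accepts the strings over {a, b} that contain the substring ab, and uses one leftward move so that the crossing information actually matters.

```python
# Direction: +1 = right, -1 = left.  States are 1..n; a CI entry 0 means
# "never crosses back".  delta maps (state, symbol) -> (state, direction).

def compute_next_q0(c, q, a, delta):
    """M has just crossed a boundary to the LEFT in state q; trace the CI c
    of that boundary to find the state in which M first crosses it back to
    the right (0 if M dies or loops on the left side)."""
    seen = set()
    while True:
        if c[q] == 0 or q in seen:          # dead or looping on the left side
            return 0
        seen.add(q)
        q, d = delta[(c[q], a)]             # M re-enters the cell holding a
        if d == +1:
            return q

def compute_next_ci(c, a, states, delta):
    """Compute the CI on the right boundary of the cell holding a,
    given the CI c on its left boundary."""
    nxt = {}
    for i in states:                        # M crosses the right boundary left in i
        q, d = delta[(i, a)]
        nxt[i] = q if d == +1 else compute_next_q0(c, q, a, delta)
    return nxt

def accepts_1way(w, states, delta, accepting):
    """One-way simulation of the 2-way DFA: carry (q0, CI) per boundary."""
    c = {i: 0 for i in states}              # CI on c0: the left end is a wall
    q0 = 1                                  # start state
    for a in w:
        p, d = delta[(q0, a)]
        q0_next = p if d == +1 else compute_next_q0(c, p, a, delta)
        c = compute_next_ci(c, a, states, delta)
        q0 = q0_next
        if q0 == 0:                         # M never crosses this boundary
            return False
    return q0 in accepting                  # M falls off the right end in q0

# Example 2-way DFA (our own): accepts strings over {a, b} containing "ab".
STATES = (1, 2, 3)
ACCEPT = {3}
DELTA = {(1, "a"): (2, +1), (1, "b"): (1, +1),
         (2, "a"): (2, +1), (2, "b"): (3, -1),   # the only leftward move
         (3, "a"): (3, +1), (3, "b"): (3, +1)}
```

Carrying the pair (q0, CI) from boundary to boundary is exactly the finite information that becomes a state of the 1-way DFA M'.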


The algorithm traces the sequence of states q0 while computing the CI on each cell boundary, moving right. Since the number of states n is finite, there are at most (n+1)^n different CIs, which is a finite number. It follows that the sequence of states q0 can be computed by a 1-way FA with M and the algorithm A stored in its finite state control and the same input written on the input tape, as illustrated in figure (b) below.

[Figure: (a) the 2-DFA M; (b) a 1-DFA accepting L(M), with M and the algorithm A stored in its finite state control.]
Let <q0, CI> denote the pair of q0 and the CI on a cell boundary. Given a pair <q0, CI> on a cell boundary, M' can compute <q0, CI> on the next cell boundary using the transition function of M and the algorithm A. Since M and the algorithm A are fixed, the number of states of M' is determined by the number of pairs <q0, CI>, which is finite.
Let n be the number of states of M. For each state i, there are no more than n+1 different crossing pairs (i, c[i]), implying that there are at most (n+1)^n different CIs. Since there can be at most n different values of q0, there are no more than (n+1)^(n+1) pairs <q0, CI>. This number is finite. Let the CIs be named C0, C1, C2, . . ., and let the first CI be as shown in figure (a) for the two input symbols a and b; then the state transition graph of M' appears as shown in figure (b). (Notice that 1 is the start state and C0 is the CI on the leftmost cell boundary, where there is no q0.)
[Figure: (a) the pair <q0, CI> on the first cell boundary for each of the input symbols a and b; (b) the state transition graph of the 1-DFA M', whose states are the pairs <q0, CI> together with the start state 1.]

Since M is deterministic and, given a pair <q0, CI> and an input symbol, the algorithm computes a unique succeeding pair, M' is deterministic.
We can extend the algorithm to nondeterministic 2-way FA. Suppose that for an input symbol a, M has some nondeterministic transitions, as shown in figure (a) below. We let the 1-way NFA M' nondeterministically choose one of the transitions and apply the algorithm for constructing the next CI, as the following figures illustrate.

[Figure: (a) nondeterministic transitions of a 2-way NFA on symbol a; (b) computing the next CI for each nondeterministic choice; (c) the corresponding transitions of the 1-way NFA on the pairs <r, Ci>, <s, Cj> and <q, Ck>.]

The previous example is somewhat misleading, because the state transition graphs of the two automata appear isomorphic. This is not true in general. Consider the 2-way NFA in figure (a) below, which has a self-looping transition on state t. The 1-way NFA constructed by the algorithm does not necessarily have a self-loop on state <q, Ck>, because the pair <t, Cl> can be different from <q, Ck>.

[Figure: (a) 2-FA transitions with a self-loop on state t; (b) the corresponding 1-FA transitions on the pairs <r, Ci>, <s, Cj>, <q, Ck> and <t, Cl>.]

The 1-way FA model has many attractive properties: it is on-line (it computes while receiving the input string), real-time (it takes exactly |x| steps on an input string x), and easier to implement in hardware than the 2-way model.
In contrast, the 2-way FA model is off-line, and its computing time may exceed the input length. Because of the two-way moves, such machines are not easy to analyze or manipulate. For example, consider the 2-way FA shown below. Though the automaton appears simple, it is not easy to figure out the language L(M) recognized by the machine.

[Figure: a three-state 2-way FA M with L(M) = (a+b)ba*.]
Appendix C.
Computing a Regular Expression for the Language Recognized by an FA
[Figure: a three-state FA and the regular expression a*b((a+b) + (a+b)a*b)* denoting its language.]

In Section 7.2, we learned how to compute a regular expression denoting the language recognized by a given FA. The idea is to eliminate the states one at a time until we can read off, from the reduced transition graph, a regular expression denoting the language accepted by one of the accepting states of the FA. This is a manual technique, which is not easy to implement as a program. Here we show an algorithm, known as the Cocke-Younger-Kasami (CYK for short) algorithm, based on the dynamic programming technique.


Let M = (Q, Σ, δ, q0, F) be an FA, where Q = {1, 2, . . ., n} and q0 = 1.
We define the following notation.
• Rij(0): a regular expression which denotes the set of transition path labels going from state i to state j with no intermediate states.
• Rij(k): a regular expression which denotes the set of transition path labels from state i to state j going through intermediate states whose id's are less than or equal to k.

The figure below shows an example of computing Rij(0). Notice that for each state i there is an implicit self-loop with label ε; that is why every Rii(0) contains ε.

[Figure: the example FA with states 1, 2 and 3, where state 3 is accepting.]

R11(0) = a+ε      R12(0) = b      R13(0) = ∅
R21(0) = ∅        R22(0) = ε      R23(0) = ε
R31(0) = a+b      R32(0) = a      R33(0) = b+ε

With Rij(0) computed, we can recursively compute Rij(k) for all k ≥ 1, using the following recurrence. The figure below illustrates how the recurrence is derived.

Rij(k) = Rij(k-1) + Rik(k-1)(Rkk(k-1))*Rkj(k-1)

[Figure: a path from i to j either avoids state k entirely (Rij(k-1)), or decomposes into a first visit to k (Rik(k-1)), zero or more loops at k (Rkk(k-1)), and a final segment from k to j (Rkj(k-1)).]

Given the n × n matrix for Rij(k-1), we can compute Rij(k) using the above recurrence. It follows that, starting with Rij(0), we can iteratively compute Rij(n). From the matrix for Rij(n) we collect R1f(n) for every f ∈ F, combine them with the union operator +, and finally obtain a regular expression denoting the language recognized by the FA. (Recall that 1 is the start state, and hence R1f(n) is a regular expression denoting the language accepted by the FA in state f ∈ F.)
An example:

Rij(0):
        1         2         3
  1     a+ε       b         ∅
  2     ∅         ε         ε
  3     a+b       a         b+ε

Rij(1) = Rij(0) + Ri1(0)(R11(0))*R1j(0)

R11(1) = (a+ε) + (a+ε)(a+ε)*(a+ε) = a*
R12(1) = b + (a+ε)(a+ε)*b = a*b
R13(1) = ∅ + (a+ε)(a+ε)*∅ = ∅
R21(1) = ∅      R22(1) = ε      R23(1) = ε
R31(1) = (a+b) + (a+b)(a+ε)*(a+ε) = (a+b)a*
R32(1) = a + (a+b)(a+ε)*b = a+(a+b)a*b
R33(1) = (b+ε) + (a+b)(a+ε)*∅ = b+ε
Rij(1):
        1           2               3
  1     a*          a*b             ∅
  2     ∅           ε               ε
  3     (a+b)a*     a+(a+b)a*b      b+ε

Rij(2) = Rij(1) + Ri2(1)(R22(1))*R2j(1)

R11(2) = a* + a*b(ε)*∅ = a*
R12(2) = a*b + a*b(ε)*ε = a*b
R13(2) = ∅ + a*b(ε)*ε = a*b
R21(2) = ∅      R22(2) = ε      R23(2) = ε
R31(2) = (a+b)a* + (a+(a+b)a*b)(ε)*∅ = (a+b)a*
R32(2) = (a+(a+b)a*b) + (a+(a+b)a*b)(ε)*ε = a+(a+b)a*b
R33(2) = (b+ε) + (a+(a+b)a*b)(ε)*ε = (a+b+ε)+(a+b)a*b
Rij(2):
        1           2               3
  1     a*          a*b             a*b
  2     ∅           ε               ε
  3     (a+b)a*     a+(a+b)a*b      (a+b+ε)+(a+b)a*b

Rij(3) = Rij(2) + Ri3(2)(R33(2))*R3j(2)

Rij(3):
  1: R11(2) + R13(2)(R33(2))*R31(2)    R12(2) + R13(2)(R33(2))*R32(2)    R13(2) + R13(2)(R33(2))*R33(2)
  2: R21(2) + R23(2)(R33(2))*R31(2)    R22(2) + R23(2)(R33(2))*R32(2)    R23(2) + R23(2)(R33(2))*R33(2)
  3: R31(2) + R33(2)(R33(2))*R31(2)    R32(2) + R33(2)(R33(2))*R32(2)    R33(2) + R33(2)(R33(2))*R33(2)

Rij(2):
        1           2               3
  1     a*          a*b             a*b
  2     ∅           ε               ε
  3     (a+b)a*     a+(a+b)a*b      (a+b+ε)+(a+b)a*b

To save space, the last matrix for Rij(3) shows the recurrences instead of the regular expressions. Since state 3 is the only accepting state of the given FA, the entry R13(3) contains a regular expression that denotes the language. Using Rij(2) above and the recurrence for R13(3), we can compute the regular expression as follows.

R13(3) = R13(2) + R13(2)(R33(2))*R33(2)
       = (a*b) + (a*b)((a+b+ε)+(a+b)a*b)*((a+b+ε)+(a+b)a*b)


R13(3) = R13(2) + R13(2)(R33(2))*R33(2)
       = (a*b) + (a*b)((a+b+ε)+(a+b)a*b)*((a+b+ε)+(a+b)a*b)

Even with intermittent simplification of the regular expressions while computing the recurrence, this final result still appears rather complex for such a simple FA. Can we simplify this regular expression further? Unfortunately, there is no algorithmic approach to simplifying regular expressions. If we use the state elimination technique given in Chapter 7 on the given FA, we get a simpler expression, as shown below. Finally, we present below an algorithm that can be easily understood based on our discussion.

[Figure: eliminating state 2 from the example FA leaves states 1 and 3, with r11 = a, r13 = b, r31 = a+b and r33 = a+b.]

r3 = (r11)*r13(r33 + r31(r11)*r13)* = a*b((a+b) + (a+b)a*b)*

CYK Algorithm
// input: the state transition graph of M;  output: a regular expression for L(M)
for (i = 1; i <= n; i++)
  for (j = 1; j <= n; j++) {              // compute R(0)[i][j]
    if (i == j) {
      if (there are m >= 0 labels a1, . . ., am on the i-to-i loop edge)
        R(0)[i][j] = “ε+a1+ . . . +am” ;  // “ε”, if m = 0
    }
    else {                                // (i ≠ j)
      if (there are m > 0 labels a1, . . ., am on the i-to-j edge)
        R(0)[i][j] = “a1+ . . . +am” ;
      else R(0)[i][j] = “∅” ;
    }
  }
for (k = 1; k <= n; k++)
  for (i = 1; i <= n; i++)
    for (j = 1; j <= n; j++)
      R(k)[i][j] = R(k-1)[i][j] + R(k-1)[i][k]( R(k-1)[k][k] )*R(k-1)[k][j] ;
output the union of R(n)[1][f] over all f ∈ F ;   // F is the set of accepting states

Appendix D. Properties of Deterministic
Context-free Languages

1. A CFL that cannot be recognized by any DPDA
2. Closure property of DCFL's under complementation
3. Making a DPDA read the input up to the last symbol
1. A CFL that cannot be recognized by a DPDA
Let LCFL and LDCFL be, respectively, the classes of CFL's and DCFL's. The theorem below shows that LDCFL ⊊ LCFL. In other words, there is a CFL that can be recognized by an NPDA but not by any DPDA. (Recall that, in contrast, every language recognized by an NFA can also be recognized by a DFA.) This theorem can be proved in two ways, both of which are interesting.

Theorem 1. There is a CFL that cannot be recognized by a DPDA.

Proof (non-constructive). The complement of every DCFL is also a DCFL. (We will show this in the proof of Theorem 2 below.) In Chapter 9, we showed a CFL whose complement is not a CFL, which implies the theorem. ∎

Normal Form of PDA

We need the following lemma for the constructive proof of Theorem 1. This lemma, which simplifies the PDA model, will also be used in the proof of Theorem 2.

Lemma 1 (Normal form of PDA). Every CFL can be recognized by a PDA which satisfies the following conditions:
(1) the PDA never empties the stack (i.e., it does not pop Z0),
(2) when pushing, the machine pushes exactly one symbol, and
(3) it never changes the stack-top symbol.

Proof. Let M = (Q, Σ, Γ, δ, q0, Z0, F) be a PDA, where p, q ∈ Q, A, B ∈ Γ, and a ∈ Σ ∪ {ε}. Notice that conditions (2) and (3) do not allow a pushing move such as δ(p, a, A) = (q, BC), where the original stack-top A is changed to C. This normal form applies to all PDA's, deterministic or nondeterministic. In the rumination section at the end of Chapter 4, we showed that condition (2) does not affect the language recognized by a PDA. Here we show that the lemma holds for conditions (1) and (3).


Suppose that a PDA M does not satisfy condition (1) and has a move which pops the bottom-of-stack symbol Z0, as shown in figure (a) below. Since the PDA cannot make any move with an empty stack, we can simply let it push a new stack symbol, say X0, on top of Z0 instead of popping Z0, as shown in figure (b). The modified PDA M' recognizes the same language.

[Figure: (a) a PDA M with moves (·, Z0/ε) that pop Z0; (b) the modified PDA M', where each such move is replaced by (·, Z0/X0Z0).]

Now, suppose that the PDA M satisfies conditions (1) and (2), but not condition (3). We convert M to a PDA M' which keeps the stack-top symbol of M in its finite state control and simulates M, as illustrated in the following figure. Notice that when the stack of M holds only Z0, M' keeps a copy of Z0 in its finite state control. M' never rewrites its stack top and recognizes the same language. (By keeping the stack top in the finite state control, we increase the number of states of the PDA.) ∎

[Figure: (a) the stack of M as its top symbol changes (Z0, Z0A, ..BA, ..BC, ..BDA); (b) the stack of M', which keeps M's stack-top symbol in its finite state control and never rewrites its own stack top.]
Proof of Theorem 1

Proof of Theorem 1 (constructive). We now show that no DPDA recognizes the palindrome language L = { wwR | w ∈ {a, b}+ }. (This language is context-free, because we can easily construct a CFG, or an NPDA, as shown in Section 5.2.)
Suppose, to the contrary, that there is a DPDA M which recognizes L. Let qx and Ax be, respectively, the state of M and the stack-top symbol when M has read the input up to some prefix x of the input string; M may read an additional input segment z before it pops Ax (see the figure below). We denote such a pair by [qx, Ax].

[Figure: the pair [qx, Ax] after M reads the prefix x, and the configuration after M reads a further segment z without popping Ax.]

Since M is a DPDA, for a given string x there exists a unique pair [qx, Ax]. For the proof of the theorem, we first show that if M recognizes the palindrome language L, then there are two different strings x and y for which [qx, Ax] = [qy, Ay]. For such strings x and y, we can easily find a string z such that xz ∈ L and yz ∉ L.
Let's examine what happens when M runs on the input strings xz and yz. Having read up to x or up to y, the machine is in the same state (i.e., qx = qy) with the same stack-top (i.e., Ax = Ay), and it never pops that symbol while reading the remaining part z of the input. It follows that M must either accept both xz and yz or reject both, a contradiction because xz ∈ L and yz ∉ L.

For an arbitrary input string u ∈ {a, b}+, let αu denote the content of the stack when M has read up to the last symbol of u (figure (a)). Let v ∈ {a, b}* be a string such that, given uv as input, the machine reduces the stack to its minimum height (αuv in figure (b)) by the time it reads the last symbol of uv.

[Figure: (a) the stack content αu after reading u; (b) the minimal stack content αuv after reading uv, with M in state quv.]

In other words, v is a string which, appended to u, minimizes the stack height |αuv|. Consequently, no further string v' appended to uv can make M pop any symbol of αuv. Notice that, depending on u, there can be more than one v that minimizes αuv. As a special case it is also possible that v = ε, i.e., αu = αuv. Clearly, for every string u there exists such a string v.


Let [quv, Auv] be the pair of the state and the stack-top symbol (of αuv) when M reads the last symbol of the input string uv, as figure (b) above illustrates, and define the following sets S and T.

S = { [quv, Auv] | u ∈ {a, b}+, and v ∈ {a, b}* gives the shortest αuv }
T = { uv | u ∈ {a, b}+, and v ∈ {a, b}* gives the shortest αuv }

Since the number of states of M and the size of the stack alphabet are finite, the set S is finite. However, T is infinite, because it contains uv for every string u ∈ {a, b}+. Clearly, for every string x ∈ T there exists a pair [qx, Ax] ∈ S. It follows that there are two distinct strings x, y ∈ T with the same pair [qx, Ax] = [qy, Ay] in S.

Now, with the two strings x, y ∈ T for which [qx, Ax] = [qy, Ay], we find a string z such that xz ∈ L and yz ∉ L as follows.
(1) If |x| = |y|, we let z = xR. Then clearly xz = xxR ∈ L and yz = yxR ∉ L.
(2) If |x| ≠ |y|, we construct z as follows. Suppose that |x| < |y|. (The same logic applies the other way around.) Let y1 be the prefix of y with |y1| = |x|, and let y = y1y2. Find a string w such that |w| = |y2| and w ≠ y2, and construct the string z = wwRxR. Clearly, xz = xwwRxR ∈ L and yz = y1y2wwRxR ∉ L. (Notice that because of the three conditions |y1| = |x|, |w| = |y2| and w ≠ y2, the string yz does not have the palindrome property of L.)
Now let's examine what happens when the DPDA M runs on the two input strings xz and yz. We know that [qx, Ax] = [qy, Ay], which implies that M must either accept both xz and yz or reject both, because it computes on the same remaining input z starting in the same state qx (= qy) with the same stack top Ax (= Ay), which it never pops. This contradicts the supposition that M recognizes L. It follows that L is a CFL that cannot be recognized by any DPDA. ∎


2. Closure properties of DCFL's
Theorem 2. Let L1 and L2 be arbitrary DCFL's, and let R be a regular language.
(1) L1 ∩ R is also a DCFL.
(2) The complement of L1 is also a DCFL.
(3) L1 ∪ L2 and L1 ∩ L2 are not necessarily DCFL's. In other words, DCFL's are not closed under union and intersection.
Proof. We assume that every DFA and DPDA reads the input string up to the last symbol without rejecting the input in the middle. Showing that this assumption can be made requires a long and complex proof, which we defer to the end of this appendix.


Proof (1). L1 ∩ R is also a DCFL.

Let M and A be, respectively, a DPDA and a DFA recognizing L1 and R. With the two automata M and A, we construct a DPDA M' which recognizes the language L1 ∩ R as follows. With the transition functions of M and A in its finite state control, the DPDA M' simulates both M and A, keeping track of their states. M' advances the simulation of A only when M reads an input symbol. (Recall that PDA's can have ε-moves, in which they do not read the input.) M' enters an accepting state if and only if both M and A are simultaneously in an accepting state. Since M and A both read the input up to the end, clearly M' recognizes the language L1 ∩ R.

[Figure: the DPDA M', simulating both M and A in its finite state control.]
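A sketch of this product simulation (the dictionary encoding and the particular pair of automata are our own choices): the DPDA below accepts L1 = {a^i b^i c^k | i, k ≥ 1}, as in this appendix, and happens to need no ε-moves, which keeps the lockstep loop simple; the DFA accepts the regular language R of strings with at most two c's.

```python
# DPDA transitions: (state, symbol, stack_top) -> (state, symbols_to_push),
# where the push tuple is written top-first and replaces the old top.
# This machine accepts L1 = {a^i b^i c^k | i, k >= 1}.
DPDA = {(1, "a", "Z0"): (1, ("A", "Z0")),
        (1, "a", "A"):  (1, ("A", "A")),
        (1, "b", "A"):  (2, ()),            # pop one A per b
        (2, "b", "A"):  (2, ()),
        (2, "c", "Z0"): (3, ("Z0",)),
        (3, "c", "Z0"): (3, ("Z0",))}
DPDA_ACCEPT = {3}

def dfa_step(p, x):
    """DFA for R = strings with at most two c's (state 3 is a dead state)."""
    return min(p + 1, 3) if x == "c" else p
DFA_ACCEPT = {0, 1, 2}

def accepts_intersection(w):
    """Run the DPDA and the DFA in lockstep; the DFA advances only when the
    DPDA reads an input symbol.  Accept iff both machines accept."""
    q, p, stack = 1, 0, ["Z0"]              # stack top is at the end
    for x in w:
        key = (q, x, stack[-1])
        if key not in DPDA:
            return False                    # the DPDA is stuck: reject
        q, push = DPDA[key]
        stack.pop()
        stack.extend(reversed(push))
        p = dfa_step(p, x)                  # A moves on real reads only
    return q in DPDA_ACCEPT and p in DFA_ACCEPT
```

M' in the proof carries the pair of states (q, p) in its finite state control; here the pair is simply two Python variables updated in lockstep.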

Proof (2). The complement of L1 is also a DCFL.

Let M be a DPDA recognizing L1. Unfortunately, we cannot simply swap the accepting and non-accepting states, the technique we used to prove the closure of regular languages under complementation. Let's see why.
Suppose that M takes a sequence of ε-moves, during which the machine computes only with the stack, without reading the input, as illustrated in the following figure. (In the figure, the heavy circle denotes an accepting state.) If the input symbol a, read as the machine enters state p, is the last input symbol, then the input string is accepted, because the machine enters an accepting state after (not necessarily right after) reading the last input symbol.

(, ./..) (, ./..) (, ./..) (b, ./..)


p
(a, ./..)

554

Let's see what happens if we swap the accepting and non-accepting states, as shown in figure (b) below. The machine still accepts the input, because it enters an accepting state after reading the last symbol a. To solve this problem, we use a simulation technique.

(, ./..) (, ./..) (, ./..) (b, ./..)


(a, ./..) p

(a)

(, ./..) (, ./..) (, ./..) (b, ./..)


(a, ./..) p

(b)

555
We construct a DPDA M' which simulates M to recognize the complement of L(M) as follows. M' keeps the transition function of M in its finite state control and uses its own stack on behalf of M. The simulation distinguishes two cases, (a) and (b), depending on whether M enters an accepting state between two reading moves.
(a) If M, after reading an input symbol (a in the figure), does not enter an accepting state before it reads the next input symbol (b in the figure), then M' reads a, enters an accepting state, and then simulates M reading the next input symbol b. (The ε-transitions in between are ignored.) Notice that M' is simulating M to recognize the complement of L(M): if the symbol a that M reads is the last symbol of the input string x, then x is not accepted by M, so to have x accepted by M' we let M' enter an accepting state right after reading a.
(b) If M enters an accepting state somewhere between the two reading moves (i.e., the non-ε-transitions), then M' simulates the two reading moves of M without entering an accepting state.

[Figure: a reading move on a into state p, a sequence of ε-moves, and the next reading move on b into state q.]


Proof (3). L1 ∪ L2 and L1 ∩ L2 are not necessarily DCFL's.

As shown on the following page, the languages L1 and L2 below are DCFL's. However, in Section 12.4 we proved that their intersection L1 ∩ L2 = {a^i b^i c^i | i ≥ 1} is a CSL which is not context-free.

L1 = {a^i b^i c^k | i, k ≥ 1}    L2 = {a^i b^k c^k | i, k ≥ 1}

For the union, suppose that DCFL's were closed under union. The complements L̄1 and L̄2 are DCFL's by property (2), so L = L̄1 ∪ L̄2 would be a DCFL, and so would its complement L̄ by property (2) again. Then, by property (1), the following language L' would also be a DCFL, because {a^i b^j c^k | i, j, k ≥ 1} is regular (see the DFA recognizing this language on the following page):

L' = L̄ ∩ {a^i b^j c^k | i, j, k ≥ 1} = {a^i b^i c^i | i ≥ 1}

However, we know that {a^i b^i c^i | i ≥ 1} is not context-free (see Section 12.5). ∎


[Figure: (a) a DPDA accepting L1 = {a^i b^i c^k | i, k ≥ 1}; (b) a DPDA accepting L2 = {a^i b^k c^k | i, k ≥ 1}; (c) a DFA accepting {a^i b^j c^k | i, j, k ≥ 1}.]

3. Making every DPDA and DFA read up to the last input symbol
In this section we prove the following lemma, which we deferred while proving Theorem 2.
Lemma 2. Every DCFL and every regular language can be recognized, respectively, by a DPDA and by a DFA which read the input up to the last symbol.
Proof. By convention, when we define the transition function (or the transition graph) of an automaton, we may not define it for every possible tape symbol (and stack-top, for a PDA). We assume that, on entering a state from which no transition is defined, the automaton rejects the input immediately without reading it further.
To make a DFA read up to the last input symbol, we explicitly introduce a dead state and direct every undefined transition to it. Then we let the machine read off all the remaining input symbols in the dead state, as the following example shows. (See also Section 9.1.)
[Figure: a partial DFA completed with a dead state d: every undefined transition on a or b enters d, where the rest of the input is read off.]
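The completion step for a DFA can be sketched as follows (names and the partial DFA are our own): every undefined (state, symbol) pair is routed to a dead state, after which the machine can always read the whole input.

```python
def complete(delta, states, alphabet, dead="d"):
    """Return a total transition function: every undefined (state, symbol)
    pair, and every move out of the dead state, goes to the dead state."""
    total = dict(delta)
    for q in list(states) + [dead]:
        for x in alphabet:
            total.setdefault((q, x), dead)
    return total

def run(delta, start, accept, w):
    q = start
    for x in w:
        q = delta[(q, x)]        # never gets stuck: delta is total
    return q in accept

# A partial DFA (our own example) accepting a+ (one or more a's):
PARTIAL = {("s", "a"): "t", ("t", "a"): "t"}
TOTAL = complete(PARTIAL, ["s", "t"], "ab")
```

For instance, run(TOTAL, "s", {"t"}, "ab") consumes both symbols (ending in the dead state) instead of getting stuck on the b.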
Making a DPDA read the last input symbol
Let M = (Q, Σ, Γ, δ, q0, Z0, F) be a DPDA. Recall that for every p ∈ Q, a ∈ Σ and A ∈ Γ, each of δ(p, a, A) and δ(p, ε, A) gives at most one value, and if δ(p, a, A) is defined then δ(p, ε, A) is not, and vice versa.
The problem of making M read up to the last input symbol is not as simple. The automaton may hit an undefined transition, as a DFA may, or end up in a cycle of ε-moves in the middle of the computation, never consuming the rest of the input. For undefined transitions we can use the same approach as for DFA's. Here is an example. (Notice that this DPDA accepts the language {a^i b^i | i ≥ 1}.)
[Figure: a DPDA accepting {a^i b^i | i ≥ 1}, with Σ = {a, b} and Γ = {A, Z0},
augmented with a dead state d; every undefined transition (X, Y), X ∈ {a, b},
Y ∈ {A, Z0}, is redirected to d, where the move (X, Y/Y) consumes the remaining input.]
560

Now we study the problem of converting M to a DPDA M' which consumes the
input without entering a cycle of ε-moves. The figure below shows a part of the
transition graph of M with such a loop.
Notice that on entering the loop the machine does not necessarily stay in it;
depending on the stack contents it may exit. Given the state transition graph of M,
our objective is to find a state q and a stack-top symbol Z such that the machine
cyclically enters q with the same stack top Z, that is, (q, ε, Zβ) |–* (q, ε, αZβ) for
some α, β ∈ Γ*.

[Figure: part of a DPDA transition graph containing a cycle of ε-moves through
states 1, 2, 3, 4, entered by the transition (a, Z0/AZ0).]

561

Since the graph is finite and the transitions are deterministic, we can effectively
identify every entering transition which remains in the cycle, detach it from the
cycle, and redirect it to the dead state, where the remaining input is consumed. If the
cycle involves an accepting state, we route the transition through an accepting state
before sending it to the dead state.

[Figure: the same transition graph, with the transition entering the ε-cycle detached
and redirected to the dead state d, where every move (X, Y/Y), X ∈ Σ, Y ∈ Γ,
consumes the remaining input.]

562

For a more detailed treatment of the conversion, refer to J. Hopcroft and J. Ullman,
"Introduction to Automata Theory, Languages and Computation," Section 10.2,
Addison-Wesley, 1979, or M. Harrison, "Introduction to Formal Language
Theory," Section 5.6, Addison-Wesley, 1978.

563
Appendix E. A CFL Satisfying the Pumping
Lemma for Regular Languages
We showed that every infinite regular language satisfies the pumping lemma. We
proved this lemma with no regard to other classes of languages. As an application of
the lemma we showed that the context-free language {a^i b^i | i > 0} does not satisfy
the lemma and hence is not regular.
We may ask the following. Does every non-regular context-free language fail to
satisfy the lemma? Or is there a non-regular language which satisfies the pumping
lemma? Here we show a non-regular context-free language that satisfies the pumping
lemma, giving a positive answer to the latter question. (This example is due to one of
the author's colleagues, Professor Robert McNaughton at RPI.)

L1 = {x | x ∈ {a, b}*, x has no aa or bb as a substring}
L2 = {x | x ∈ {a, b}*, |x| = p, p is a prime number}
L̄1 = {a, b}* - L1        // complement of L1
L = (L1 ∩ L2) ∪ L̄1     // L is a non-regular language satisfying the lemma.
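The definition of L can be checked directly for small strings; this quick sketch uses my own helper names.

```python
# A sanity check of the definition: x is in L iff either x alternates a and b
# and has prime length (L1 ∩ L2), or x contains aa or bb (complement of L1).

def no_aa_bb(x):                 # x in L1: no "aa" or "bb" substring
    return "aa" not in x and "bb" not in x

def is_prime(p):
    return p >= 2 and all(p % d for d in range(2, int(p ** 0.5) + 1))

def in_L(x):                     # L = (L1 ∩ L2) ∪ complement(L1)
    return (no_aa_bb(x) and is_prime(len(x))) or not no_aa_bb(x)
```

For example, "ab" (alternating, length 2, prime) and "aab" (contains aa) are in L, while "abab" (alternating, length 4, composite) is not.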

564
Proof.

We now prove that L satisfies the lemma. First, L1 is regular: L1 is denoted by the
regular expression below (a string with no aa or bb strictly alternates a and b), and its
complement L̄1 is recognized by the NFA shown below. Since regular languages are
closed under complementation, either fact implies that L1 is regular.

(ab)* + b(ab)* + (ab)*a + (ba)*

[Figure: an NFA accepting L̄1, i.e., the strings containing aa or bb as a substring.]


565

We first show that L is not regular. Suppose L is regular. By the closure properties
of regular languages, since L̄1 is regular, L - L̄1 = L1 ∩ L2 must also be regular. Since
this language is infinite, it should satisfy the pumping lemma. Let n be the constant of
the pumping lemma, and choose a string z in L1 ∩ L2 whose length |z| = p is a prime
number greater than n. Write z = uvw such that |uv| ≤ n and |v| ≥ 1.
Now we pump z to get z' = u v^(pj+1) w, where j = |v|. The length of z' is
|z'| = p + |v|·pj = p + pj^2 = p(1 + j^2), which is not a prime number. It follows that
z' ∉ L1 ∩ L2, contradicting the pumping lemma. Hence L1 ∩ L2 is not regular,
implying that L is not regular.
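The length computation above can be verified numerically (helper names are mine): pumping a string of prime length p with i = pj + 1 copies of v, where |v| = j, always yields a composite length p(1 + j^2).

```python
# Numeric check of the pumping argument: |z'| = p + j*(p*j) = p*(1 + j*j),
# which has the factors p and (1 + j*j), both >= 2, so it is never prime.

def is_prime(m):
    return m >= 2 and all(m % d for d in range(2, int(m ** 0.5) + 1))

def pumped_length(p, j):
    """Length of z' = u v^(p*j+1) w when |z| = p and |v| = j."""
    return p + j * (p * j)       # = p * (1 + j*j)
```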


566


Now we prove that the language L satisfies the pumping lemma. For the proof, we
must show that there exists a constant n such that for every string z ∈ L whose length
is at least n, there exist strings u, v and w satisfying the following conditions:
(i) z = uvw   (ii) |uv| ≤ n   (iii) |v| ≥ 1   (iv) for all i ≥ 0, u v^i w ∈ L.
Let z = c1c2c3x be a string in L, where each ci ∈ {a, b}, x ∈ {a, b}*, and |x| ≥ n - 3.

567


Let n = 3, and consider the following three cases.

Case 1: c1 ≠ c2 ≠ c3 (i.e., z = abax or z = babx). Let u = c1, v = c2 and w = c3x. Then,
clearly, for every i ≥ 0, z' = u v^i w ∈ L. (Specifically, if i = 1, then z' = z ∈ L;
otherwise z' ∈ L̄1, since pumping v puts two identical symbols adjacent.)
Case 2: c1 = c2 (i.e., z ∈ L̄1). Let u = c1c2, v = c3 and w = x. Then for all i ≥ 0, we have
z' = u v^i w ∈ L (specifically, z' ∈ L̄1, since c1c2 remains a substring).
Case 3: c2 = c3 (i.e., z ∈ L̄1). Let u = ε, v = c1 and w = c2c3x. Then again, for all i ≥ 0,
we have z' = u v^i w ∈ L (z' ∈ L̄1, since c2c3 remains a substring).
It follows that the language L satisfies the pumping lemma.
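The three cases can be checked exhaustively for short strings. This sketch (with my own helper names) confirms that the chosen u, v, w keep every pumped string inside L.

```python
# Empirical check of the case analysis: for every z in L with 3 <= |z| <= 7,
# splitting z as in the matching case keeps u v^i w inside L for i = 0..4.
from itertools import product

def in_L(z):                     # L = (L1 ∩ L2) ∪ complement(L1), as defined
    alt = "aa" not in z and "bb" not in z
    prime = len(z) >= 2 and all(len(z) % d for d in range(2, len(z)))
    return (alt and prime) or not alt

def split(z):                    # the u, v, w chosen in the case analysis
    c1, c2, c3, x = z[0], z[1], z[2], z[3:]
    if c1 == c2:
        return c1 + c2, c3, x            # Case 2
    if c2 == c3:
        return "", c1, c2 + c3 + x       # Case 3
    return c1, c2, c3 + x                # Case 1

ok = all(
    in_L(u + v * i + w)
    for m in range(3, 8)
    for z in ("".join(t) for t in product("ab", repeat=m))
    if in_L(z)
    for (u, v, w) in [split(z)]
    for i in range(5)
)
```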

568
Appendix F. CYK Algorithm for the
Membership Test for CFL
The membership test problem for context-free languages is, for a given arbitrary CFG
G, to decide whether a string w is in the language L(G) or not. If it is, the problem
commonly also requires a sequence of rules applied to derive w. A brute-force technique
is to generate all possible parse trees yielding a string of length |w|, and check whether
any tree yields w. This approach takes too much time to be practical.
Here we present the well-known CYK algorithm (for Cocke, Younger and
Kasami, who first developed it). This algorithm, which takes O(n^3) time, is based on
the dynamic programming technique. The algorithm assumes that the given CFG is in
Chomsky normal form (CNF).
Let w = a1a2 . . . an, wij = ai a(i+1) . . . aj and wii = ai. Let Vij be the set of
nonterminal symbols that can derive the string wij, i.e.,

Vij = { A | A ⇒* wij, A is a nonterminal symbol of G }

569
CYK Algorithm

wij = ai . . . aj

Construct an upper triangular matrix whose entries are Vij, as shown below. In the
matrix, j corresponds to the position of the input symbol, and i corresponds to the
diagonal number. Clearly, by definition, if S ∈ V16 then the string w ∈ L(G).

        j →
w  =  a1   a2   a3   a4   a5   a6
      V11  V22  V33  V44  V55  V66
      V12  V23  V34  V45  V56
      V13  V24  V35  V46
i     V14  V25  V36
↓     V15  V26
      V16

570

The entries Vij can be computed from the entries in the i-th diagonal and those in the
j-th column, going along the direction indicated by the two arrows in the figure. If
A ∈ Vii (which implies A can derive ai), B ∈ V(i+1)j (implying B can derive
a(i+1) . . . aj) and C → AB is a rule, then put C in the set Vij. If D ∈ Vi(i+1) (which
implies D can derive ai a(i+1)), E ∈ V(i+2)j (implying E can derive a(i+2) . . . aj) and
F → DE is a rule, then put F in the set Vij, and so on.

[Figure: wij = ai a(i+1) a(i+2) . . . aj; the entry Vij is computed by pairing
Vii, Vi(i+1), . . . , Vi(j-1) with V(i+1)j, V(i+2)j, . . . , Vjj.]
571

For example, with A, B and C ranging over the nonterminals of G, the set V25 is
computed as follows. (Recall that G is in CNF.)

V25 = { A | B ∈ V22, C ∈ V35, and A → BC }
    ∪ { A | B ∈ V23, C ∈ V45, and A → BC }
    ∪ { A | B ∈ V24, C ∈ V55, and A → BC }

572

In general,

    Vij = ∪ { A | B ∈ Vik, C ∈ V(k+1)j and A → BC },

the union taken over i ≤ k ≤ j-1.

573
Example:
Let w = aaaabb, and let G be the CFG

    S → aSb | aDb        D → aD | a

with the CNF grammar

    S → AB | AC    A → a    B → SB | b    C → DB    D → AD | a

The completed CYK table for w is the following.

    w  =    a       a       a       a       b      b
          {A,D}   {A,D}   {A,D}   {A,D}    {B}    {B}
           {D}     {D}     {D}    {S,C}    { }
           {D}     {D}    {S,C}    {B}
           {D}    {S,C}  {S,B,C}
          {S,C}  {S,B,C}
         {S,B,C}

For example, V45 = {S, C} by the rules S → AB and C → DB; V46 = {B} by B → SB;
and V16 = {S, B, C} by S → AB, S → AC, C → DB and B → SB.

Since S ∈ V16, we have w ∈ L(G).


574

Here is pseudocode for the algorithm.

// Input w = a1a2 . . . an; initially all sets Vij are empty.
for ( i = 1; i <= n; i++ )
    Vii = { A | A → ai };
for ( j = 2; j <= n; j++ )
    for ( i = j-1; i >= 1; i-- )
        for ( k = i; k <= j-1; k++ )
            Vij = Vij ∪ { A | B ∈ Vik, C ∈ V(k+1)j and A → BC };
if ( S ∈ V1n ) output "yes"; else output "no";

The number of sets Vij is O(n^2), and it takes O(n) steps to compute each Vij. Thus the
time complexity of the algorithm is O(n^3).
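The pseudocode translates directly into Python. The sketch below (function and variable names are mine) uses the CNF grammar of the worked example, with V as a dictionary mapping (i, j) to the set Vij, 1-indexed as in the text.

```python
# A runnable sketch of the CYK algorithm for a CFG in Chomsky normal form.

def cyk(w, terminal_rules, binary_rules, start="S"):
    """terminal_rules: set of (A, a) for A -> a; binary_rules: (A, B, C) for A -> BC."""
    n = len(w)
    V = {}
    for i in range(1, n + 1):                        # Vii = {A | A -> ai}
        V[i, i] = {A for (A, a) in terminal_rules if a == w[i - 1]}
    for j in range(2, n + 1):
        for i in range(j - 1, 0, -1):
            V[i, j] = set()
            for k in range(i, j):                    # split wij after position k
                V[i, j] |= {A for (A, B, C) in binary_rules
                            if B in V[i, k] and C in V[k + 1, j]}
    return start in V[1, n]

# CNF grammar of the worked example:
# S -> AB | AC, A -> a, B -> SB | b, C -> DB, D -> AD | a
terms = {("A", "a"), ("D", "a"), ("B", "b")}
bins = {("S", "A", "B"), ("S", "A", "C"), ("B", "S", "B"),
        ("C", "D", "B"), ("D", "A", "D")}
```

`cyk("aaaabb", terms, bins)` reproduces the worked example, filling the same triangular table row by row.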

575
References
for Further Reading

A. V. Aho and J. D. Ullman, The Theory of Parsing, Translation, and Compiling, Vol. 1: Parsing, Englewood
Cliffs, N.J.: Prentice-Hall, 1972.

P. J. Denning, J. B. Dennis, and J. E. Qualitz, Machines, Languages, and Computation, Englewood Cliffs, N.J.:
Prentice-Hall, 1978.

H. Ehrig, M. Nagl, G. Rozenberg, and A. Rosenfeld (Eds.), Graph Grammars and Their Application to
Computer Science, Lecture Notes in Computer Science #291, Springer-Verlag, 1986.

M. A. Harrison, Introduction to Formal Language Theory, Reading, Mass.: Addison-Wesley, 1978.

J. E. Hopcroft, R. Motwani, and J. D. Ullman, Introduction to Automata Theory, Languages, and Computation,
Second Ed., Addison-Wesley, 2001.

P. Linz, An Introduction to Formal Languages and Automata, Third Ed., Jones and Bartlett Publishers, 2001.

D. W. Mount, Bioinformatics: Sequence and Genome Analysis, Cold Spring Harbor Lab. Press, 2001.

F. P. Preparata and R. T. Yeh, Introduction to Discrete Structures for Computer Science and Engineering, Reading
Mass.: Addison-Wesley, 1973.

S. Wolfram, Theory and Application of Cellular Automata, World Scientific, 1986.

576
Index
2-D tape TM , 165 Cellular automata, 179
2-dimensional(2D)-tape FA , 177 Characterization, 189, 191, 471
2-head FA, 178 Chomsky hierarchy, 185, 187, 190, 309, 310, 435
2-sided L-system, 62 Chomsky normal form (CNF), 277, 278
2-way FA , 176, 515 Church's hypothesis, 181
2-way PDA , 169 Closure, 5
Accept, 81 Closure property, 248
Accepting by empty stack, 172 Closure property of DCFL's, 543
Accepting configuration , 370, 383 Coding region, 430
Accepting state , 94 Codon , 428, 430
Algebraic properties, 73 Commutative law, 73
Ambiguous CFG , 294 Complement , 240
Amino acid , 428 Complementary strand, 426
Associative law, 73 Composite symbol, 3, 480
Associativity, 301 Computational capability, 160
Axiom, 63 Computing regular expression , 532
Base, 15 Concatenate, 2
Bottom of the stack symbol, 101, 104 Conditional connective, 22
Bottom-up left-to-right traversal, 356 Configuration, 360
Bottom-up parsing, 379 Constructive proof, 17
CA, 179 Containment, 185, 188, 190, 309, 310
Cell, 81 Context-free grammar, 51

577

Context-sensitive grammar, 51 Equivalent state, 224


Countable, 438 Exclusive OR, 4
Crossing information (CI), 518 Existential quantification, 13, 346
Crossing state pair, 518 Extensible markup language (XML), 414, 418
CYK algorithm, 352, 533, 569 FA array, 179
Declarative statement, 6 Finite state control, 81
DeMorgan's law, 4 Formal language, 30, 31, 48
Deterministic finite automata (DFA), 111 Front nonterminal, 280
Deterministic linear bounded automata (DLBA), 98 Front recursion, 280
Deterministic L-system, 62 Gene code, 425
Deterministic pushdown automata (DPDA), 100 Gene grammar, 425
Deterministic Turing machine (DTM), 81 Genome, 425
Diagonalization, 21, 453, 465 Grammar, 30, 33
Distributive law, 73 Greibach normal form (GNF), 277, 280
Document type definition (DTD), 414 Hidden Markov model (HMM), 180
Encoding function, 444 Homomorphism, 444
Enumerable set, 437 Hyper text markup language (HTML), 414, 415
Enumerable languages, 450 Hypothesis, 15
Enumeration, 437 Implication, 9
Enumerator, 437 Induction, 15
epsilon-move, 102 Induction hypothesis, 15
epsilon-transition, 146 Inherently ambiguous, 305

578

Input alphabet, 94 Minimize, 224, 226


Intersection, 240 Moore automata, 161
Intron, 427, 430 Morphemes, 49
Kleene closure, 5 Move, 81
Kleene star, 240 Multi-head FA , 178
Language class, 309 Multi-stack PDA, 170
Left associativity, 304 Multi-tape TM, 164
Left linear, 57 Multi-track TM, 162
Leftmost derivation, 354 NFA, 126
Lex, 352, 404 Non-constructive proof, 17
Lexical ambiguity, 288 Nondeterministic algorithm, 143
Lexical analyzer, 404 Nondeterministic automata, 124, 154
Linear, 57 Nondeterministic choose, 141
LL(k) grammar, 365, 378 Nondeterministic computing, 140
Look ahead, 359, 368 Nondeterministic finite automata (NFA), 126
Look ahead LR parser (LALR), 406 Nondeterministic linear bounded automata (NLBA), 139
Look-ahead, 368, 373 Nondeterministic pushdown automata (NPDA), 132
LR(k) grammar, 386, 403 Nondeterministic Turing machine (NTM), 139
LR(k) parser, 379 Non-front terminal, 280
L-system, 62 Nonterminal alphabet, 49
Lindenmayer, 62 Normal form, 277
Membership test for CFL, 569 Normal form of PDA, 544

579

Null string, 2 Proof by pigeonhole principle, 18


Ogden's lemma, 336 Proof technique, 10
Parallel rewriting, 56, 62 Proper containment, 309, 312
Parse table, 366, 385 Property list, 3
Parse tree, 290, 354 Proposition, 6
Parsing, 352 Pumping lemma, 316, 322, 564
Pascal syntax flow graph, 508 Pumping lemma for CFL, 328
Path label, 317 Purine, 431
Phrase structured grammar, 49 Push, 100
Pigeonhole principle, 18 Pushdown stack, 100
Pop, 100 Pyrimidine, 431
Power set, 5 Read only, 100
Precedence, 293, 301 Recursive language, 460
Product, 240 Recursive TM, 459, 460
Production rule, 54 Recursively enumerable (R.E.), 50, 446
Proof by cases, 11 Reduced DFA, 229
Proof by contradiction, 12 Reduction table, 385
Proof by contrapositive, 11 Regular expression, 70
Proof by counting, 20 Regular grammar, 52
Proof by example, 13 Reversal, 240
Proof by generalization, 14 Rewriting rule, 33, 54
Proof by induction, 15 Ribosome, 427

580

Right associativity, 304 Stack-top erasing set, 489


Right linear, 57 Start codon, 428, 430
Rightmost derivation, 354 Start state, 94
RNA transcription, 426 Start symbol, 49
Rule, 31, 33 State, 81
Semantic ambiguity, 288 State minimization, 224
Sentential form, 50 State partitioning technique, 226
Set complement, 4 State transition function, 82, 92, 94
Set intersection, 4 State transition graph, 82, 90
Set operation, 4 State transition profile, 126
Set product, 4 State transition table, 93
Set properties, 3 Stop codon, 428, 430
Set specification, 3 String, 2
Set subtraction, 4, 240 Sum-of-subset problem, 142
Set union, 4 Symbol, 2
Shift-in, 394, 400 Syntactic ambiguity, 288
Simulate, 448 Syntactic category, 49
Splicing, 427 Syntax, 290
Spontaneous transition, 146 Syntax diagram, 67
Stack alphabet, 104 Syntax flow graph, 67
Stack head, 101 Tape alphabet, 94
Stack-top, 100 Tape head, 81

581

Terminal alphabet, 49 User-written code, 405, 408


Token definition, 404 Variable, 40
Token description, 404 YACC, 352, 406
Yield, 291
Top-down left-to-right traversal, 355
Total alphabet, 49
Transcribe, 426
Transcript, 429
Transducer, 161
Transition path, 317
Translation, 427
Truth table, 6
Turing machine, 80
Type 0 grammar, 49
Type 1 grammar, 51
Type 2 grammar, 51
Type 3 grammar, 52
Union, 240
Unit production rule, 262, 264
Universal quantification, 13, 346
Universal set, 4
Universal Turing machine, 448
Useless symbol, 266

582