Unit II Regular Expression

THEORY OF COMPUTATION
UNIT II:
Regular Expressions
Jagadish Kashinath Kamble (M.Tech, IIT Kharagpur)

Assistant Professor,
Dept. of Information Technology,
Pune Institute of Computer Technology,
Pune.
[Figure: a regular language is represented by a regular expression, generated by a regular grammar (right linear or left linear), and accepted by a finite automaton (FA with output or FA without output).]

Regular Expressions
Regular expressions describe regular languages.
Example: (a + b·c)* describes the language
{a, bc}* = {ε, a, bc, aa, abc, bca, ...}
Recursive Definition
Operators: 1) + : Union 2) · : Concatenation 3) * : Kleene Closure
Primitive regular expressions: ∅, ε, a
∅ denotes {}, ε denotes {ε}, a denotes {a}
Given regular expressions r1 and r2, the following are regular expressions:
r1 + r2
r1 · r2
r1*
(r1)
Examples
A regular expression: (a + b·c)*·(c + ∅)
Not a regular expression: (a + b +)

Languages of Regular Expressions
L(r): language of regular expression r
Example
L((a + b·c)*) = {ε, a, bc, aa, abc, bca, ...}

Definition
For primitive regular expressions:
L(∅) = ∅
L(ε) = {ε}
L(a) = {a}
Definition (continued)
For regular expressions r1 and r2:
L(r1 + r2) = L(r1) ∪ L(r2)
L(r1 · r2) = L(r1) L(r2)
L(r1*) = (L(r1))*
L((r1)) = L(r1)
Example
Regular expression: (a + b)·a*
L((a + b)·a*) = L((a + b)) L(a*)
= L(a + b) L(a*)
= (L(a) ∪ L(b)) (L(a))*
= ({a} ∪ {b}) ({a})*
= {a, b}{ε, a, aa, aaa, ...}
= {a, aa, aaa, ..., b, ba, baa, ...}
Example
Regular expression r = (a + b)*(a + bb)
L(r) = {a, bb, aa, abb, ba, bbb, ...}

Example
Regular expression r = (aa)*(bb)*b
L(r) = {a^(2n) b^(2m) b : n, m ≥ 0}
Equivalent Regular Expressions
Definition:
Regular expressions r1 and r2
are equivalent if L(r1) = L(r2 )

Regular Expressions and Regular Languages
Regular Language                     Regular Expression
• ∅                                  ∅
• {ε}                                ε
• {a,b}*                             (a+b)*
• {aab}*{a,ab}                       (aab)*(a+ab)
• {aa,bb} ∪ {ab,ba}{aa,bb}*          (aa+bb+(ab+ba)(aa+bb)*)
Theorem
Languages generated by regular expressions = Regular languages
• Find an RE for the language of strings of length exactly 2 over the alphabet {a,b}
• Find an RE for the language of strings of length at least 2 over the alphabet {a,b}
• Find an RE for the language of strings of length at most 2 over the alphabet {a,b}
• Find an RE for the language of strings of even length over the alphabet {a,b}
• Find an RE for the language of strings of odd length over the alphabet {a,b}
• Find an RE for the language of strings whose length is divisible by 3 over the alphabet {a,b}
• Find an RE for the language of strings whose length is ≡ 2 (mod 3) over the alphabet {a,b}
• Find an RE for the language of strings with exactly two a’s over the alphabet {a,b}
• Find an RE for the language of strings with at least two a’s over the alphabet {a,b}
• Find an RE for the language of strings with at most two a’s over the alphabet {a,b}
• Find an RE for the language of strings with an even number of a’s over the alphabet {a,b}
• Find an RE for the language of strings which start with a over the alphabet {a,b}
• Find an RE for the language of strings which end with a over the alphabet {a,b}
• Find an RE for the language of strings which contain a over the alphabet {a,b}
• Find an RE for the language of strings which start and end with different symbols over the alphabet {a,b}
• Find an RE for the language of strings which start and end with the same symbol over the alphabet {a,b}
• Give REs for the following over the alphabet {0,1} (Aug-2015, 6 Marks)
– All binary strings with at least one 0
– All binary strings with at most one 0
• The language of all strings containing the substring 00
• The language of strings in {a, b}* ending with b and not containing aa
• Find an RE for the language of strings which do not contain two consecutive a’s over the alphabet {a,b}
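A candidate answer to exercises of this kind can be sanity-checked by brute force. The sketch below is only an illustration: the candidate expressions are assumptions (they are not given in the slides), Python's `|` stands in for `+`, and agreement is tested only on strings up to a fixed length, so it is evidence rather than a proof.

```python
import re
from itertools import product

def all_strings(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

def agrees(pattern, predicate, alphabet="ab", max_len=8):
    """True if `pattern` matches exactly the strings that satisfy `predicate`,
    checked on all strings of length <= max_len."""
    compiled = re.compile(pattern)
    return all(
        (compiled.fullmatch(w) is not None) == predicate(w)
        for w in all_strings(alphabet, max_len)
    )

# Candidate answers (assumptions), each paired with a direct description
# of the intended language.
candidates = {
    "length exactly 2": ("(a|b)(a|b)", lambda w: len(w) == 2),
    "even length": ("((a|b)(a|b))*", lambda w: len(w) % 2 == 0),
    "exactly two a's": ("b*ab*ab*", lambda w: w.count("a") == 2),
    "different first and last symbol": (
        "a(a|b)*b|b(a|b)*a",
        lambda w: len(w) >= 2 and w[0] != w[-1],
    ),
    "ends with b, no aa": (
        "(b|ab)*(b|ab)",
        lambda w: w.endswith("b") and "aa" not in w,
    ),
}

results = {name: agrees(p, pred) for name, (p, pred) in candidates.items()}
```

A `False` entry in `results` would mean the candidate expression and the description disagree on some short string.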
Review
• Regular Expression
Identities Related to Regular Expressions
• Given R, P, Q, L, M, N as regular expressions, the following identities hold:
• ∅* = ε
• ε* = ε
• RR* = R*R
• R*R* = R*
• (R*)* = R*
• (PQ)*P =P(QP)*
• (a+b)* = (a*b*)* = (a*+b*)* = (a+b*)* = a*(ba*)*
• R + ∅ = ∅ + R = R (The identity for union)
• R ε = ε R = R (The identity for concatenation)
• ∅ L = L ∅ = ∅ (The annihilator for concatenation)
• R + R = R (Idempotent law)
• L (M + N) = LM + LN (Left distributive law)
• (M + N) L = ML + NL (Right distributive law)
• ε + RR* = ε + R*R = R*
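Some of these identities can be spot-checked by comparing the strings each side matches. The following is a rough sketch, not a proof: it uses Python's `re` syntax (`|` for `+`), only looks at strings up to length 6, and the choice P = ab, Q = b for the rule (PQ)*P = P(QP)* is an arbitrary assumption.

```python
import re
from itertools import product

def language(pattern, alphabet="ab", max_len=6):
    """Set of strings over `alphabet`, of length <= max_len, matched by pattern."""
    compiled = re.compile(pattern)
    return {
        "".join(t)
        for n in range(max_len + 1)
        for t in product(alphabet, repeat=n)
        if compiled.fullmatch("".join(t))
    }

# (a+b)* = (a*b*)* = (a*+b*)* = (a+b*)* = a*(ba*)*
forms = ["(a|b)*", "(a*b*)*", "(a*|b*)*", "(a|b*)*", "a*(ba*)*"]
languages = [language(f) for f in forms]
all_equal = all(lang == languages[0] for lang in languages)

# (PQ)*P = P(QP)* with the sample choice P = ab, Q = b
shift_rule_holds = language("(abb)*ab") == language("ab(bab)*")
```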
Theorem
Languages generated by regular expressions = Regular languages
Proof: we show both inclusions.
Languages generated by regular expressions ⊆ Regular languages
Languages generated by regular expressions ⊇ Regular languages
Proof - Part 1
Languages generated by regular expressions ⊆ Regular languages
For any regular expression r, the language L(r) is regular.
Proof by induction on the size of r.

Induction Basis
Primitive regular expressions: ∅, ε, a
Corresponding NFAs accept regular languages:
L(M1) = ∅ = L(∅)
L(M2) = {ε} = L(ε)
L(M3) = {a} = L(a)
Inductive Hypothesis
Suppose that for regular expressions r1 and r2, L(r1) and L(r2) are regular languages.
Inductive Step
We will prove that the following are regular languages:
L(r1 + r2)
L(r1 · r2)
L(r1*)
L((r1))
Using the closure of regular languages under these operations, we can recursively construct the NFA M that accepts L(M) = L(r).
Example: r = r1 + r2
L(M1) = L(r1) and L(M2) = L(r2) give an NFA M with L(M) = L(r)
Regular expression r1: NFA M1 with a single accepting state
Regular expression r2: NFA M2 with a single accepting state

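Union
NFA for r1 + r2
[Figure: M1 and M2 placed in parallel and connected by ε-transitions]
Concatenation
NFA for r1·r2
[Figure: M1 followed by M2, connected by an ε-transition]
Star Operation
NFA for r1*
[Figure: M1 with added ε-transitions that allow it to be skipped or repeated]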

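Example
r1 = (a*b), L1 = {a^n b : n ≥ 0}
[Figure: NFA M1 with a loop on a followed by a transition on b]
r2 = (ba), L2 = {ba}
[Figure: NFA M2 with a transition on b followed by a transition on a]
Example
NFA for r1 + r2 = (a*b) + (ba)
[Figure: M1 and M2 combined with ε-transitions]
Example
NFA for r1r2 = (a*b)(ba) = (a*bba)
[Figure: M1 followed by M2, joined by an ε-transition]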
• An NFA corresponding to ((aa + b)*(aba)*bab)*
[Figure: NFA with ε-transitions assembled from sub-automata for aa, b, aba and bab]
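The recursive construction used in the proof can be sketched in code. This is one possible arrangement of the ε-transitions (a Thompson-style construction), not necessarily the exact layout of the figures; the class and function names are made up for this illustration. Every fragment keeps a single start state and a single accepting state, and the result is compared against Python's `re` on short strings.

```python
import re
from itertools import count, product

_ids = count()  # source of fresh state numbers

class NFA:
    """ε-NFA fragment with one start state and one accepting state."""
    def __init__(self):
        self.start, self.accept = next(_ids), next(_ids)
        self.edges = []  # triples (source, symbol, target); symbol None means ε

def symbol(a):
    m = NFA()
    m.edges = [(m.start, a, m.accept)]
    return m

def union(m1, m2):
    m = NFA()
    m.edges = m1.edges + m2.edges + [
        (m.start, None, m1.start), (m.start, None, m2.start),
        (m1.accept, None, m.accept), (m2.accept, None, m.accept)]
    return m

def concat(m1, m2):
    m = NFA()
    m.edges = m1.edges + m2.edges + [
        (m.start, None, m1.start), (m1.accept, None, m2.start),
        (m2.accept, None, m.accept)]
    return m

def star(m1):
    m = NFA()
    m.edges = m1.edges + [
        (m.start, None, m1.start), (m1.accept, None, m.accept),
        (m.start, None, m.accept),   # zero repetitions
        (m.accept, None, m.start)]   # go round again
    return m

def word(s):
    """NFA for a non-empty string, built by repeated concatenation."""
    m = symbol(s[0])
    for a in s[1:]:
        m = concat(m, symbol(a))
    return m

def eps_closure(m, states):
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for src, sym, dst in m.edges:
            if src == q and sym is None and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def accepts(m, w):
    current = eps_closure(m, {m.start})
    for a in w:
        moved = {dst for src, sym, dst in m.edges if src in current and sym == a}
        current = eps_closure(m, moved)
    return m.accept in current

# r = ((aa + b)*(aba)*bab)*
r = star(concat(concat(star(union(word("aa"), symbol("b"))),
                       star(word("aba"))),
                word("bab")))

reference = re.compile(r"((aa|b)*(aba)*bab)*")
agreement = all(
    accepts(r, w) == (reference.fullmatch(w) is not None)
    for n in range(8)
    for w in map("".join, product("ab", repeat=n))
)
```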
Problems
• Construct an NFA with null-moves, which
accepts the language defined by:
• ((0 + 1)* 10 + (00)* (11)*)*
• 01[((10)+ + 111)* + 0]* 1
• (a / b)* ab.
Problems
• Construct a DFA for the regular expression 10 + (0 + 11)
• Construct a DFA for the regular expression (0 + 1)* (00 + 11)
Review
• Construction of RE from Language
• RE and DFA
• Conversion of RE to DFA
Proof - Part 2
Languages generated by regular expressions ⊇ Regular languages
For any regular language L there is a regular expression r with L(r) = L.
We will convert an NFA that accepts L to a regular expression.
Arden’s Theorem
In order to find out a regular expression of a
Finite Automaton, we use Arden’s Theorem
along with the properties of regular
expressions.
Statement:
Let P, Q and R be regular expressions.
If P does not contain the null string, then
R = Q + RP has a unique solution, that is
R = QP*
Proof:
R = Q + (Q + RP)P [After putting the value R = Q + RP]
= Q + QP + RP^2
When we put the value of R recursively again and again, we get the following equation:
R = Q + QP + QP^2 + QP^3 + …
R = Q (ε + P + P^2 + P^3 + …)
R = QP* [As P* represents (ε + P + P^2 + P^3 + …)]
Hence, proved.
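The statement can be checked on small examples by working with finite sets of strings truncated at a length bound. This is a minimal sketch under stated assumptions: the sample languages Q and P and the helper names are invented for illustration, and all comparisons are only up to length n. The second half shows why P must not contain the null string: with ε in P, the equation R = Q + RP also has other solutions.

```python
from itertools import product

def concat(A, B, n):
    """Concatenation of two languages, keeping strings of length <= n."""
    return {u + v for u in A for v in B if len(u + v) <= n}

def star(P, n):
    """Kleene star of P, truncated to strings of length <= n."""
    result, frontier = {""}, {""}
    while frontier:
        frontier = concat(frontier, P, n) - result
        result |= frontier
    return result

n = 8
Q = {"b", "ab"}   # sample language (assumption)
P = {"a", "ba"}   # sample language that does not contain ε (assumption)

R = concat(Q, star(P, n), n)                  # R = QP*
satisfies_equation = R == Q | concat(R, P, n)  # R = Q + RP, up to length n

# If P contains ε, uniqueness fails: the set of all strings is another solution.
P_eps = {"", "a"}
everything = {"".join(t) for k in range(n + 1) for t in product("ab", repeat=k)}
other_solution = everything == Q | concat(everything, P_eps, n)
differs_from_QPstar = everything != concat(Q, star(P_eps, n), n)
```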
Assumptions for Applying Arden’s Theorem
• The transition diagram must not have NULL transitions
Method
Step 1: Create equations of the following form for all the states of the DFA having n states with initial state q1 (use incoming edges only, and add ε to the equation of the initial state).
q1 = q1R11 + q2R21 + … + qnRn1 + ε
q2 = q1R12 + q2R22 + … + qnRn2
…
qn = q1R1n + q2R2n + … + qnRnn
Rij represents the set of labels of edges from qi to qj; if no such edge exists, then Rij = ∅.
Step 2: Solve these equations to get the equation for the final state in terms of Rij.
Problem
Construct a regular expression corresponding
to the automata given below:
Solution −
Here q1 is both the initial state and the final state.
The equations for the three states q1, q2, and q3 are as follows:
q1 = q1a + q3a + ε (the ε term is there because q1 is the initial state)
q2 = q1b + q2b + q3b
q3 = q2a
Now, we will solve these three equations:
q2 = q1b + q2b + q3b
= q1b + q2b + (q2a)b (Substituting value of q3)
= q1b + q2(b + ab)
= q1b(b + ab)* (Applying Arden’s Theorem)
q1 = q1a + q3a + ε
= q1a + q2aa + ε (Substituting value of q3)
= q1a + q1b(b + ab)*aa + ε (Substituting value of q2)
= q1(a + b(b + ab)*aa) + ε
= ε(a + b(b + ab)*aa)* (Applying Arden’s Theorem)
= (a + b(b + ab)*aa)*
Hence, the regular expression is
(a + b(b + ab)*aa)*.
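The derived expression can be cross-checked against the automaton itself. The sketch below rebuilds the transitions from the three state equations (each term qiX in the equation for qj is read as an edge from qi to qj labeled X) and takes q1 as both the initial and the final state, as the ε term in the q1 equation indicates. It compares the DFA with the expression on all strings up to length 8; this is a sanity check, not a proof.

```python
import re
from itertools import product

# Edges read off the equations:
#   q1 = q1a + q3a + ε,  q2 = q1b + q2b + q3b,  q3 = q2a
delta = {
    ("q1", "a"): "q1", ("q3", "a"): "q1",
    ("q1", "b"): "q2", ("q2", "b"): "q2", ("q3", "b"): "q2",
    ("q2", "a"): "q3",
}

def dfa_accepts(w, start="q1", final="q1"):
    """Run the DFA on w and report whether it ends in the final state."""
    state = start
    for symbol in w:
        state = delta[(state, symbol)]
    return state == final

# (a + b(b + ab)*aa)* in Python syntax
derived = re.compile(r"(a|b(b|ab)*aa)*")

mismatches = [
    w
    for n in range(9)
    for w in map("".join, product("ab", repeat=n))
    if dfa_accepts(w) != (derived.fullmatch(w) is not None)
]
```

An empty `mismatches` list means the two descriptions agree on every string that was tried.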
Problem
Solution −
Here the initial state is q1 and the final state is q2.
Now we write down the equations:
q1 = q1·0 + ε
q2 = q1·1 + q2·0
q3 = q2·1 + q3·0 + q3·1
Now, we will solve these three equations:
q1 = ε0* [By Arden’s theorem]
So, q1 = 0* [As εR = R]
q2 = 0*1 + q2·0
So, q2 = 0*1(0)* [By Arden’s theorem]
Hence, the regular expression is 0*10*.
• Construct the regular expressions for the
following DFAs:
Find the regular expression for the following.
(Nov-2017, 4 Marks)
Review
• Construction of RE from Language
• RE and DFA
• Conversion of RE to DFA
• Conversion of DFA to RE
Construct Regular Expression for the following transition
diagram using Arden’s theorem.(Nov 2014 4 Marks)
State Elimination Method
• From M construct the equivalent Generalized Transition Graph, in which transition labels are regular expressions
Example: [Figure: M with transitions labeled a, c and "a, b"; in the corresponding generalized transition graph the labels are a, c and a + b]
Problem
q1 = q1a + q3a + ε
= q1a + q2aa + ε (Substituting value of q3)
= q1a + q1b(b + ab)*aa + ε (Substituting value of q2)
= q1(a + b(b + ab)*aa) + ε
= ε(a + b(b + ab)*aa)*
= (a + b(b + ab)*aa)*
Hence, the regular expression is (a + b(b + ab)*aa)*.
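• Another Example:
[Figure: automaton M on states q0, q1, q2 with transitions labeled a, b and "a, b"; in the generalized transition graph the transition labels are regular expressions, so the label "a, b" from q1 to q2 becomes a + b]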
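In General
• Removing a state:
[Figure: state q, which has a self-loop labeled e and edges labeled a, b, c, d connecting it to qi and qj, is removed]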
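By repeating the process until two states are left, the resulting graph is:
[Figure: the initial graph reduced to a graph on q0 and qf, with a self-loop r1 on q0, an edge r2 from q0 to qf, an edge r3 from qf to q0, and a self-loop r4 on qf]
The resulting regular expression:
r = r1* r2 (r4 + r3 r1* r2)*
L(r) = L(M) = L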
End of Proof-Part 2
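The final two-state formula r = r1* r2 (r4 + r3 r1* r2)* can be tried on a small case. The sketch below assumes the usual reading of the labels (r1 a self-loop on q0, r2 from q0 to qf, r3 from qf back to q0, r4 a self-loop on qf) and uses a made-up two-state automaton: q0 loops on a and moves to qf on b, while qf loops on b and returns to q0 on a. That automaton accepts exactly the strings ending in b.

```python
import re
from itertools import product

def two_state_regex(r1, r2, r3, r4):
    """Build r = r1* r2 (r4 + r3 r1* r2)* in Python regex syntax."""
    return f"({r1})*({r2})(({r4})|({r3})({r1})*({r2}))*"

# Hypothetical automaton: r1 = a, r2 = b, r3 = a, r4 = b
pattern = re.compile(two_state_regex("a", "b", "a", "b"))

agrees_with_automaton = all(
    (pattern.fullmatch(w) is not None) == w.endswith("b")
    for n in range(9)
    for w in map("".join, product("ab", repeat=n))
)
```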
Standard Representations of Regular Languages
Regular languages can be given as DFAs, NFAs, or regular expressions.
When we say "We are given a regular language L", we mean: language L is in a standard representation (DFA, NFA, or regular expression).
Non-regular languages (Pumping Lemma)
{a^n b^n : n ≥ 0}
{vv^R : v ∈ {a, b}*}
Regular languages
a*b, b*c + a, b + c(a + b)*, etc.
How can we prove that a language L is not regular?
Prove that there is no DFA or NFA or RE that accepts L.
Difficulty: this is not easy to prove (since there is an infinite number of them).
Solution: use the Pumping Lemma!
The Pigeonhole Principle
4 pigeons, 3 pigeonholes: a pigeonhole must contain at least two pigeons.
In general: n pigeons, m pigeonholes, n > m: there is a pigeonhole with at least 2 pigeons.
The Pigeonhole Principle and DFAs
Consider a DFA with 4 states
[Figure: DFA with states q1, q2, q3, q4 and transitions labeled a and b]
Consider the walk of a "long" string: aaaab (length at least 4)
A state is repeated in the walk of aaaab:
q1 -a-> q2 -a-> q3 -a-> q2 -a-> q3 -b-> q4
The state is repeated as a result of the pigeonhole principle.
Walk of aaaab
Pigeons (walk states): q1, q2, q3, q2, q3, q4
are more than
Nests (automaton states): q1, q2, q3, q4
so a state is repeated.
Consider the walk of a "long" string: aabb (length at least 4)
Due to the pigeonhole principle, a state is repeated in the walk of aabb:
q1 -a-> q2 -a-> q3 -b-> q4 -b-> q4
Walk of aabb
Pigeons (walk states): q1, q2, q3, q4, q4
are more than
Nests (automaton states): q1, q2, q3, q4
so a state is repeated.
In General: If |w| ≥ # states of DFA, by the pigeonhole principle, a state is repeated in the walk of w.
Walk of w = σ1 σ2 ... σk in an arbitrary DFA:
q1 -σ1-> ... -σi-> qi -σ(i+1)-> ... -σj-> qi -σ(j+1)-> ... -σk-> qz
where qi is the repeated state.
|w| ≥ # states of DFA = m
Pigeons (walk states): q1, ..., qi, ..., qi, ..., qz
are more than
Nests (automaton states): q1, q2, ..., qi, ..., q(m−1), qm
so a state is repeated.
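The pigeonhole argument can be traced mechanically. In this small sketch the transition table contains only the transitions that can be read off the two walks shown above (the rest of the 4-state DFA is not recoverable from the text and is not needed here), and the function name is an assumption made for illustration.

```python
def first_repeated_state(delta, start, w):
    """Follow w through the DFA. Return (state, i, j) for the first state that
    is visited a second time, where i and j are the numbers of symbols consumed
    at its two visits, or None if no state repeats."""
    seen = {start: 0}
    state = start
    for pos, symbol in enumerate(w, start=1):
        state = delta[(state, symbol)]
        if state in seen:
            return state, seen[state], pos
        seen[state] = pos
    return None

# Transitions taken from the walks of aaaab and aabb
delta = {
    ("q1", "a"): "q2",
    ("q2", "a"): "q3",
    ("q3", "a"): "q2",
    ("q3", "b"): "q4",
    ("q4", "b"): "q4",
}

walk_aaaab = first_repeated_state(delta, "q1", "aaaab")
walk_aabb = first_repeated_state(delta, "q1", "aabb")
walk_aab = first_repeated_state(delta, "q1", "aab")  # length 3 < 4 states
```

For a repeated state found at positions i and j, the symbols between the two visits form the loop that the pumping lemma later calls y.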
The Pumping Lemma
Take an infinite regular language L (contains an infinite number of strings).
There exists a DFA with m states that accepts L.
Take a string w ∈ L with |w| ≥ m (number of states of the DFA); then at least one state is repeated in the walk of w.
Walk in DFA of w = σ1 σ2 ... σk: σ1 σ2 ... q ... σk, where q is a repeated state in the DFA.
There could be many states repeated. Take q to be the first state repeated.
One-dimensional projection of the walk of w:
σ1 σ2 ... σi q σ(i+1) ... σj q σ(j+1) ... σk
(first and second occurrence of q; the states before the second occurrence are unique)
Review
• DFA to RE using State Elimination Method
Regular Languages: Grand Unification
L(NFA-εs) = L(NFAs) = L(DFAs) (Parallel simulation; Rabin and Scott’s work)
L(FA) = L(RE) (Collapsing graphs; structural induction; S. Kleene’s work)
L(FA) = L(RG) (Construction)
L(RG) = L(RE) (Solving linear equations; study later)
Pumping Lemma for Regular Languages
• It is a necessary condition.
– Every regular language satisfies it.
– If a language violates it, it is not regular.
• RL ⇒ PL; equivalently, not PL ⇒ not RL
• It is not a sufficient condition.
– Not every non-regular language violates it.
• not RL =>? PL or not PL (no conclusion)
We can write w = xyz
One-dimensional projection of the walk of w:
σ1 σ2 ... σi q σ(i+1) ... σj q σ(j+1) ... σk
(first and second occurrence of q)
x = σ1 ... σi, y = σ(i+1) ... σj, z = σ(j+1) ... σk
In the DFA: w = x y z, where x leads to the first occurrence of q, y is the loop at q, and z is the rest of the walk.
Observation: length |xy| ≤ m (the number of states of the DFA), since in xy no state is repeated (except q).
Observation: length |y| ≥ 1, since there is at least one transition in the loop.
We do not care about the form of string z.
z may actually overlap with the paths of x and y.
Additional string: the string xz is accepted (do not follow the loop y).
Additional string: the string xyyz is accepted (follow the loop y 2 times).
Additional string: the string xyyyz is accepted (follow the loop y 3 times).
In general: the string x y^i z is accepted for i = 0, 1, 2, ... (follow the loop y i times).
Therefore: x y^i z ∈ L for i = 0, 1, 2, ..., where L is the language accepted by the DFA.
In other words, we described:
The Pumping Lemma !!!

The Pumping Lemma:
• Given an infinite regular language L
• there exists an integer m (critical length)
• for any string w ∈ L with length |w| ≥ m
• we can write w = x y z
• with |xy| ≤ m and |y| ≥ 1
• such that: x y^i z ∈ L for i = 0, 1, 2, ...
For all sufficiently long strings (w), there exist a non-null prefix (xy) and substring (y) such that all repetitions of the substring (y) give strings in the language:
∀ w ∈ L : |w| ≥ m ⇒
∃ x, y, z : (xyz = w)
∧ (|xy| ≤ m) ∧ (|y| > 0)
∧ (∀ i : i ≥ 0 ⇒ x y^i z ∈ L)
In the book:
Critical length m = Pumping length p
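The quantifier structure of the lemma can be made concrete with a small search. The sketch below is illustrative only: the function names are assumptions, and it tests pumping only for i = 0..3, so a `False` result means "this split survived the tested values of i", not that the language is regular. A `True` result for a chosen w and m does show that w cannot be pumped, which is the situation exploited in non-regularity proofs.

```python
def unpumpable(w, m, in_language, max_i=3):
    """True if every split w = xyz with |xy| <= m and |y| >= 1 has some
    i in 0..max_i for which x y^i z is not in the language."""
    for xy_len in range(1, min(m, len(w)) + 1):
        for x_len in range(xy_len):
            x, y, z = w[:x_len], w[x_len:xy_len], w[xy_len:]
            if all(in_language(x + y * i + z) for i in range(max_i + 1)):
                return False  # this split survives pumping for the tested i
    return True

def is_anbn(s):
    """Membership test for {a^n b^n : n >= 0}."""
    n = len(s) // 2
    return s == "a" * n + "b" * n

def is_astar_bstar(s):
    """Membership test for L(a*b*): no b is followed by an a."""
    return "ba" not in s

m = 5                      # stand-in for the critical length (assumption)
w = "a" * m + "b" * m

anbn_cannot_be_pumped = unpumpable(w, m, is_anbn)
astar_bstar_cannot_be_pumped = unpumpable(w, m, is_astar_bstar)
```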

Applications of the Pumping Lemma

Observation:
Every language of finite size has to be regular (we can easily construct an NFA that accepts every string in the language).
Therefore, every non-regular language has to be of infinite size (contains an infinite number of strings).
Suppose you want to prove that an infinite language L is not regular:
1. Assume the opposite: L is regular
2. The pumping lemma should hold for L
3. Use the pumping lemma to obtain a contradiction
4. Therefore, L is not regular

Explanation of Step 3: How to get a contradiction
1. Let m be the critical length for L
2. Choose a particular string w ∈ L which satisfies the length condition |w| ≥ m
3. Write w = xyz, with |xy| ≤ m and |y| ≥ 1
4. Show that w' = x y^i z ∉ L for some i ≠ 1
5. This gives a contradiction, since from the pumping lemma w' = x y^i z ∈ L
Note: It suffices to show that only one string w ∈ L gives a contradiction.
You don’t need to obtain a contradiction for every w ∈ L.
Example of Pumping Lemma application
Theorem: The language L = {a^n b^n : n ≥ 0} is not regular.
Proof: Use the Pumping Lemma.
Assume for contradiction that L is a regular language.
Since L is infinite, we can apply the Pumping Lemma.
Let m be the critical length for L.
Pick a string w such that w ∈ L and length |w| ≥ m.
We pick w = a^m b^m.
From the Pumping Lemma, we can write w = a^m b^m = x y z with lengths |xy| ≤ m, |y| ≥ 1.
Since xy lies within the first m symbols of w, which are all a's:
y = a^k, 1 ≤ k ≤ m
From the Pumping Lemma: x y^i z ∈ L for i = 0, 1, 2, ...
Thus: x y^2 z ∈ L
x y^2 z = a^(m+k) b^m
Thus: a^(m+k) b^m ∈ L, with k ≥ 1
BUT: L = {a^n b^n : n ≥ 0}, so a^(m+k) b^m ∉ L
CONTRADICTION!!!
Therefore: Our assumption that L is a regular language is not true.
Conclusion: L is not a regular language.
END OF PROOF
Non-regular language: {a^n b^n : n ≥ 0}
Regular language: L(a*b*)
More Applications of the Pumping Lemma

Non-regular language: L = {vv^R : v ∈ Σ*}
Theorem: The language L = {vv^R : v ∈ Σ*}, Σ = {a, b}, is not regular.
Proof: Use the Pumping Lemma.
Assume for contradiction that L is a regular language.
Since L is infinite, we can apply the Pumping Lemma.
Let m be the critical length for L.
Pick a string w such that w ∈ L and length |w| ≥ m.
We pick w = a^m b^m b^m a^m.
We can write w = a^m b^m b^m a^m = x y z with lengths |xy| ≤ m, |y| ≥ 1.
Since xy lies within the first m symbols of w, which are all a's:
y = a^k, 1 ≤ k ≤ m
From the Pumping Lemma: x y^i z ∈ L for i = 0, 1, 2, ...
Thus: x y^2 z ∈ L
x y^2 z = a^(m+k) b^m b^m a^m
Thus: a^(m+k) b^m b^m a^m ∈ L, with k ≥ 1
BUT: L = {vv^R : v ∈ Σ*}, so a^(m+k) b^m b^m a^m ∉ L
CONTRADICTION!!!
END OF PROOF
Non-regular language: L = {a^n b^l c^(n+l) : n, l ≥ 0}
Theorem: The language L = {a^n b^l c^(n+l) : n, l ≥ 0} is not regular.
Proof: Use the Pumping Lemma.
Assume for contradiction that L is a regular language.
Since L is infinite, we can apply the Pumping Lemma.
Let m be the critical length of L.
Pick a string w such that w ∈ L and length |w| ≥ m.
We pick w = a^m b^m c^(2m).
We can write w = a^m b^m c^(2m) = x y z with lengths |xy| ≤ m, |y| ≥ 1.
Since xy lies within the first m symbols of w, which are all a's:
y = a^k, 1 ≤ k ≤ m
From the Pumping Lemma: x y^i z ∈ L for i = 0, 1, 2, ...
Thus: x y^0 z = xz ∈ L
xz = a^(m−k) b^m c^(2m)
Thus: a^(m−k) b^m c^(2m) ∈ L, with k ≥ 1
BUT: L = {a^n b^l c^(n+l) : n, l ≥ 0} and (m − k) + m ≠ 2m, so a^(m−k) b^m c^(2m) ∉ L
CONTRADICTION!!!
END OF PROOF
Non-regular language: L = {a^(n!) : n ≥ 0}
Theorem: The language L = {a^(n!) : n ≥ 0} is not regular, where n! = 1 · 2 ··· (n − 1) · n.
Proof: Use the Pumping Lemma.
Assume for contradiction that L is a regular language.
Since L is infinite, we can apply the Pumping Lemma.
Let m be the critical length of L; we may assume m > 1, since any larger value is also a critical length.
Pick a string w such that w ∈ L and length |w| ≥ m.
We pick w = a^(m!).
We can write w = a^(m!) = x y z with lengths |xy| ≤ m, |y| ≥ 1.
Thus: y = a^k, 1 ≤ k ≤ m
From the Pumping Lemma: x y^i z ∈ L for i = 0, 1, 2, ...
Thus: x y^2 z ∈ L
x y^2 z = a^(m!+k)
Thus: a^(m!+k) ∈ L, with 1 ≤ k ≤ m
Since L = {a^(n!) : n ≥ 0}, there must exist p such that m! + k = p!
However, for m > 1: m! < m! + k ≤ m! + m < m! · m + m! = (m + 1)!
so m! + k lies strictly between two consecutive factorials, and m! + k ≠ p! for any p.
BUT then a^(m!+k) ∉ L
CONTRADICTION!!!
END OF PROOF
Review
• Pumping Lemma
Closure Properties of
Regular Languages
Regular Languages
• If ∑ is an alphabet, the set R of regular languages over ∑ is defined as follows.
• 1. The language ∅ is an element of R, and for every a ∈ ∑, the language {a} is in R.
• 2. For any two languages L1 and L2 in R, the three languages L1 ∪ L2, L1L2, and L1* are elements of R.
For regular languages L1 and L2 we will prove that the following are regular languages:
Union: L1 ∪ L2
Concatenation: L1L2
Star: L1*
Reversal: L1^R
Complement: L1^c (the complement of L1)
Intersection: L1 ∩ L2
We say: Regular languages are closed under union, concatenation, star, reversal, complement, and intersection.
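A useful transformation: use one accept state
[Figure: an NFA with 2 accept states and an equivalent NFA with 1 accept state, obtained by adding ε-transitions from the old accept states to a single new accept state]
In General
[Figure: any NFA is converted to an equivalent NFA with a single accepting state in the same way]
Extreme case
NFA without accepting state: add an accepting state without transitions.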
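Take two languages
Regular language L1: L(M1) = L1, NFA M1 with a single accepting state
Regular language L2: L(M2) = L2, NFA M2 with a single accepting state
Example
L1 = {a^n b : n ≥ 0}
[Figure: NFA M1 with a loop on a and a transition on b]
L2 = {ba}
[Figure: NFA M2 with a transition on b followed by a transition on a]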
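Union
NFA for L1 ∪ L2
[Figure: M1 and M2 placed in parallel and connected by ε-transitions]
Example
NFA for L1 ∪ L2 = {a^n b} ∪ {ba}
[Figure: the NFAs for L1 = {a^n b} and L2 = {ba} combined with ε-transitions]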
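Concatenation
NFA for L1L2
[Figure: M1 followed by M2, connected by an ε-transition]
Example
NFA for L1L2 = {a^n b}{ba} = {a^n bba}
[Figure: the NFA for L1 = {a^n b} followed by the NFA for L2 = {ba}, joined by an ε-transition]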
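Star Operation
NFA for L1*
w = w1 w2 ... wk with each wi ∈ L1; also ε ∈ L1*
[Figure: M1 with added ε-transitions that allow it to be skipped or repeated]
Example
NFA for L1* = {a^n b}*
[Figure: the NFA for L1 = {a^n b} with the added ε-transitions]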
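Reverse
NFA for L1^R: start from M1, which accepts L1, and build M1', which accepts L1^R:
1. Reverse all transitions
2. Make the initial state the accepting state, and vice versa
Example
L1 = {a^n b}
[Figure: NFA M1 with a loop on a and a transition on b]
L1^R = {b a^n}
[Figure: NFA M1' with all transitions of M1 reversed]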
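Complement
Start from M1, which accepts L1, and build M1', which accepts the complement of L1:
1. Take the DFA that accepts L1
2. Make accepting states non-final, and vice versa
Example
L1 = {a^n b}
[Figure: DFA M1 with transitions labeled a, b and "a, b"]
Complement of L1 = {a, b}* − {a^n b}
[Figure: DFA M1' with the same transitions as M1 and the accepting and non-accepting states swapped]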
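The complement construction amounts to swapping the accepting and non-accepting states of a complete DFA. In the sketch below the state names q0, q1, q2 and the transition table for L1 = {a^n b} are assumptions (a loop on a, one b-step into the accepting state, then a trap state), since the figure itself is not available.

```python
import re
from itertools import product

# Assumed complete DFA for L1 = {a^n b : n >= 0}
states = {"q0", "q1", "q2"}
delta = {
    ("q0", "a"): "q0", ("q0", "b"): "q1",
    ("q1", "a"): "q2", ("q1", "b"): "q2",   # anything after the b is rejected
    ("q2", "a"): "q2", ("q2", "b"): "q2",   # trap state
}
accepting = {"q1"}
complement_accepting = states - accepting   # swap final and non-final states

def run(w, finals, start="q0"):
    """Run the DFA on w and test membership of the last state in `finals`."""
    state = start
    for symbol in w:
        state = delta[(state, symbol)]
    return state in finals

words = ["".join(t) for n in range(7) for t in product("ab", repeat=n)]

# Every string is accepted by exactly one of the two automata.
exactly_one = all(run(w, accepting) != run(w, complement_accepting) for w in words)
# The original automaton agrees with the expression a*b.
matches_L1 = all(run(w, accepting) == (re.fullmatch("a*b", w) is not None) for w in words)
```

The swap only works on a DFA with a transition for every state and symbol; on an NFA it does not in general yield the complement.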
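Intersection
L1 regular and L2 regular: we show that L1 ∩ L2 is regular.
DeMorgan’s Law: L1 ∩ L2 = complement of (complement of L1 ∪ complement of L2)
L1, L2 regular
⇒ complement of L1, complement of L2 regular
⇒ complement of L1 ∪ complement of L2 regular
⇒ complement of (complement of L1 ∪ complement of L2) regular
⇒ L1 ∩ L2 regular
Example
L1 = {a^n b} regular, L2 = {ab, ba} regular
L1 ∩ L2 = {ab} regular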
Another Proof for Intersection Closure
Machine M1: DFA for L1. Machine M2: DFA for L2.
Construct a new DFA M that accepts L1 ∩ L2.
M simulates M1 and M2 in parallel.
States in M: pairs (qi, pj), where qi is a state in M1 and pj is a state in M2.
New transition: if M1 has the transition q1 -a-> q2 and M2 has the transition p1 -a-> p2, then M has the transition (q1, p1) -a-> (q2, p2).
New initial state: if q0 is the initial state of M1 and p0 is the initial state of M2, then (q0, p0) is the initial state of M.
New accept states: if qi is an accept state of M1 and pj, pk are accept states of M2, then (qi, pj) and (qi, pk) are accept states of M. Both constituents must be accepting states.

Example:
L1 = {a^n b : n ≥ 0}, L2 = {a b^m : m ≥ 0}
[Figure: DFA M1 on states q0, q1, q2 and DFA M2 on states p0, p1, p2]
Automaton for intersection
L = {a^n b} ∩ {a b^m} = {ab}
[Figure: product DFA on the states (q0, p0), (q0, p1), (q1, p1), (q2, p2), (q1, p2), (q0, p2), (q2, p1)]
M simulates M1 and M2 in parallel.
M accepts string w if and only if M1 accepts string w and M2 accepts string w.
L(M) = L(M1) ∩ L(M2)
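The parallel simulation can be written as a product construction over reachable state pairs. The sketch below is an illustration under assumptions: the dictionaries encode DFAs for L1 = {a^n b} and L2 = {a b^m} as they can be read from the example (q2 and p2 acting as trap states), and the function names are invented here.

```python
from itertools import product

M1 = {  # DFA for L1 = {a^n b : n >= 0}
    "start": "q0", "accept": {"q1"},
    "delta": {("q0", "a"): "q0", ("q0", "b"): "q1",
              ("q1", "a"): "q2", ("q1", "b"): "q2",
              ("q2", "a"): "q2", ("q2", "b"): "q2"},
}
M2 = {  # DFA for L2 = {a b^m : m >= 0}
    "start": "p0", "accept": {"p1"},
    "delta": {("p0", "a"): "p1", ("p0", "b"): "p2",
              ("p1", "a"): "p2", ("p1", "b"): "p1",
              ("p2", "a"): "p2", ("p2", "b"): "p2"},
}

def intersect(A, B, alphabet="ab"):
    """Product DFA restricted to state pairs reachable from the start pair."""
    start = (A["start"], B["start"])
    delta, todo, seen = {}, [start], {start}
    while todo:
        q, p = todo.pop()
        for s in alphabet:
            nxt = (A["delta"][(q, s)], B["delta"][(p, s)])
            delta[((q, p), s)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    # Both constituents must be accepting states.
    accept = {(q, p) for (q, p) in seen if q in A["accept"] and p in B["accept"]}
    return {"start": start, "accept": accept, "delta": delta, "states": seen}

def accepts(M, w):
    state = M["start"]
    for s in w:
        state = M["delta"][(state, s)]
    return state in M["accept"]

M = intersect(M1, M2)
accepted = sorted(
    w for n in range(6) for w in map("".join, product("ab", repeat=n))
    if accepts(M, w)
)
```

Building only the reachable pairs keeps the product small; here it has 7 of the 9 possible state pairs.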
Applications of Regular Expression
• Regular Expressions in Lexical Analysis
• Regular Expressions in Web Search Engines
• Regular Expressions in Software Engineering
•Thank You….
