
CS301-Theory of Automata

Subhash Sagar
Email: Subhash.sagar@nu.edu.pk

1
Instructor Contacts
 Subhash Sagar

 Room-16, Opposite Computer Science Secretariat.


 subhash.sagar@nu.edu.pk

 Office Hours: Will be decided soon.

2
Pre-requisites
 Discrete Structure

3
Books (Text and Reference)
(Textbook)
“Introduction to Automata Theory, Languages and Computation”, By J.E.
Hopcroft, R. Motwani, J.D. Ullman, 3rd Edition, Addison Wesley/Pearson.
(Reference)

“Introduction to Computer Theory”, by Daniel I. Cohen, John Wiley and Sons,


Inc., 1991, Second Edition.
“Introduction to Languages and Theory of Computation”, by J. C. Martin,
McGraw Hill Book Co., 1997, Second Edition

4
Grading Policy
 Assignments, Quizzes and Projects: 25%
 Midterm Exam: 25%
 Final: 50%

5
Objectives
 Introduce concepts in automata theory and theory of computation
 Identify different formal language classes and their relationships
 Design grammars and recognizers for different formal languages
 Prove or disprove theorems in automata theory using its properties
 Determine the decidability and intractability of computational
problems

6
Course Outline
 Introduction
 Formal Languages and Finite Automata
 Grammars
 Turing Machines
 Undecidability and intractable problems

7
Lecture basics
 Classes will involve both Slides + Board (to roughly equal degrees)
 Lecture slides available online
– However, no scribes from class will be made available
– So, take your own notes in class

 For latest/updated slides, download before each use


 Use of laptops and smart phones not allowed in classroom.

8
Introduction to Theory of Automata

9
(A pioneer of automata theory)

Alan Turing (1912-1954)


 Father of Modern Computer Science
 English mathematician
 Studied abstract machines called Turing machines
even before computers existed
 Heard of the Turing test?

10
Theory of Computation: A Historical Perspective
1930s • Alan Turing studies Turing machines
• Decidability
• Halting problem
1940s-1950s • “Finite automata” machines studied
• Noam Chomsky proposes the “Chomsky Hierarchy” for formal languages
1969 • Cook introduces “intractable” problems, or “NP-Hard” problems
1970 onward • Modern computer science: compilers, computational & complexity theory evolve

11
Introduction to Automata Theory
 The word “automata” is the plural of “automaton”, which simply means any
machine or “something that works automatically”.
 Automata theory is the study of abstract machines and problems they are
able to solve.
 Automata theory is closely related to formal language theory as the
automata are often classified by the class of formal languages they are able
to recognize.
 Theoretical developments bear directly on what computer scientists do
today
– Finite automata, formal grammars: design/ construction of software
– Turing machines: help us understand what we can expect from software
– Theory of intractable problems: are we likely to be able to write a program to solve a
given problem, or should we try an approximation or a heuristic?

12
Abstract Machine
 Abstract devices are (simplified) models of real computation.
 An abstract machine, also called an abstract computer, is a theoretical
model of a computer hardware or software system used in Automata
theory.
 Abstraction of computing processes is used in both the computer science
and computer engineering disciplines and usually assumes discrete time
paradigm.
 Abstract machines are often used in thought experiments regarding
computability or to analyze the complexity of algorithms.

13
Models or Abstract Models
 The construction of models is one of the essentials of any scientific
discipline.
 The usefulness of a discipline is often dependent on the existence of simple,
yet powerful, theories and laws.

14
Automaton
 An automaton is an abstract model of a digital computer
 It has a mechanism to read input (a string over a given alphabet, e.g. strings of
0’s and 1’s over Σ = {0,1}) written on an input file.
 A finite automaton has a set of states
 Its control moves from state to state in response to external “inputs”

15
Automaton
 With every automaton, a transition function is associated which gives the
next state in terms of the current state and the current input symbol
 An automaton can be represented by a graph in which the vertices give the
internal states and the edges transitions
 The labels on the edges show what happens (in terms of input and output)
during the transitions
 An automaton operates in discrete time frame
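
The graph view of an automaton can be sketched in Python as a table of transitions. This is only an illustration: the particular machine (two states, accepting binary strings that end in 0) and the names TRANSITIONS, START, ACCEPTING and accepts are assumptions made for the example, not something defined in these slides.

```python
# Hypothetical two-state automaton over the alphabet {0, 1}.
# Vertices of the graph are the states q0 and q1; each dictionary entry
# is one labelled edge: (current state, input symbol) -> next state.
TRANSITIONS = {
    ("q0", "0"): "q1",
    ("q0", "1"): "q0",
    ("q1", "0"): "q1",
    ("q1", "1"): "q0",
}
START = "q0"
ACCEPTING = {"q1"}  # q1 means "the last symbol read was 0"


def accepts(word):
    """Run the automaton, consuming one input symbol per discrete time step."""
    state = START
    for symbol in word:
        state = TRANSITIONS[(state, symbol)]
    return state in ACCEPTING


print(accepts("0110"))  # True
print(accepts("011"))   # False
```

The control unit is just the variable state: it can only be in one of finitely many values, and it changes in a defined manner on each input symbol.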

16
Components of an automaton
 Input file: Contains strings of input symbols
 Storage unit: consists of an unlimited number of cells, each capable of
holding a single symbol from an alphabet
 Control unit: can be in any one of a finite number of internal states and can
change states in defined manner

17
Why Study Automata?
 A variety of properties concerning the models, grammars, and languages will
be proven.
 The existence or non-existence of algorithms for processing languages and
language processors will be proven.
 These algorithms form the basis of tools for processing languages, e.g.,
parsers, compilers, assemblers, etc.
 Other algorithms will form the basis of tools that automatically construct
language processors, e.g., yacc, lex, etc.
– Note that our perspective will be similar to, yet different from a compiler class.
 Additionally, some things will be proven to be non-computable, e.g., the
enhanced compiler.

18
Why Study Automata?
Finite automata are a useful model for many important kinds of software and
hardware:
 Software for designing and checking the behavior of digital circuits
 The lexical analyzer of a typical compiler, that is, the compiler component
that breaks the input text into logical units
 Software for scanning large bodies of text, such as collections of Web pages,
to find occurrences of words, phrases or other patterns.
 Software for verifying systems of all types that have a finite number of
distinct states, such as communications protocols or protocols for the secure
exchange of information.

19
 In general, this subject plays a major role in:
– Theory of Computation
– Compiler Construction
– Parsing
– Formal Verification
– Defining Computer Languages

20
Different kinds of automata

 This was only one example of a computational device, and there are others
 We will look at different devices, and look at these kinds of questions:
– What kinds of problems can a given type of device solve?
– What things are impossible for this kind of device?
– Is one type of device more powerful than another?

21
Some Devices
Finite automata
 Devices with a finite amount of memory. Used to model “small” computers.
Push-down automata
 Devices with infinite memory that can be accessed in a restricted way.
 Used to model parsers, etc.
– Parsing (syntax analysis or syntactic analysis) is the process of analyzing a string of symbols, either in
natural language, computer languages or data structures, conforming to the rules of a formal
grammar.
Turing Machines
 Devices with infinite memory. Used to model any computer.
Time-bounded Turing Machines
 Infinite memory, but bounded running time.
 Used to model any computer program that runs in a “reasonable” amount of time.

22
The Chomsky Hierarchy
• A containment hierarchy of classes of formal languages:

Regular (DFA) ⊂ Context-free grammar (PDAs) ⊂ Context-sensitive grammar (LBA) ⊂ Recursively enumerable (TM)

23
Languages & Grammars
 Languages: “A language is a collection of
sentences (or “words”) of finite length, all constructed
from a finite alphabet of symbols”
 Grammars: “A grammar can be regarded as
a device that enumerates the sentences of a
language” - nothing more, nothing less
 N. Chomsky, Information and Control, Vol 2,
1959

24
The Central Concepts of Automata Theory

25
Alphabets
 Definition:
A finite non-empty set of symbols (letters), is called an alphabet.
It is denoted by Σ ( Greek letter sigma).
 Example:
Σ={a,b}
Σ={0,1} // important, as this is the alphabet in which computers represent data
Σ={i,j,k}

26
Note:
 A certain version of language ALGOL has 113 letters.

 Σ (alphabet) includes letters, digits and a variety of operators including


sequential operators such as GOTO and IF.

27
Powers of an Alphabet
Let Σ be an alphabet.
 Σ^k = the set of all strings of length k over Σ
 Σ* = Σ^0 ∪ Σ^1 ∪ Σ^2 ∪ …
 Σ+ = Σ^1 ∪ Σ^2 ∪ Σ^3 ∪ …

Note: do not confuse Σ with Σ^1. For Σ = {0,1}:
 Σ is an alphabet; its members 0 and 1 are symbols.
 Σ^1 is a set of strings; its members are the strings 0 and 1 (each one of length 1)
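
The powers of an alphabet can be enumerated directly. The sketch below is illustrative only: the function names power and star_up_to are assumptions, and because Σ* is infinite the second function returns only the strings up to a chosen length.

```python
from itertools import product


def power(sigma, k):
    """Return the k-th power of the alphabet: all strings of length k."""
    return {"".join(letters) for letters in product(sorted(sigma), repeat=k)}


def star_up_to(sigma, max_len):
    """Finite slice of the Kleene star: all strings of length 0..max_len."""
    result = set()
    for k in range(max_len + 1):
        result |= power(sigma, k)
    return result


print(sorted(power({"0", "1"}, 2)))   # ['00', '01', '10', '11']
print(power({"0", "1"}, 0))           # {''}  (only the empty string)
print(len(star_up_to({"a", "b"}, 3))) # 15 = 1 + 2 + 4 + 8
```

Note that power(sigma, 1) returns a set of one-symbol strings, which is exactly the distinction between the alphabet and its first power made in the note above.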

28
The * Operation (Kleene Star)
 Σ* : the set of all possible strings from
alphabet Σ (Λ denotes the empty string)

Σ = {a, b}
Σ* = {Λ, a, b, aa, ab, ba, bb, aaa, aab, …}

29
 Σ* : The set of all strings over an alphabet Σ
{0, 1}* = {Λ, 0, 1, 00, 01, 10, 11, 000, …}

 Σ* = Σ^0 ∪ Σ^1 ∪ Σ^2 ∪ … The symbol * is called the Kleene star and is
named after the mathematician and logician Stephen Cole Kleene.
 Σ+ = Σ^1 ∪ Σ^2 ∪ … Thus: Σ* = Σ+ ∪ {Λ}

30
The + Operation
 Σ+ : the set of all possible strings from
alphabet Σ except Λ

Σ = {a, b}
Σ* = {Λ, a, b, aa, ab, ba, bb, aaa, aab, …}

Σ+ = Σ* − {Λ}

Σ+ = {a, b, aa, ab, ba, bb, aaa, aab, …}

31
TASKs
Q1) Is there any case in which S+ contains Λ? If yes, justify your answer.

Q2) For any set of strings S, prove or disprove:
i. (S+)*=(S*)*
ii. (S+)+=S+
iii. (S*)+=(S+)*

32
Valid/Invalid Alphabets:
 Rule: no letter of an alphabet may be a prefix of another letter of the
same alphabet.
 The following sets illustrate the rule; some are valid alphabets and
some are not.
Σ = {a,b} ✓ Valid
Σ = {a,ba,c} ✓ Valid
Σ = {a,ab,c} ☓ Invalid (a is a prefix of ab)

33
Valid/In-valid alphabets
 An alphabet may contain letters that consist of a group of
symbols, for example Σ1= {B, aB, bab, d}.

 Now consider an alphabet
Σ2= {B, Ba, bab, d}

34
Remarks:
When an alphabet contains letters made of more than one
symbol, no letter may start with another letter of the
same alphabet, i.e. one letter must not be a prefix of
another. Otherwise, when a string is read from left to right, it is
unclear where one letter ends and the next begins. A letter may,
however, end with another letter of the same alphabet, i.e. one
letter may be a suffix of another.

35
Conclusion
 Σ1= {B, aB, bab, d}
 Σ2= {B, Ba, bab, d}

Σ1 is a valid alphabet while Σ2 is an in-valid alphabet.
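
The prefix rule can be checked mechanically. The following is a minimal sketch; the function name is_valid_alphabet is an assumption made for this example.

```python
def is_valid_alphabet(letters):
    """Return True if no letter is a prefix of a different letter."""
    for x in letters:
        for y in letters:
            if x != y and y.startswith(x):
                return False  # x is a prefix of y, so the rule is violated
    return True


sigma1 = ["B", "aB", "bab", "d"]
sigma2 = ["B", "Ba", "bab", "d"]
print(is_valid_alphabet(sigma1))  # True: B is only a suffix of aB
print(is_valid_alphabet(sigma2))  # False: B is a prefix of Ba
```

The comparison is case-sensitive, so the lowercase b at the start of bab does not clash with the letter B.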

36
Strings
 Definition:
A concatenation of a finite number of symbols from the alphabet is called a string.

 Example:
If Σ= {a,b} then
a, abab, aaabb, ababababababababab

37
EMPTY String or NULL String
 A string with no symbols at all is called the empty string or null
string. It is denoted by λ (small Greek letter lambda) or Λ (capital
Greek letter lambda).
 The capital lambda will mostly be used to denote the empty string
in further discussion.
 In programming, a null reference and an empty string are different
things. null means that a reference variable points to no string at
all, while the empty string is a real string with Length=0. The empty
string of automata theory corresponds to the second case.

 String s = null; // null reference: no string at all
 String a = ""; // empty (blank) string of length 0

38
String Operations
 String Concatenation:
As the name suggests, string concatenation joins two strings end to
end. For example, if
W=a1a2a3 and V=b1b2b3
then WV=a1a2a3b1b2b3

39
String Operations (Cont.)
 String Reverse:
Reversing a string writes its symbols in the opposite order.
For example:
U=ab then U(reverse)=ba

 Similarly, for a concatenation (UV) the two parts swap places: (UV)reverse=(V)reverse(U)reverse

40
String Operations (Cont.)
 String Length:
The length of a string is the number of symbols it contains. For a string W
with n symbols:
Length = |W|=n
For example:
Σ={a,b}
U=abba
then |U|=4
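
The three string operations map directly onto Python strings. This is a small sketch with one character per symbol; the helper names concat, reverse and length are assumptions made for this example.

```python
def concat(w, v):
    """Join two strings end to end."""
    return w + v


def reverse(w):
    """Write the symbols of w in the opposite order."""
    return w[::-1]


def length(w):
    """Number of symbols in w (one character per symbol)."""
    return len(w)


u, v = "ab", "aab"
print(concat(u, v))             # abaab
print(reverse(concat(u, v)))    # baaba
print(reverse(v) + reverse(u))  # baaba, the same string
print(reverse(u) + reverse(v))  # babaa, a different string
print(length("abba"))           # 4
```

The last two reversals show that reversing a concatenation swaps the order of the parts.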

41
Words
 Definition: Words are strings belonging to some language.

Example:
If Σ= {x} then a language L can be defined as
L={x^n : n=1,2,3,…} or L={x,xx,xxx,…}
Here x,xx,… are the words of L

42
Note:
 All words are strings, but not all strings are words.

43
Some other Operation
w^n = ww…w (w repeated n times)

Example: (abba)^2 = abbaabba

Definition: w^0 = Λ
(abba)^0 = Λ

44
Languages
A language is any subset of Σ*

Example:
Σ = {a, b}
Σ* = {Λ, a, b, aa, ab, ba, bb, aaa, …}
Languages:
{a, aa, aab}
{Λ, abba, baba, aa, ab, aaaaaa}

45
Note that:

Sets: ∅ = { } ≠ {Λ}
Set size: |{ }| = |∅| = 0
Set size: |{Λ}| = 1

46
Another Example
L = {a^n b^n : n ≥ 0}
An infinite language

Λ, ab, aabb, aaaaabbbbb ∈ L
abb ∉ L
47
Operations on Languages
The usual set operations

{a, ab, aaaa} ∪ {bb, ab} = {a, ab, bb, aaaa}
{a, ab, aaaa} ∩ {bb, ab} = {ab}
{a, ab, aaaa} − {bb, ab} = {a, aaaa}
Complement: complement(L) = Σ* − L
complement({a, ba}) = {Λ, b, aa, ab, bb, aaa, …}
48
Reverse
Definition: L^R = {w^R : w ∈ L}

Examples: {ab, aab, baba}^R = {ba, baa, abab}

L = {a^n b^n : n ≥ 0}
L^R = {b^n a^n : n ≥ 0}

49
Concatenation
Definition: L1L2 = {xy : x ∈ L1, y ∈ L2}

Example: {a, ab, ba}{b, aa} = {ab, aaa, abb, abaa, bab, baaa}

50
Another Operation
Definition: L^n = LL…L (L concatenated with itself n times)

{a, b}^3 = {a, b}{a, b}{a, b} = {aaa, aab, aba, abb, baa, bab, bba, bbb}

Special case: L^0 = {Λ}
{a, bba, aaa}^0 = {Λ}
51
More Examples

L = {a^n b^n : n ≥ 0}

L^2 = {a^n b^n a^m b^m : n, m ≥ 0}

aabbaaabbb ∈ L^2
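
For finite languages, reverse, concatenation and powers can be tried out with Python sets. This is a sketch under that assumption (an infinite language such as the a^n b^n example can only be sampled), and the function names are chosen for the example.

```python
def reverse_language(lang):
    """Reverse of a language: reverse every word in it."""
    return {w[::-1] for w in lang}


def concat(l1, l2):
    """Concatenation: every word of l1 followed by every word of l2."""
    return {x + y for x in l1 for y in l2}


def power(lang, n):
    """n-th power of a language; the 0-th power is the set holding only ''."""
    result = {""}
    for _ in range(n):
        result = concat(result, lang)
    return result


print(sorted(reverse_language({"ab", "aab", "baba"})))  # ['abab', 'ba', 'baa']
print(sorted(concat({"a", "ab", "ba"}, {"b", "aa"})))
# ['aaa', 'ab', 'abaa', 'abb', 'baaa', 'bab']
print(len(power({"a", "b"}, 3)))        # 8
print(power({"a", "bba", "aaa"}, 0))    # {''}
```

The empty Python string "" plays the role of the empty string, so the 0-th power is a set of size 1, not the empty set.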

52
Star-Closure (Kleene *)

Definition: L* = L^0 ∪ L^1 ∪ L^2 ∪ …

Example: {a, bb}* = {Λ, a, bb, aa, abb, bba, bbbb, aaa, aabb, abba, abbbb, …}

53
Positive Closure
Definition: L+ = L^1 ∪ L^2 ∪ …

Example: {a, bb}+ = {a, bb, aa, abb, bba, bbbb, aaa, aabb, abba, abbbb, …}
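
Both closures are infinite unions, so a program can only build them up to some bound. The sketch below is illustrative: the name closure_up_to and its include_empty flag are assumptions made for this example, and the result is only the union of the powers up to max_power.

```python
def closure_up_to(lang, max_power, include_empty=True):
    """Union of the powers of lang up to max_power.

    include_empty=True starts from the 0-th power (star closure);
    include_empty=False starts from the 1st power (positive closure).
    """
    layer = {""}  # the 0-th power: only the empty string
    result = {""} if include_empty else set()
    for _ in range(max_power):
        layer = {x + y for x in layer for y in lang}  # next power
        result |= layer
    return result


star = closure_up_to({"a", "bb"}, 2)
plus = closure_up_to({"a", "bb"}, 2, include_empty=False)
print(sorted(star, key=lambda w: (len(w), w)))
# ['', 'a', 'aa', 'bb', 'abb', 'bba', 'bbbb']
print("" in plus)  # False
```

Raising max_power adds longer words such as abbbb, which first appears in the third power.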

54
Defining Languages

55
Languages
 Languages can be defined in different ways, such as a descriptive
definition, a recursive definition, Regular Expressions (RE) and
Finite Automata (FA).

There are two types of languages:
Formal Languages: built from a predefined set of symbols; concerned with syntactic aspects.
Informal Languages: such as English (which has many different versions).

56
Descriptive Definition of Language:
(Method-1)
 The language is defined, describing the conditions imposed on its
words.

57
Examples:
 Example:
The language L of strings of odd length, defined over Σ={a}, can be
written as
L={a, aaa, aaaaa,…..}
 Example:
The language L of strings that do not start with a, defined over
Σ={a,b,c}, can be written as
L={b, c, ba, bb, bc, ca, cb, cc, …}

58
Examples (Cont.):
 Example:
The language L of strings of length 2, defined over Σ={0,1,2}, can be
written as
L={00, 01, 02,10, 11,12,20,21,22}
 Example:
The language L of strings ending in 0, defined over Σ ={0,1}, can be
written as
L={0,00,10,000,010,100,110,…}

59
Examples (Cont.):
 Example: The language EQUAL, of strings with number of a’s equal to
number of b’s, defined over Σ={a,b}, can be written as

{Λ ,ab,aabb,abab,baba,abba,…}
 Example: The language EVEN-EVEN, of strings with even number of
a’s and even number of b’s, defined over Σ={a,b}, can be written as
{Λ, aa, bb, aaaa,aabb,abab, abba, baab, baba, bbaa, bbbb,…}
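
A descriptive definition translates naturally into a membership test: the condition imposed on the words becomes a predicate. The sketch below does this for EQUAL and EVEN-EVEN; the function names are assumptions made for this example.

```python
def over_ab(w):
    """True if w uses only the letters a and b."""
    return set(w) <= {"a", "b"}


def in_equal(w):
    """EQUAL: number of a's equals number of b's."""
    return over_ab(w) and w.count("a") == w.count("b")


def in_even_even(w):
    """EVEN-EVEN: even number of a's and even number of b's."""
    return over_ab(w) and w.count("a") % 2 == 0 and w.count("b") % 2 == 0


print(in_equal("abba"), in_equal("aab"))         # True False
print(in_even_even("baab"), in_even_even("ab"))  # True False
print(in_equal(""), in_even_even(""))            # True True
```

The empty Python string stands for the empty string, which satisfies both conditions because it contains zero a's and zero b's.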

60
Examples (Cont.):
 Example: The language INTEGER, of strings defined over
Σ={-,0,1,2,3,4,5,6,7,8,9}, can be written as
INTEGER = {…,-2,-1,0,1,2,…}
 Example: The language EVEN, of strings defined over
Σ={-,0,1,2,3,4,5,6,7,8,9}, can be written as
EVEN = { …,-4,-2,0,2,4,…}

61
Examples (Cont.):
 Example: The language {a^n b^n}, of strings defined over Σ={a,b}, as
{a^n b^n : n=1,2,3,…}, can be written as
{ab, aabb, aaabbb,aaaabbbb,…}

 Example: The language {a^n b^n a^n}, of strings defined over Σ={a,b}, as
{a^n b^n a^n : n=1,2,3,…}, can be written as
{aba, aabbaa, aaabbbaaa,aaaabbbbaaaa,…}

62
Examples (Cont.):
 Example: The language factorial, of strings defined over
Σ={0,1,2,3,4,5,6,7,8,9}, i.e.
{1,2,6,24,120,…}
 Example: The language FACTORIAL, of strings defined over Σ={a}, as
{a^(n!) : n=1,2,3,…}, can be written as
{a,aa,aaaaaa,…}. Note that the language FACTORIAL can
be defined over any single-letter alphabet.

63
Examples (Cont.):
 Example: The language DOUBLEFACTORIAL, of strings defined over
Σ={a, b}, as
{a^(n!) b^(n!) : n=1,2,3,…}, can be written as
{ab, aabb, aaaaaabbbbbb,…}
 Example: The language SQUARE, of strings defined over Σ={a}, as
{a^(n^2) : n=1,2,3,…}, can be written as
{a, aaaa, aaaaaaaaa,…}

64
Examples (Cont.):
 Example: The language DOUBLESQUARE, of strings defined over
Σ={a,b}, as
{a^(n^2) b^(n^2) : n=1,2,3,…}, can be written as
{ab, aaaabbbb, aaaaaaaaabbbbbbbbb,…}

65
Examples (Cont.):
 Example: The language PRIME, of strings defined over Σ={a}, as
{a^p : p is prime}, can be written as
{aa,aaa,aaaaa,aaaaaaa,aaaaaaaaaaa…}
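
For languages over a single letter, such as PRIME, membership depends only on the length of the string. The sketch below makes this explicit; the function names and the simple trial-division primality test are choices made for this example.

```python
def is_prime(n):
    """Trial division; adequate for the short strings used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def in_prime_language(w):
    """PRIME: strings of a's whose length is a prime number."""
    return set(w) <= {"a"} and is_prime(len(w))


print([len(w) for w in ("a" * k for k in range(12)) if in_prime_language(w)])
# [2, 3, 5, 7, 11]
```

The same pattern would work for SQUARE or FACTORIAL by swapping is_prime for a test on perfect squares or factorials.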

66
An Important language

 PALINDROME:
The language consisting of Λ and the strings s defined over Σ such
that Rev(s)=s.
Note that the words of PALINDROME are called
palindromes.
 Example: For Σ={a,b},
PALINDROME={Λ , a, b, aa, bb, aaa, aba, bab, bbb, ...}
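
The condition Rev(s)=s is easy to test, and the test can be used to list the shortest palindromes. This is a sketch; the function names is_palindrome and palindromes_up_to are assumptions made for this example.

```python
from itertools import product


def is_palindrome(s):
    """True if the reverse of s equals s; the empty string qualifies."""
    return s == s[::-1]


def palindromes_up_to(sigma, max_len):
    """All palindromes over sigma of length 0..max_len, shortest first."""
    found = []
    for k in range(max_len + 1):
        for letters in product(sorted(sigma), repeat=k):
            w = "".join(letters)
            if is_palindrome(w):
                found.append(w)
    return found


print(palindromes_up_to({"a", "b"}, 3))
# ['', 'a', 'b', 'aa', 'bb', 'aaa', 'aba', 'bab', 'bbb']
```

The first entry of the output is the empty Python string, which stands for the empty string Λ.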

67
Recursive Definition of Language:
(Method-2) The following three steps are used in a recursive definition:

 Some basic words are specified in the language.
 Rules for constructing more words from words already in the language are given.
 No strings except those constructed above are allowed to be in the
language.

68
Example
 Defining language of INTEGER
Step 1:
1 is in INTEGER.
Step 2:
If x is in INTEGER then x+1 and x-1 are also in INTEGER.
Step 3:
No strings except those constructed in above, are allowed to be in INTEGER.

69
Example
 Defining language of EVEN
Step 1:
2 is in EVEN.
Step 2:
If x is in EVEN then x+2 and x-2 are also in EVEN.
Step 3:
No strings except those constructed in above, are allowed to be in EVEN.

70
Example
 Defining the language factorial
Step 1:
Since 0!=1, 1 is in factorial.
Step 2:
If (n-1)! is in factorial, then n!=n*(n-1)! is also in factorial, for n=1,2,3,…
Step 3:
No strings except those constructed in above, are allowed to be in factorial.

71
Example
 Defining the language PALINDROME, defined over Σ = {a,b}
Step 1:
Λ, a and b are in PALINDROME
Step 2:
if x is in PALINDROME, then s(x)Rev(s) and xx are also in PALINDROME, where s
belongs to Σ*
Step 3:
No strings except those constructed above are allowed to be in
PALINDROME

72
Example
 Defining the language {a^n b^n}, n=1,2,3,… , of strings defined over Σ={a,b}
Step 1:
ab is in {a^n b^n}
Step 2:
if x is in {a^n b^n}, then axb is in {a^n b^n}
Step 3:
No strings except those constructed above are allowed to be in {a^n b^n}
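
A recursive definition is in effect a recipe for generating words: start from the basic word and apply the rule repeatedly. The sketch below does this for the language above; the function name anbn_words is an assumption made for this example.

```python
def anbn_words(count):
    """First `count` words obtained from the base word ab and the rule x -> axb."""
    words = []
    x = "ab"               # Step 1: the basic word
    for _ in range(count):
        words.append(x)
        x = "a" + x + "b"  # Step 2: if x is in the language, so is axb
    return words


print(anbn_words(4))  # ['ab', 'aabb', 'aaabbb', 'aaaabbbb']
```

Step 3 is reflected in the fact that nothing else is ever appended to the list.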

73
Example
 Defining the language L, of strings ending in a , defined over Σ={a,b}
Step 1:
a is in L
Step 2:
if x is in L then s(x) is also in L, where s belongs to Σ*
Step 3:
No strings except those constructed in above, are allowed to be in L

74
Example
 Defining the language L, of strings beginning and ending in same letters ,
defined over Σ={a, b}
Step 1:
a and b are in L
Step 2:
(a)s(a) and (b)s(b) are also in L, where s belongs to Σ*
Step 3:
No strings except those constructed in above, are allowed to be in L

75
Example
 Defining the language L, of strings containing aa or bb , defined over
Σ={a, b}
Step 1:
aa and bb are in L
Step 2:
s(aa)t and s(bb)t are also in L, where s and t belong to Σ*
Step 3:
No strings except those constructed in above, are allowed to be in L
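
A recursive definition can be checked against the descriptive one on all short strings. The sketch below assumes that the string before aa or bb and the string after it may differ (written s and t in the code); the function names are chosen for this example.

```python
from itertools import product


def strings_up_to(sigma, max_len):
    """All strings over sigma of length 0..max_len (empty list if max_len < 0)."""
    return ["".join(p) for k in range(max_len + 1)
            for p in product(sorted(sigma), repeat=k)]


def built_by_rule(max_len):
    """Words s+aa+t and s+bb+t of length <= max_len, with s, t over {a, b}."""
    pieces = strings_up_to({"a", "b"}, max_len - 2)
    return {s + core + t
            for core in ("aa", "bb")
            for s in pieces
            for t in pieces
            if len(s) + 2 + len(t) <= max_len}


def described(max_len):
    """Descriptive definition: strings containing aa or bb."""
    return {w for w in strings_up_to({"a", "b"}, max_len)
            if "aa" in w or "bb" in w}


print(built_by_rule(4) == described(4))  # True
print("aab" in built_by_rule(4))         # True, with s empty and t = b
```

If s and t were forced to be the same string, a word such as aab could not be built, although it clearly contains aa.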

76
