
Languages and Grammars

MSU CSE 260


Outline
Introduction: Example
Phrase-Structure Grammars: Terminology, Definition,
Derivation, Language of a Grammar, Examples
Exercise 10.1 (1)
Types of Phrase-Structure Grammars
Derivation Trees: Example, Parsing
Exercise 10.1 (2, 3)
Backus-Naur Form
Introduction
In the English language, the grammar determines
whether a combination of words is a valid
sentence.
Are the following valid sentences?
The large rabbit hops quickly. Yes
The frog writes neatly. Yes
Swims quickly mathematician. No
Grammars are concerned with the syntax (form) of
a sentence, and NOT its semantics (meaning).
English Grammar
Sentence: noun phrase followed by verb phrase;
Noun phrase: article adjective noun, or article
noun;
Verb phrase: verb adverb, or verb;
Article: a, or the;
Adjective: large, or hungry;
Noun: rabbit, or mathematician, or frog;
Verb: eats, or hops, or writes, or swims;
Adverb: quickly, or wildly, or neatly;
Example
sentence
noun phrase verb phrase
article adjective noun verb phrase
article adjective noun verb adverb
the adjective noun verb adverb
the large noun verb adverb
the large rabbit verb adverb
the large rabbit hops adverb
the large rabbit hops quickly
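The derivation above can be mechanized. The following is a minimal sketch, assuming the grammar is stored as a Python dictionary from each nonterminal to its alternatives; the names `GRAMMAR`, `expand`, and `SENTENCES` are illustrative and not part of the slides.

```python
from itertools import product

# Productions of the toy English grammar from the slides.
# Keys are nonterminals; each alternative is a sequence of symbols.
GRAMMAR = {
    "sentence": [["noun phrase", "verb phrase"]],
    "noun phrase": [["article", "adjective", "noun"], ["article", "noun"]],
    "verb phrase": [["verb", "adverb"], ["verb"]],
    "article": [["a"], ["the"]],
    "adjective": [["large"], ["hungry"]],
    "noun": [["rabbit"], ["mathematician"], ["frog"]],
    "verb": [["eats"], ["hops"], ["writes"], ["swims"]],
    "adverb": [["quickly"], ["wildly"], ["neatly"]],
}

def expand(symbol):
    """Return every list of terminal words derivable from `symbol`."""
    if symbol not in GRAMMAR:          # a terminal derives only itself
        return [[symbol]]
    results = []
    for alternative in GRAMMAR[symbol]:
        # Expand each symbol of the alternative, then combine the pieces.
        parts = [expand(s) for s in alternative]
        for combo in product(*parts):
            results.append([word for piece in combo for word in piece])
    return results

# The (finite) language generated by this grammar, as a set of sentences.
SENTENCES = {" ".join(words) for words in expand("sentence")}
```

Under this sketch, a membership test such as `"the large rabbit hops quickly" in SENTENCES` mirrors the question of whether a sentence is valid. Exhaustive enumeration only works here because this grammar has no recursion and therefore generates finitely many sentences.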
Grammars and Computation
Grammars are used as a model of
computation.
Grammars are used to:
generate the words of a language, and
determine whether a word is in a language.
Phrase-Structure Grammars
Terminology
Definitions. A vocabulary (or alphabet) V is a
finite, nonempty set of elements called symbols.
A word (or sentence) over V is a string of finite
length of elements of V.
The empty string (or null string), denoted by λ, is
the string containing no symbols.
The set of all words over V is denoted by V*.
A language over V is a subset of V*.
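To make V* concrete, here is a small sketch that lists the words over a vocabulary up to a given length. V* itself is infinite, so the length bound is an assumption of the example; the function name `words_up_to` is hypothetical.

```python
from itertools import product

def words_up_to(vocabulary, max_length):
    """List all words over `vocabulary` of length 0..max_length, shortest first."""
    symbols = sorted(vocabulary)
    words = []
    for n in range(max_length + 1):
        for combo in product(symbols, repeat=n):
            words.append("".join(combo))   # n == 0 yields the empty string
    return words

V = {"0", "1"}
sample = words_up_to(V, 2)
# A language over V is any subset of V*; here, the words that end in 0.
ends_in_zero = [w for w in sample if w.endswith("0")]
```

The first entry of `sample` is the empty string, which is a word over every vocabulary.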
Phrase-Structure Grammars
A language can be specified by:
listing all the words in the language, or
giving a set of criteria satisfied by its words, or
using a grammar.
A grammar provides:
a set of symbols, and
a set of rules, called productions, for producing words
by replacing strings by other strings: w₀ → w₁.
Phrase-Structure Grammar
Definition
A phrase-structure grammar G = (V, T, S, P)
consists of:
a vocabulary V,
a subset T of V consisting of terminal elements,
a start symbol S from V, and
a set P of productions.
The set N = V-T consists of nonterminal symbols.
Every production in P must contain at least one
nonterminal on its left side.
Phrase-structure Grammar
Example
G = (V, T, S, P), where
V = {a, b, A, B, S},
T = {a, b},
S is the start symbol, and
P = { S → ABa,
A → BB,
B → ab,
AB → b}.
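The 4-tuple definition maps naturally onto a small data structure. The sketch below is one possible encoding, assuming every symbol is a single character and the first production reads S → ABa; the class name `Grammar` and its field names are illustrative.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Grammar:
    vocabulary: frozenset      # V
    terminals: frozenset       # T, a subset of V
    start: str                 # S, an element of V
    productions: tuple         # P, as (left, right) pairs of strings

    @property
    def nonterminals(self):
        return self.vocabulary - self.terminals    # N = V - T

    def is_valid(self):
        """Check the conditions stated in the definition."""
        n = self.nonterminals
        return (
            self.terminals <= self.vocabulary
            and self.start in self.vocabulary
            # every symbol used in a production belongs to V
            and all(set(l + r) <= self.vocabulary for l, r in self.productions)
            # every left side contains at least one nonterminal
            and all(any(s in n for s in l) for l, _ in self.productions)
        )

# The example grammar (assumption: single-character symbols).
G = Grammar(
    vocabulary=frozenset("abABS"),
    terminals=frozenset("ab"),
    start="S",
    productions=(("S", "ABa"), ("A", "BB"), ("B", "ab"), ("AB", "b")),
)
```

A production such as ab → S would make `is_valid()` return `False`, because its left side contains no nonterminal.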
Phrase-Structure Grammars
Derivation
Definition.
Let G = (V, T, S, P) be a phrase-structure grammar.
Let w₀ = lz₀r and w₁ = lz₁r be strings over V.
If z₀ → z₁ is a production of G, we say that:
w₁ is directly derivable from w₀ (denoted: w₀ ⇒ w₁).
If w₀, w₁, …, wₙ are strings over V such that:
w₀ ⇒ w₁, w₁ ⇒ w₂, …, wₙ₋₁ ⇒ wₙ, we say that:
wₙ is derivable from w₀ (denoted: w₀ ⇒* wₙ).
Note. The * is conventionally written above the ⇒.
The sequence of all steps used to obtain wₙ from w₀ is
called a derivation.
Example
In the previous example grammar, the
production B → ab makes the string Aaba
directly derivable from the string ABa:
ABa ⇒ Aaba
Also Aaba ⇒ BBaba ⇒ Bababa ⇒ abababa
using: A → BB, B → ab, and B → ab.
So: ABa ⇒* abababa
abababa is derivable from ABa.
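The two relations can be sketched in code. This is an illustration, not part of the slides: `directly_derivable` applies one production at one position (the l z₀ r decomposition), and `derivable` searches repeated steps breadth-first. The length bound `max_length` is an assumption added to keep the search finite, so a `False` result only means no derivation exists within that bound.

```python
from collections import deque

# Productions of the example grammar, as (z0, z1) pairs (S -> ABa assumed).
PRODUCTIONS = [("S", "ABa"), ("A", "BB"), ("B", "ab"), ("AB", "b")]

def directly_derivable(w0, productions=PRODUCTIONS):
    """Return every w1 with w0 => w1: rewrite one occurrence of z0 as z1."""
    results = set()
    for z0, z1 in productions:
        start = w0.find(z0)
        while start != -1:
            l, r = w0[:start], w0[start + len(z0):]   # w0 = l z0 r
            results.add(l + z1 + r)                   # w1 = l z1 r
            start = w0.find(z0, start + 1)
    return results

def derivable(w0, target, productions=PRODUCTIONS, max_length=12):
    """Breadth-first search for w0 =>* target, ignoring overly long strings."""
    seen, queue = {w0}, deque([w0])
    while queue:
        w = queue.popleft()
        if w == target:
            return True
        for nxt in directly_derivable(w, productions):
            if nxt not in seen and len(nxt) <= max_length:
                seen.add(nxt)
                queue.append(nxt)
    return False
```

For instance, `directly_derivable("ABa")` contains `"Aaba"`, and `derivable("ABa", "abababa")` finds the chain of steps shown in the example.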
Language of a Grammar
Definition.
Let G = (V, T, S, P) be a phrase-structure
grammar.
The language generated by G (or the
language of G), denoted by L(G), is the set
of all strings of terminals that are derivable
from the start symbol S.
L(G) = {w ∈ T* | S ⇒* w}.
Example
Let G = (V, T, S, P) be the grammar where:
V = {S, 0, 1},
T = {0, 1},
P = { S → 11S,
S → 0}.
What is L(G)?
At any stage of the derivation we can either:
add two 1s at the end of the string, or
terminate the derivation by adding a 0 at the end of the string.
L(G) = {0, 110, 11110, 1111110, …} = Set of all strings
that begin with an even number of 1s and end with 0.
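A short sketch can confirm the first few elements of L(G) by exhaustive search. It assumes that no production shortens a string (true for this grammar), so sentential forms longer than the bound can be discarded; the function name `language_up_to` is hypothetical.

```python
from collections import deque

def language_up_to(start, productions, terminals, max_length):
    """Collect the terminal strings of length <= max_length derivable from `start`.

    Assumes no production shortens a string, so longer sentential forms
    can be discarded safely.
    """
    found, seen, queue = set(), {start}, deque([start])
    while queue:
        w = queue.popleft()
        if all(ch in terminals for ch in w):
            found.add(w)               # a word of the language
            continue
        for left, right in productions:
            i = w.find(left)
            while i != -1:
                nxt = w[:i] + right + w[i + len(left):]
                if len(nxt) <= max_length and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
                i = w.find(left, i + 1)
    return sorted(found, key=len)

# Productions S -> 11S and S -> 0, searched up to length 7.
words = language_up_to("S", [("S", "11S"), ("S", "0")], {"0", "1"}, 7)
```

With a bound of 7, `words` holds the four strings listed above.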
Exercise 10.1 (1)

Types of Grammars
A type 0 (phrase-structure) grammar has no
restrictions on its productions.
A type 1 (or context-sensitive) grammar has
productions only of forms:
w₁ → w₂ with length of w₂ ≥ length of w₁, or
w₁ → λ.
A type 2 (or context-free) grammar has
productions only of the form A → w₂, where A is a
single nonterminal symbol.
Types of Grammars cont.
A type 3 (or regular) grammar has productions
only of the form:
A → aB, or A → a, where
A and B are nonterminal symbols, and
a is a terminal symbol, or
S → λ.
Note.
Every type 3 grammar is a type 2 grammar
Every type 2 grammar is a type 1 grammar
Every type 1 grammar is a type 0 grammar
Types of Grammars - Summary
Type  Restrictions on productions w₁ → w₂
0     No restrictions
1     l(w₁) ≤ l(w₂), or w₂ = λ
2     w₁ = A, where A ∈ N
3     w₁ = A, and w₂ = aB or w₂ = a, where A ∈ N, B ∈ N, a ∈ T; or w₁ = S and w₂ = λ
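The restrictions in the summary can be checked mechanically. The sketch below is an illustration that follows the definitions exactly as stated in these slides, assuming single-character symbols and the empty Python string `""` standing for λ; `grammar_type` is a hypothetical name.

```python
def grammar_type(productions, nonterminals, terminals, start="S"):
    """Return the largest type number (3, 2, 1 or 0) whose restrictions hold.

    `productions` is a list of (w1, w2) string pairs; "" stands for lambda.
    """
    nonterminals, terminals = set(nonterminals), set(terminals)

    def is_type3(w1, w2):
        if w1 == start and w2 == "":
            return True                    # S -> lambda
        if w1 not in nonterminals:
            return False
        if len(w2) == 1:
            return w2 in terminals         # A -> a
        # A -> aB
        return len(w2) == 2 and w2[0] in terminals and w2[1] in nonterminals

    def is_type2(w1, w2):
        return w1 in nonterminals          # single nonterminal on the left

    def is_type1(w1, w2):
        return len(w1) <= len(w2) or w2 == ""

    for number, check in ((3, is_type3), (2, is_type2), (1, is_type1)):
        if all(check(w1, w2) for w1, w2 in productions):
            return number
    return 0
```

For example, the grammar with productions S → 11S and S → 0 is reported as type 2: S → 11S has two terminals before the nonterminal, which the type 3 forms do not allow.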
Derivation Trees
For type 2 (context-free) grammars:
A derivation (or parse) tree, is an ordered rooted
tree that represents a derivation in the language
generated by a context-free grammar, where:
the root represents the starting symbol;
the internal vertices represent nonterminal symbols;
the leaves represent the terminal symbols;
for a production A → w, the vertex representing A will
have children vertices that represent each symbol in w.
Example
Derivation tree for:
the hungry rabbit eats quickly

sentence
├── noun phrase
│   ├── article
│   │   └── the
│   ├── adjective
│   │   └── hungry
│   └── noun
│       └── rabbit
└── verb phrase
    ├── verb
    │   └── eats
    └── adverb
        └── quickly
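One way to hold such a tree in a program is as nested (symbol, children) pairs. This is only a sketch; the names `TREE`, `leaves`, and `productions_used` are illustrative. Reading the leaves left to right recovers the derived sentence, and each internal vertex together with its children corresponds to one production.

```python
# A tree node is (symbol, children); a leaf has an empty list of children.
TREE = ("sentence", [
    ("noun phrase", [
        ("article", [("the", [])]),
        ("adjective", [("hungry", [])]),
        ("noun", [("rabbit", [])]),
    ]),
    ("verb phrase", [
        ("verb", [("eats", [])]),
        ("adverb", [("quickly", [])]),
    ]),
])

def leaves(node):
    """Read the leaves left to right; together they spell the derived word."""
    symbol, children = node
    if not children:
        return [symbol]
    return [leaf for child in children for leaf in leaves(child)]

def productions_used(node):
    """List the production applied at each internal vertex, in preorder."""
    symbol, children = node
    if not children:
        return []
    used = [(symbol, [child[0] for child in children])]
    for child in children:
        used.extend(productions_used(child))
    return used
```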
Exercise 10.1 (2, 3)

Parsing
To determine whether a string is in the
language generated by a grammar, use:
Top-down parsing:
Begin with S and attempt to derive the word by
successively applying productions, or
Bottom-up parsing:
Work backward: Begin by inspecting the word and
apply productions backward.
Example
Let G = (V, T, S, P) be the grammar where:
V = {a, b, c, A, B, C, S}, T = {a, b, c},
Productions:
S → AB
A → Ca
B → Ba
B → Cb
B → b
C → cb
C → b
Determine whether cbab is in L(G).
Top-down parsing:
S ⇒ AB
S ⇒ AB ⇒ CaB
S ⇒ AB ⇒ CaB ⇒ cbaB
S ⇒ AB ⇒ CaB ⇒ cbaB ⇒ cbab
Bottom-up parsing:
Cab ⇒ cbab
Ab ⇒ Cab ⇒ cbab
AB ⇒ Ab ⇒ Cab ⇒ cbab
S ⇒ AB ⇒ Ab ⇒ Cab ⇒ cbab
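The top-down strategy can be sketched as a breadth-first search over derivations that always rewrites the leftmost nonterminal. This is an illustration rather than a general parsing algorithm: it relies on the assumption that no production of this grammar shortens a string, so any sentential form longer than the target word can be discarded. The name `top_down_parse` is hypothetical.

```python
from collections import deque

PRODUCTIONS = [("S", "AB"), ("A", "Ca"), ("B", "Ba"), ("B", "Cb"),
               ("B", "b"), ("C", "cb"), ("C", "b")]
NONTERMINALS = set("SABC")

def top_down_parse(word, start="S"):
    """Search derivations from `start`; return one that yields `word`, or None.

    Always rewrites the leftmost nonterminal. Pruning by length is safe here
    because no production of this grammar shortens a string.
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        derivation = queue.popleft()
        current = derivation[-1]
        if current == word:
            return derivation
        # Position of the leftmost nonterminal, if any remains.
        i = next((k for k, ch in enumerate(current) if ch in NONTERMINALS), None)
        if i is None:
            continue
        # The terminal prefix before position i must already match the word.
        if not word.startswith(current[:i]):
            continue
        for left, right in PRODUCTIONS:
            if current[i] == left:
                nxt = current[:i] + right + current[i + 1:]
                if len(nxt) <= len(word) and nxt not in seen:
                    seen.add(nxt)
                    queue.append(derivation + [nxt])
    return None
```

Calling `top_down_parse("cbab")` returns the list of sentential forms of a derivation, while a word outside L(G) yields `None`. A naive recursive-descent parser would loop on the left-recursive production B → Ba, which is why this sketch uses a bounded search instead.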
Backus-Naur Form
Used with type 2 (context-free) grammars, for example to
specify the syntax of programming languages:
Use ::= instead of →
Enclose nonterminal symbols within < >
Group productions with same left side with symbol |
Example.
<signed integer> ::= <sign><integer>
<sign> ::= + | -
<integer> ::= <digit> | <digit><integer>
<digit> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
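Each BNF rule can be turned into one small function. The following is a minimal recursive-descent sketch for the signed-integer grammar above; the function names are illustrative, and each parser returns the index just past what it consumed, or `None` on failure.

```python
DIGITS = set("0123456789")

def parse_digit(s, i):
    """<digit> ::= 0 | 1 | ... | 9. Return the index after the digit, or None."""
    return i + 1 if i < len(s) and s[i] in DIGITS else None

def parse_integer(s, i):
    """<integer> ::= <digit> | <digit><integer>. Consume as many digits as possible."""
    j = parse_digit(s, i)
    if j is None:
        return None
    rest = parse_integer(s, j)         # try the alternative <digit><integer>
    return rest if rest is not None else j

def parse_sign(s, i):
    """<sign> ::= + | -"""
    return i + 1 if i < len(s) and s[i] in "+-" else None

def is_signed_integer(s):
    """<signed integer> ::= <sign><integer>; the whole string must be consumed."""
    i = parse_sign(s, 0)
    if i is None:
        return False
    return parse_integer(s, i) == len(s)
```

Under this grammar the sign is mandatory, so `is_signed_integer("+42")` holds while `is_signed_integer("42")` does not. The recursion in `parse_integer` mirrors the recursive rule directly; for very long inputs a loop would avoid Python's recursion limit.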
