LR Grammars

LR-Grammars
LR(0), LR(1), and LR(k)
Deterministic Context-Free Languages

DCFL: a family of languages that are accepted by a Deterministic Pushdown Automaton (DPDA)
Many programming languages can be described by means of DCFLs
Prefix and Proper Prefix

Prefix (of a string)

Any number of leading symbols of that string
Example: abc

Prefixes: ε, a, ab, abc
Proper Prefix (of a string)

A prefix of a string, but not the string itself
Example: abc

Proper prefixes: ε, a, ab
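
The two definitions above can be sketched in a few lines of Python. This is only an illustration; the function names `prefixes` and `proper_prefixes` are chosen here and do not come from the slides. The empty string '' plays the role of ε.

```python
def prefixes(s):
    """Return every prefix of s, from the empty string up to s itself."""
    return [s[:i] for i in range(len(s) + 1)]


def proper_prefixes(s):
    """Return every prefix of s except s itself."""
    return prefixes(s)[:-1]


print(prefixes("abc"))         # ['', 'a', 'ab', 'abc']
print(proper_prefixes("abc"))  # ['', 'a', 'ab']
```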
Prefix Property

A Context-Free Language (CFL) L is said to have the prefix property if, whenever w is in L, no proper prefix of w is in L
Not considered a severe restriction

Why?

Because we can easily convert a DCFL to a DCFL with the prefix property by introducing an endmarker
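
A minimal sketch of the endmarker idea, assuming a small finite language given as a set of Python strings and an endmarker `$` that does not occur in any of them. The helper names `has_prefix_property` and `add_endmarker` are hypothetical.

```python
def has_prefix_property(language):
    """True if no string of the (finite) language is a proper prefix of another."""
    words = set(language)
    return not any(w[:i] in words for w in words for i in range(len(w)))


def add_endmarker(language, marker="$"):
    """Append the endmarker to every string (assumes marker is a fresh symbol)."""
    return {w + marker for w in language}


sample = {"a", "ab", "abc"}
print(has_prefix_property(sample))                 # False: a is a proper prefix of ab
print(has_prefix_property(add_endmarker(sample)))  # True
```

Once every string ends in `$`, a proper prefix of a string never contains `$` and so can never be in the marked language.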
Suffix and Proper Suffix

Suffix (of a string)

Any number of trailing symbols of that string
Proper Suffix (of a string)

A suffix of a string, but not the string itself
Example Grammar

This is the grammar that will be used in many of the examples:

S' → Sc
S → SA | A
A → aSb | ab
LR-Grammar

Left-to-right scan of the input producing a rightmost derivation (in reverse)
Simply:

L stands for Left-to-right
R stands for rightmost derivation
LR-Items

An item (for a given CFG)

A production with a dot anywhere in the right side (including the beginning and end)
In the event of an ε-production B → ε:

B → · is an item
Example: Items

Given our example grammar:

S' → Sc
S → SA | A
A → aSb | ab
The items for the grammar are:
S' → ·Sc, S' → S·c, S' → Sc·
S → ·SA, S → S·A, S → SA·
S → ·A, S → A·
A → ·aSb, A → a·Sb, A → aS·b, A → aSb·
A → ·ab, A → a·b, A → ab·
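
The item list can be generated mechanically: one item per dot position in each production. The sketch below assumes the example grammar is written as (head, body) pairs with S' as the start symbol; the representation of an item as a (head, body, dot) triple and the names `items` and `show` are illustrative choices, not part of the slides.

```python
# Example grammar: S' -> Sc, S -> SA | A, A -> aSb | ab
GRAMMAR = [
    ("S'", ("S", "c")),
    ("S", ("S", "A")),
    ("S", ("A",)),
    ("A", ("a", "S", "b")),
    ("A", ("a", "b")),
]


def items(grammar):
    """Yield (head, body, dot) for every dot position 0..len(body)."""
    for head, body in grammar:
        for dot in range(len(body) + 1):
            yield head, body, dot


def show(item):
    """Render an item with the dot at its position."""
    head, body, dot = item
    return f"{head} → {''.join(body[:dot])}·{''.join(body[dot:])}"


all_items = list(items(GRAMMAR))
for item in all_items:
    print(show(item))
print(len(all_items))  # 15
```

An ε-production has an empty body, so it contributes exactly one item, with the dot alone on the right side.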
Some Notation

⇒* = zero or more steps in a derivation
⇒*rm = zero or more steps in a rightmost derivation
⇒rm = single step in a rightmost derivation
Right-Sentential Form

A sentential form that can be derived by a rightmost derivation

A string of terminals and variables α is called a sentential form if S ⇒* α
More terms

Handle

A substring which matches the right-hand side of a production and represents 1 step in the derivation
Or more formally:

A handle (of a right-sentential form γ for CFG G) is a substring β such that:

S ⇒*rm δAw ⇒rm δβw
δβw = γ
If the grammar is unambiguous and there are no useless symbols:

The rightmost derivation (of a right-sentential form) and the handle are unique
Example


S' → Sc
S → SA | A
A → aSb | ab
An example rightmost derivation:

S' ⇒ Sc ⇒ SAc ⇒ SaSbc
Therefore we can say that: SaSbc is a right-sentential form

The handle is aSb
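
The derivation above can be replayed step by step. The following sketch assumes sentential forms are lists of single-character symbols (plus S' as the start symbol); `rightmost_step` is a hypothetical helper that rewrites the rightmost variable and reports where the handle sits in the new form.

```python
VARIABLES = {"S'", "S", "A"}


def rightmost_step(form, head, body):
    """Replace the rightmost variable of form (which must be head) by body.

    Returns the new sentential form and the (start, end) span of the handle.
    """
    pos = max(i for i, sym in enumerate(form) if sym in VARIABLES)
    if form[pos] != head:
        raise ValueError("rightmost variable is not " + head)
    new_form = form[:pos] + list(body) + form[pos + 1:]
    return new_form, (pos, pos + len(body))


form = ["S'"]
for head, body in [("S'", "Sc"), ("S", "SA"), ("A", "aSb")]:
    form, (start, end) = rightmost_step(form, head, body)
    print("".join(form), "handle:", "".join(form[start:end]))
# Last line printed: SaSbc handle: aSb
```

The handle of each right-sentential form is exactly the body that the last rightmost step introduced, which is why a parser working in reverse looks for it.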
More terms

Viable Prefix

(of a right-sentential form γ) Is any prefix of γ ending no farther right than the right end of a handle of γ
Complete item

An item where the dot is the rightmost symbol
Example


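S' → Sc
S → SA | A
A → aSb | ab
The right-sentential form abc:

S' ⇒*rm Ac ⇒rm abc
Valid items (A → α·β is valid for viable prefix γ if S ⇒*rm δAw ⇒rm δαβw and δα = γ):
A → ab· for prefix ab
A → a·b for prefix a
A → ·ab for prefix ε
A → ab· is a complete item, ∴ Ac is the previous right-sentential form for abc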

LR(0)

Left-to-right scan of the input producing a rightmost derivation with a look-ahead (on the input) of 0 symbols
It is a restricted type of CFG
1st in the family of LR-grammars
LR(0) grammars define exactly the DCFLs having the prefix property
Computing Sets of Valid Items

The definition of LR(0) and the method of accepting L(G) for LR(0) grammar G by a DPDA depend on:

Knowing the set of valid items for each viable prefix γ
For every CFG G, the set of viable prefixes is a regular set

This regular set is accepted by an NFA whose states are the items for G
Continued

Given an NFA (whose states are the items for G) that accepts the regular set

We can apply the subset construction to this NFA to obtain a DFA
The state this DFA reaches on viable prefix γ is the set of valid items for γ
NFA M

NFA M recognizes the viable prefixes for CFG G

M = (Q, V ∪ T, δ, q0, Q)

Q = set of items for G plus state q0
G = (V, T, P, S)
Three Rules

δ(q0, ε) = {S → ·α | S → α is a production}
δ(A → α·Bβ, ε) = {B → ·γ | B → γ is a production}

Allows expansion of a variable B appearing immediately to the right of the dot
δ(A → α·Xβ, X) = {A → αX·β}

Permits moving the dot over any grammar symbol X if X is the next input symbol
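
The three rules can be written directly as a transition function. This is a sketch for the example grammar only, assuming items are (head, body, dot) triples, the extra start state is the string "q0", and ε is represented by the empty string; none of these names come from the slides.

```python
GRAMMAR = [
    ("S'", ("S", "c")),
    ("S", ("S", "A")),
    ("S", ("A",)),
    ("A", ("a", "S", "b")),
    ("A", ("a", "b")),
]
START = "S'"
Q0 = "q0"
EPS = ""  # stands for ε


def delta(state, symbol):
    """Transition function of the item NFA."""
    if state == Q0:
        # Rule 1: from q0, move on ε to every start production with the dot at the left.
        if symbol == EPS:
            return {(h, b, 0) for h, b in GRAMMAR if h == START}
        return set()
    head, body, dot = state
    if dot == len(body):
        return set()  # complete item: no moves
    nxt = body[dot]
    if symbol == EPS:
        # Rule 2: expand the variable right after the dot (empty if nxt is a terminal).
        return {(h, b, 0) for h, b in GRAMMAR if h == nxt}
    if symbol == nxt:
        # Rule 3: move the dot over the matching grammar symbol.
        return {(head, body, dot + 1)}
    return set()


print(delta(Q0, EPS))
print(delta(("A", ("a", "b"), 1), "b"))
```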

Theorem 10.9

The NFA M has the property that δ(q0, γ) contains A → α·β iff A → α·β is valid for γ
This theorem gives a method for computing the sets of valid items for any viable prefix

Note: It is an NFA. It can be converted to a DFA. Then, by inspecting each state of the DFA, it can be determined whether the grammar is a valid LR(0) grammar
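
As a sketch of how the theorem is used, the code below follows a prefix through the NFA by tracking sets of items (the subset construction done on the fly). It assumes the example grammar with single-character symbols; `closure` applies the ε-moves, `goto` moves the dot, and `valid_items` is a hypothetical wrapper. The extra state q0 is left out because it only feeds the initial closure.

```python
GRAMMAR = [
    ("S'", ("S", "c")),
    ("S", ("S", "A")),
    ("S", ("A",)),
    ("A", ("a", "S", "b")),
    ("A", ("a", "b")),
]


def closure(items):
    """Add B → ·γ for every variable B that appears right after a dot."""
    result, todo = set(items), list(items)
    while todo:
        head, body, dot = todo.pop()
        if dot < len(body):
            for h, b in GRAMMAR:
                if h == body[dot] and (h, b, 0) not in result:
                    result.add((h, b, 0))
                    todo.append((h, b, 0))
    return frozenset(result)


def goto(state, symbol):
    """Move the dot over symbol in every item that allows it, then close."""
    moved = {(h, b, d + 1) for h, b, d in state if d < len(b) and b[d] == symbol}
    return closure(moved)


def valid_items(prefix, start="S'"):
    """Set of items valid for prefix (empty if prefix is not viable)."""
    state = closure({(h, b, 0) for h, b in GRAMMAR if h == start})
    for symbol in prefix:
        state = goto(state, symbol)
    return state


def show(item):
    head, body, dot = item
    return f"{head} → {''.join(body[:dot])}·{''.join(body[dot:])}"


for prefix in ["", "a", "ab"]:
    print(repr(prefix), sorted(show(i) for i in valid_items(prefix)))
```

For the prefix ab the only valid item is A → ab·, for a the set contains A → a·b, and for the empty prefix it contains A → ·ab, matching the earlier example.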
Definition of LR(0) Grammar

G is an LR(0) grammar if

The start symbol does not appear on the right side of any production
For every viable prefix γ of G, if A → α· is a complete item valid for γ, then it is unique

i.e., there are no other complete items (and there are no items with a terminal to the right of the dot) that are valid for γ
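
The definition suggests a direct test: build every DFA state and inspect it. The sketch below is one way to do that under the assumption that the grammar has no useless symbols (so every non-empty reachable state corresponds to a viable prefix). The names `lr0_states` and `is_lr0` are hypothetical.

```python
def lr0_states(grammar, start):
    """All non-empty sets of items reachable in the DFA."""
    def closure(items):
        result, todo = set(items), list(items)
        while todo:
            head, body, dot = todo.pop()
            if dot < len(body):
                for h, b in grammar:
                    if h == body[dot] and (h, b, 0) not in result:
                        result.add((h, b, 0))
                        todo.append((h, b, 0))
        return frozenset(result)

    def goto(state, symbol):
        return closure({(h, b, d + 1) for h, b, d in state
                        if d < len(b) and b[d] == symbol})

    symbols = {s for _, body in grammar for s in body}
    initial = closure({(h, b, 0) for h, b in grammar if h == start})
    states, todo = {initial}, [initial]
    while todo:
        state = todo.pop()
        for symbol in symbols:
            nxt = goto(state, symbol)
            if nxt and nxt not in states:
                states.add(nxt)
                todo.append(nxt)
    return states


def is_lr0(grammar, start):
    variables = {h for h, _ in grammar}
    # Condition 1: the start symbol is not on any right side.
    if any(start in body for _, body in grammar):
        return False
    # Condition 2: a complete item must be alone with respect to other
    # complete items and items with a terminal right after the dot.
    for state in lr0_states(grammar, start):
        complete = [i for i in state if i[2] == len(i[1])]
        shifts = [i for i in state
                  if i[2] < len(i[1]) and i[1][i[2]] not in variables]
        if complete and (len(complete) > 1 or shifts):
            return False
    return True


GRAMMAR = [("S'", ("S", "c")), ("S", ("S", "A")), ("S", ("A",)),
           ("A", ("a", "S", "b")), ("A", ("a", "b"))]
print(len(lr0_states(GRAMMAR, "S'")), is_lr0(GRAMMAR, "S'"))  # 9 True
```

A grammar for {a, ab}, such as S' → S, S → a | ab, fails the test: after reading a the state holds both S → a· and S → a·b. That fits the fact that {a, ab} lacks the prefix property.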
Facts we now know:

Every LR(0) grammar generates a DCFL
Every DCFL with the prefix property has an LR(0) grammar
Every language with an LR(0) grammar has the prefix property
L is a DCFL iff L$ has an LR(0) grammar ($ is an endmarker)
DPDAs from LR(0) Grammars

We trace out the rightmost derivation in reverse
The stack holds a viable prefix (of a right-sentential form) and the current state (of the DFA)

Viable prefix: X1X2…Xk
States: s0, s1, …, sk
Stack: s0 X1 s1 … Xk sk
Reduction

If sk contains A → α·

Then A → α· is valid for X1X2…Xk
α = suffix of X1X2…Xk, say α = Xi+1…Xk
Let w be such that X1…Xk w is a right-sentential form
Reduction Continued

There is a derivation:

S ⇒*rm X1…Xi A w ⇒rm X1…Xk w
To obtain the previous right-sentential form (X1…Xi A w) in the rightmost derivation we reduce α to A

Therefore, we pop Xi+1…Xk from the stack and push A onto the stack
Shift

If sk contains only incomplete items

Then the right-sentential form previous to X1…Xk w cannot be formed using a reduction
Instead we simply shift the next input symbol onto the stack
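
The shift and reduce moves can be put together into a small recognizer. This is a sketch for the example grammar, not the DPDA construction of the theorem itself: it assumes the grammar is LR(0), keeps the stack as the flat list s0 X1 s1 … Xk sk, and accepts when it reduces by the start production with no input left. The function name `parse` is hypothetical.

```python
GRAMMAR = [("S'", ("S", "c")), ("S", ("S", "A")), ("S", ("A",)),
           ("A", ("a", "S", "b")), ("A", ("a", "b"))]


def closure(items):
    result, todo = set(items), list(items)
    while todo:
        head, body, dot = todo.pop()
        if dot < len(body):
            for h, b in GRAMMAR:
                if h == body[dot] and (h, b, 0) not in result:
                    result.add((h, b, 0))
                    todo.append((h, b, 0))
    return frozenset(result)


def goto(state, symbol):
    return closure({(h, b, d + 1) for h, b, d in state
                    if d < len(b) and b[d] == symbol})


def parse(word, start="S'"):
    """Return the reductions (a rightmost derivation in reverse), or None."""
    stack = [closure({(h, b, 0) for h, b in GRAMMAR if h == start})]
    rest = list(word)
    reductions = []
    while True:
        top = stack[-1]
        complete = [(h, b) for h, b, d in top if d == len(b)]
        if complete:
            head, body = complete[0]  # unique because the grammar is LR(0)
            del stack[len(stack) - 2 * len(body):]  # pop the handle and its states
            reductions.append((head, "".join(body)))
            if head == start:
                return reductions if not rest else None
            stack += [head, goto(stack[-1], head)]
        elif rest:
            symbol = rest.pop(0)  # shift the next input symbol
            nxt = goto(top, symbol)
            if not nxt:
                return None  # no viable prefix continues this way
            stack += [symbol, nxt]
        else:
            return None  # input exhausted without reaching the start symbol


print(parse("abc"))  # [('A', 'ab'), ('S', 'A'), ("S'", 'Sc')]
print(parse("ab"))   # None
```

Reading the returned list backwards gives the rightmost derivation S' ⇒ Sc ⇒ Ac ⇒ abc.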
Theorem 10.10

If L is L(G) for an LR(0) grammar G, then L is N(M) for a DPDA M

N(M) = the language accepted by empty stack or null stack
Proof

Construct from G the DFA D

Transition function: recognizes G's viable prefixes
Stack symbols of M are

Grammar symbols of G
States of D
M has start state q and other states used to perform reductions
We know that:

If G is LR(0) then

Reductions are the only way to get the previous right-sentential form when the state of the DFA (on the top of the stack) contains a complete item
When M starts on input w it will construct a rightmost derivation for w in reverse order
What we need to prove:

When a shift is called for and the top DFA state on the stack has only incomplete items, then there are no handles (Note: if there were a handle, then some DFA state on the stack would have a complete item)
Suppose ∃ a state on the stack containing A → α· (complete item)

Each state, when it is first put onto the stack, is on top of the stack
α would then immediately be reduced to A
Therefore, a complete item cannot possibly become buried on the stack
Proof continued

The acceptance of the input occurs when the top of the stack contains the start symbol
The start symbol, by definition of LR(0) grammars, cannot appear on the right side of a production
∴ L(G) always has the prefix property if G is LR(0)
Conclusion of Proof

Thus, if w is in L(G), M finds the rightmost derivation of w, reduces w to S, and accepts
If M accepts w, then the sequence of right-sentential forms provides a derivation of w from S
∴ N(M) = L(G)
Corollary of Theorem 10.10

Every LR(0) grammar is unambiguous
Why?

The rightmost derivation of w is unique

(Given the construction we provided)
LR(1) Grammars

LR grammar with 1 symbol of look-ahead
All and only deterministic CFLs have LR(1) grammars
Are greatly important to compiler design

Why?

Because they are broad enough to include the syntax of almost all programming languages
Restrictive enough to have efficient parsers (that are essentially DPDAs)
LR(1) Item

Consists of an LR(0) item followed by a look-ahead set consisting of terminals and/or the special symbol $

$ = the right end of the string
General Form:

A → α·β, {a1, a2, …, an}
The set of LR(1) items forms the states of an NFA recognizing viable prefixes, which is then converted to a DFA
A grammar is LR(1) if

The start symbol does not appear on the right side of any production
Whenever the set of items, I, valid for some viable prefix includes a complete item A → α·, {a1, …, an}, then

No ai appears immediately to the right of the dot in any item of I
If B → β·, {b1, …, bk} is another complete item in I, then ai ≠ bj for any 1 ≤ i ≤ n and 1 ≤ j ≤ k
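
The two conditions on a set of items can be checked mechanically. The sketch below inspects a single set I and reports violations; it assumes LR(1) items are written as (head, body, dot, lookaheads) tuples, and the item sets in the example are made up for illustration rather than taken from a real grammar.

```python
def lr1_conflicts(item_set):
    """List the violations of the LR(1) conditions inside one set of items."""
    items = list(item_set)
    complete = [it for it in items if it[2] == len(it[1])]
    # Symbols that appear immediately to the right of a dot in I.
    next_symbols = {body[dot] for _, body, dot, _ in items if dot < len(body)}
    conflicts = []
    for i, (head, body, _, look) in enumerate(complete):
        clash = set(look) & next_symbols
        if clash:  # some ai is right after a dot
            conflicts.append(("shift/reduce", head, body, clash))
        for _, _, _, other_look in complete[i + 1:]:
            common = set(look) & set(other_look)
            if common:  # ai = bj for two complete items
                conflicts.append(("reduce/reduce", head, body, common))
    return conflicts


# Hypothetical item sets, for illustration only.
ok = [("A", ("a",), 1, {"$"}), ("A", ("a", "b"), 1, {"$"})]
bad = [("A", ("a",), 1, {"b"}), ("A", ("a", "b"), 1, {"$"})]
print(lr1_conflicts(ok))   # []
print(lr1_conflicts(bad))  # [('shift/reduce', 'A', ('a',), {'b'})]
```

The first set would be rejected by the LR(0) definition (a complete item next to an item with a terminal after the dot), but the look-ahead $ separates the two choices here.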
Accepting LR(1) language:

Similar to the DPDA used with LR(0) grammars
However, it is allowed to use the next input symbol during its decision making
This is accomplished by appending a $ to the end of the input, and the DPDA keeps the next input symbol as part of its state
LR(1) Rules for Reduce/Shift

If the top set of items has a complete item A → α·, {a1, a2, …, an}, where A ≠ S, reduce by A → α if the current input symbol is in {a1, a2, …, an}
If the top set of items has an item S → α·, {$}, then reduce by S → α and accept if the current symbol is $ (i.e., the end of the input is reached)
If the top set of items has an item A → α·aβ, T, and a is the current input symbol, then shift
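
The three rules amount to a lookup on the top set of items and the current input symbol. A minimal sketch, assuming items are (head, body, dot, lookaheads) tuples, the start symbol is named S, and the item sets come from an LR(1) grammar so that at most one rule can fire; the function name `action` and the sample sets are hypothetical.

```python
def action(item_set, symbol, start="S"):
    """Pick the move for the top set of items and the current input symbol."""
    for head, body, dot, look in item_set:
        if dot == len(body) and symbol in look:
            # Rules 1 and 2: reduce; a reduction to the start symbol on $ accepts.
            return ("accept" if head == start else "reduce", head, body)
        if dot < len(body) and body[dot] == symbol:
            # Rule 3: the current symbol is right after the dot.
            return ("shift", symbol)
    return ("error",)


# Hypothetical item sets, for illustration only.
print(action([("A", ("a", "b"), 2, {"a", "$"})], "a"))  # ('reduce', 'A', ('a', 'b'))
print(action([("S", ("A",), 1, {"$"})], "$"))           # ('accept', 'S', ('A',))
print(action([("A", ("a", "S", "b"), 2, {"$"})], "b"))  # ('shift', 'b')
```

Evaluating this function for every set of items and every terminal (plus $) ahead of time gives a table with one row per set of items and one column per symbol.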
Regarding the Rules

The definition of LR(1) guarantees that at most one of the rules will apply for any input symbol or $
Often for practicality the information is summarized into a table

Rows: sets of items
Columns: terminals and $
