3.3 LR(1) Parsers
Compilers (Procesadores de Lenguaje)
Escuela Politécnica Superior, UAM
Course 2007-2008
Topic 3.3: LR(1) Parsing

Contents
3.3.1 Problems with SLR(1) parsers
3.3.2 Restricting lookahead sets in LR(1)
3.3.3 Building the LR(1) DFA
3.3.4 Building the LR(1) Parse Table
3.3.5 Lambda in LR(1)
3.3.6 Summary
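Problem case for SLR(1) parser:
even 1 symbol of lookahead is not enough

Grammar:
0  S' :- S $
1  S  :- A
2  S  :- x b
3  A  :- a A b
4  A  :- B
5  B  :- x

SLR(1) DFA (figure, given here as a list of states):
S0:    S' :- . S $ ; S :- . A ; S :- . x b ; A :- . a A b ; A :- . B ; B :- . x
S1:    S' :- S . $                                      (S0 on S)
Sacc:  accepting state                                  (S1 on $)
S2:    S :- A .                                         (S0 on A)
S3:    S :- x . b ; B :- x .                            (S0 on x)
S4:    A :- a . A b ; A :- . a A b ; A :- . B ; B :- . x   (S0 on a, S4 on a)
S5:    A :- B .                                         (S0 on B, S4 on B)
S6:    S :- x b .                                       (S3 on b)
S7:    A :- a A . b                                     (S4 on A)
S8:    B :- x .                                         (S4 on x)
S9:    A :- a A b .                                     (S7 on b)

State S3 allows both reduce and shift.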
Detecting SLR(1) conflicts
The grammar is not SLR(1) if both of the following are true:
- the DFA for the grammar has a double-circled node (a state in which a reduction is possible) with one or more outward arcs;
- the LHS of the reducing production can be followed by any of the labels of the outward arcs.
[Figure: the SLR(1) DFA fragment S0 to S6. S3 (S :- x . b ; B :- x .) is a reduce state with an outward arc on b to S6.]
Problem case for SLR(1) parser: the parse table

        a     b       x     $     |  S    A    B
s0      s4            s3          |  1    2    5
s1                          acc   |
s2                          r1    |
s3            s6/r5         r5    |
s4      s4            s8          |       7    5
s5            r4            r4    |
s6                          r2    |
s7            s9                  |
s8            r5            r5    |
s9            r3            r3    |

Reductions rN refer to the numbered rules: 1 S :- A, 2 S :- x b, 3 A :- a A b, 4 A :- B, 5 B :- x.
The entry s6/r5 in row s3, column b, is the shift-reduce conflict.
This should not be a real conflict
We have a conflict because it seems that, if the next token is b, either a shift or a reduce is possible.
In fact, with a following b, the reduce operation would not lead to a valid parse.
- It is true that b is in the FOLLOW set of B.
- However, IN THE PARTICULAR CONTEXT OF STATE S3, b cannot follow B.
- S3 was reached by recognising a single terminal x (which would reduce to B).
- b can only follow B in the context of a A b: an a has to be encountered first.
[Figure: the SLR(1) DFA fragment S0, S2, S3, S4, S5, S6, with S3 (S :- x . b ; B :- x .) highlighted.]
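The claim that b is in the general FOLLOW set of B can be checked mechanically. The following is a minimal sketch, assuming the grammar is encoded as (LHS, RHS) pairs and has no lambda productions; the names `first_sets` and `follow_sets` are illustrative and do not come from the slides.

```python
# Grammar from this section; "S'" is the augmented start symbol.
GRAMMAR = [
    ("S'", ("S", "$")),
    ("S", ("A",)),
    ("S", ("x", "b")),
    ("A", ("a", "A", "b")),
    ("A", ("B",)),
    ("B", ("x",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

def first_sets(grammar):
    """FIRST sets, assuming the grammar has no lambda productions."""
    first = {nt: set() for nt in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            head = rhs[0]
            new = first[head] if head in NONTERMINALS else {head}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first

def follow_sets(grammar):
    """FOLLOW sets by fixed-point iteration over all rules."""
    first = first_sets(grammar)
    follow = {nt: set() for nt in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            for i, sym in enumerate(rhs):
                if sym not in NONTERMINALS:
                    continue
                if i + 1 < len(rhs):          # something follows sym in this rule
                    nxt = rhs[i + 1]
                    new = first[nxt] if nxt in NONTERMINALS else {nxt}
                else:                          # sym ends the rule: inherit FOLLOW(lhs)
                    new = follow[lhs]
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow

FOLLOW = follow_sets(GRAMMAR)
```

Under these assumptions `FOLLOW["B"]` contains both b and $, which is exactly why the SLR(1) table places r5 in column b of row s3, even though no valid parse passes through that entry.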
LR(1) parsers avoid many shift-reduce conflicts
Rather than using the general FOLLOW set of each nonterminal, LR(1) parsers calculate which terminals can in fact follow the nonterminal GIVEN the current state.
In producing the DFA for a grammar, each item is given together with the tokens which could follow it in that context:
S0:
S' :- . S        {$}
S  :- . A        {$}
S  :- . x b      {$}
A  :- . a A b    {$}
A  :- . B        {$}
B  :- . x        {$}
Note the change from SLR(1): $ is no longer on the RHS of the augmented rule. It now appears in the lookahead set of the item S' :- . S {$}.
The set of terminals which can follow a particular item in a particular state is called the lookahead set for this item.
In this example, each item is given with the single token which could follow it in that context.
Partial LR(1) DFA (figure, given here as a list of states):
S0:    S' :- . S {$} ; S :- . A {$} ; S :- . x b {$} ; A :- . a A b {$} ; A :- . B {$} ; B :- . x {$}
Sacc:  S' :- S . {$}                                    (S0 on S)
S1:    S :- A . {$}                                     (S0 on A)
S2:    S :- x . b {$} ; B :- x . {$}                    (S0 on x)
S3:    A :- a . A b {$} ; A :- . a A b {b} ; A :- . B {b} ; B :- . x {b}   (S0 on a)
S4:    A :- B . {$}                                     (S0 on B)
S5:    S :- x b . {$}                                   (S2 on b)
In S3, the expansions of A are followed by b.
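The way lookaheads are attached during closure can be sketched in a few lines. This is an illustrative sketch only, assuming items are encoded as (LHS, RHS, dot position, lookahead) tuples and that the grammar has no lambda productions; `first` and `closure` are hypothetical helper names.

```python
# LR(1) convention: the augmented rule is S' :- S, with $ kept as a lookahead.
GRAMMAR = [
    ("S'", ("S",)),
    ("S", ("A",)),
    ("S", ("x", "b")),
    ("A", ("a", "A", "b")),
    ("A", ("B",)),
    ("B", ("x",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

def first(symbol, seen=frozenset()):
    """Terminals that can start `symbol` (assumes no lambda productions)."""
    if symbol not in NONTERMINALS:
        return {symbol}
    result = set()
    for lhs, rhs in GRAMMAR:
        if lhs == symbol and rhs[0] not in seen | {symbol}:
            result |= first(rhs[0], seen | {symbol})
    return result

def closure(items):
    """Close a set of (lhs, rhs, dot, lookahead) items."""
    result, work = set(items), list(items)
    while work:
        lhs, rhs, dot, la = work.pop()
        if dot == len(rhs) or rhs[dot] not in NONTERMINALS:
            continue
        # New items inherit what can follow rhs[dot] in THIS item:
        # the next symbol's starters, or the item's own lookahead at rule end.
        rest = rhs[dot + 1:]
        followers = first(rest[0]) if rest else {la}
        for l2, r2 in GRAMMAR:
            if l2 != rhs[dot]:
                continue
            for t in followers:
                item = (l2, r2, 0, t)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return result

s0 = closure({("S'", ("S",), 0, "$")})
s3 = closure({("A", ("a", "A", "b"), 1, "$")})
```

With this encoding, every item of `s0` carries $, while the items that `s3` adds for A and B carry b, matching the states S0 and S3 listed above.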
LR(1) parsers avoid many shift-reduce conflicts
Now, when forming the parse table, we know exactly which terminals can follow a nonterminal in a given state.
The true follow set of B in state S2 (S :- x . b {$} ; B :- x . {$}) is {$}, not {$, b}.
The conflict in this case is thus avoided: on b the parser shifts, and it reduces by B :- x only on $.
[Figure: the partial LR(1) DFA with states Sacc and S0 to S5, as before.]
LR(1) parsers require more states
A side effect of associating followers with items is that we cannot merge two states unless:
1. They share the same closure (as before)
2. Each item has the same follower (new)
Thus, some of the state mergers from the previous example cannot happen.
[Figure: the partial LR(1) DFA plus the state reached from S3 on a, written S3' here: A :- a . A b {b} ; A :- . a A b {b} ; A :- . B {b} ; B :- . x {b}. It cannot merge with S3, whose first item is A :- a . A b {$}.]
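The merge condition can be made concrete by comparing two states with and without their lookaheads. This is a small sketch under the assumption that a state is a frozenset of (LHS, RHS, dot, lookahead) tuples; `core` and `a_state` are illustrative names.

```python
def core(state):
    """Drop the lookaheads, leaving only the dotted rules of the state."""
    return frozenset((lhs, rhs, dot) for lhs, rhs, dot, _ in state)

# Rules expanded when the dot stands before A (and then before B).
EXPANDED = [("A", ("a", "A", "b")), ("A", ("B",)), ("B", ("x",))]

def a_state(kernel_lookahead):
    """State reached on `a`: the kernel item plus its closure.

    The closure items always carry b, because A is followed by b
    in the kernel item A :- a . A b.
    """
    kernel = ("A", ("a", "A", "b"), 1, kernel_lookahead)
    return frozenset({kernel} | {(lhs, rhs, 0, "b") for lhs, rhs in EXPANDED})

s3 = a_state("$")        # reached from S0 on a
s3_prime = a_state("b")  # reached from S3 on a second a

same_core = core(s3) == core(s3_prime)   # SLR(1) would treat them as one state
mergeable = s3 == s3_prime               # LR(1) requires identical lookaheads too
```

Here `same_core` is true but `mergeable` is false, which is the situation marked "cannot merge" in the figure.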
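The complete LR(1) DFA (figure, given here as a list of states; a primed name marks a state that differs from its unprimed twin only in its lookahead sets):
S0:    S' :- . S {$} ; S :- . A {$} ; S :- . x b {$} ; A :- . a A b {$} ; A :- . B {$} ; B :- . x {$}
Sacc:  S' :- S . {$}                                    (S0 on S)
S1:    S :- A . {$}                                     (S0 on A)
S2:    S :- x . b {$} ; B :- x . {$}                    (S0 on x)
S3:    A :- a . A b {$} ; A :- . a A b {b} ; A :- . B {b} ; B :- . x {b}   (S0 on a)
S3':   A :- a . A b {b} ; A :- . a A b {b} ; A :- . B {b} ; B :- . x {b}   (S3 on a, S3' on a)
S4:    A :- B . {$}                                     (S0 on B)
S4':   A :- B . {b}                                     (S3 on B, S3' on B)
S5:    S :- x b . {$}                                   (S2 on b)
S6:    A :- a A . b {$}                                 (S3 on A)
S6':   A :- a A . b {b}                                 (S3' on A)
S7:    B :- x . {b}                                     (S3 on x, S3' on x)
S8:    A :- a A b . {$}                                 (S6 on b)
S8':   A :- a A b . {b}                                 (S6' on b)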
Compare to the SLR(1) graph
- 14 states vs. 11
[Figure: the SLR(1) DFA for the same grammar, with its 11 states numbered S0 to S10.]
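The two state counts can be reproduced by building both item-set DFAs with the same closure/goto routine. The sketch below is an assumption-laden illustration: `count_states` is a hypothetical helper, items are (LHS, RHS, dot, lookahead) tuples, the SLR(1) items simply use `None` as lookahead, and lambda productions are not handled.

```python
RULES = [
    ("S", ("A",)),
    ("S", ("x", "b")),
    ("A", ("a", "A", "b")),
    ("A", ("B",)),
    ("B", ("x",)),
]

def count_states(grammar, start_item, with_lookahead):
    """Build the item-set DFA by closure/goto and return its number of states."""
    nonterminals = {lhs for lhs, _ in grammar}

    def first(symbol, seen=frozenset()):
        # No lambda rules assumed, so only leading symbols matter.
        if symbol not in nonterminals:
            return {symbol}
        out = set()
        for lhs, rhs in grammar:
            if lhs == symbol and rhs[0] not in seen | {symbol}:
                out |= first(rhs[0], seen | {symbol})
        return out

    def closure(items):
        result, work = set(items), list(items)
        while work:
            lhs, rhs, dot, la = work.pop()
            if dot == len(rhs) or rhs[dot] not in nonterminals:
                continue
            if not with_lookahead:
                followers = {None}
            elif dot + 1 < len(rhs):
                followers = first(rhs[dot + 1])
            else:
                followers = {la}
            for l2, r2 in grammar:
                for t in followers:
                    item = (l2, r2, 0, t)
                    if l2 == rhs[dot] and item not in result:
                        result.add(item)
                        work.append(item)
        return frozenset(result)

    def goto(state, symbol):
        moved = {(l, r, d + 1, la) for l, r, d, la in state
                 if d < len(r) and r[d] == symbol}
        return closure(moved)

    start = closure({start_item})
    states, work = {start}, [start]
    while work:
        state = work.pop()
        for symbol in {r[d] for _, r, d, _ in state if d < len(r)}:
            target = goto(state, symbol)
            if target not in states:
                states.add(target)
                work.append(target)
    return len(states)

# SLR(1) convention of this section: $ is part of the augmented rule.
slr_grammar = [("S'", ("S", "$"))] + RULES
slr_states = count_states(slr_grammar, ("S'", ("S", "$"), 0, None), False)

# LR(1) convention: $ moves into the lookahead set.
lr1_grammar = [("S'", ("S",))] + RULES
lr1_states = count_states(lr1_grammar, ("S'", ("S",), 0, "$"), True)
```

Under these conventions the sketch gives 11 states for the SLR(1) graph and 14 for the LR(1) graph, in line with the comparison above.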
LR(1) parse table

        a     b     x     $     |  S    A    B
s0      s3          s2          |  acc  1    4
s1                        r1    |
s2            s5          r5    |
s3      s3'         s7          |       6    4'
s3'     s3'         s7          |       6'   4'
s4                        r4    |
s4'           r4                |
s5                        r2    |
s6            s8                |
s6'           s8'               |
s7            r5                |
s8                        r3    |
s8'           r3                |

4 new states, but the shift-reduce conflict is avoided.
The lookahead set of a completed item is a subset of the full FOLLOW set of the item's LHS. The LR(1) parser is thus more selective as to which next input symbols can cause a reduce action.
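The LR(1) table is used by the same shift-reduce driver as an SLR(1) table. The sketch below transcribes the table of this section into Python dictionaries; the string state names (including the primed ones), the `parse` function and the treatment of Sacc as a goto target named "acc" are illustrative assumptions, not part of the slides.

```python
# Rule number -> (LHS, length of RHS), for rules 1 to 5 of the grammar.
RULES = {1: ("S", 1), 2: ("S", 2), 3: ("A", 3), 4: ("A", 1), 5: ("B", 1)}

# ACTION[state][terminal]: ("s", state) is a shift, ("r", rule) a reduce.
ACTION = {
    "0":  {"a": ("s", "3"), "x": ("s", "2")},
    "1":  {"$": ("r", 1)},
    "2":  {"b": ("s", "5"), "$": ("r", 5)},
    "3":  {"a": ("s", "3'"), "x": ("s", "7")},
    "3'": {"a": ("s", "3'"), "x": ("s", "7")},
    "4":  {"$": ("r", 4)},
    "4'": {"b": ("r", 4)},
    "5":  {"$": ("r", 2)},
    "6":  {"b": ("s", "8")},
    "6'": {"b": ("s", "8'")},
    "7":  {"b": ("r", 5)},
    "8":  {"$": ("r", 3)},
    "8'": {"b": ("r", 3)},
}
GOTO = {
    "0":  {"S": "acc", "A": "1", "B": "4"},
    "3":  {"A": "6", "B": "4'"},
    "3'": {"A": "6'", "B": "4'"},
}

def parse(tokens):
    """Return the rule numbers reduced, in order, or None on a syntax error."""
    tokens = list(tokens) + ["$"]
    stack, pos, reductions = ["0"], 0, []
    while True:
        state, look = stack[-1], tokens[pos]
        if state == "acc":
            return reductions if look == "$" else None
        kind, arg = ACTION[state].get(look, ("e", None))  # empty cell = error
        if kind == "s":
            stack.append(arg)
            pos += 1
        elif kind == "r":
            lhs, length = RULES[arg]
            del stack[-length:]                  # pop one state per RHS symbol
            stack.append(GOTO[stack[-1]][lhs])   # then follow the goto entry
            reductions.append(arg)
        else:
            return None
```

For example, `parse("xb")` shifts x and b and then reduces by rule 2, while `parse("x")` reduces by rules 5, 4 and 1; neither run meets a conflict in row s2.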
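LR(1) parse table with explicit error entries (e)

        a     b     x     $     |  S    A    B
s0      s3          s2          |  acc  1    4
s1                        r1    |
s2            s5          r5    |
s3      s3'         s7          |       6    4'
s3'     s3'         s7          |       6'   4'
s4            e           r4    |
s4'           r4          e     |
s5                        r2    |
s6            s8                |
s6'           s8'               |
s7            r5          e     |
s8            e           r3    |
s8'           r3          e     |

The cells marked e are those where the full FOLLOW set of the LHS would have allowed a reduce; the LR(1) lookahead set rules it out.
- Allows more discriminate reduce contexts.
- For correct input, the same number of shifts and reduces will be performed.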
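The difference between the two ways of placing reduce actions can be shown on a single row. This is a sketch only: the FOLLOW sets are written out by hand for this grammar, the state is LR(1) state S2, and `actions` is a hypothetical helper that takes the choice of reduce columns as a parameter.

```python
# FOLLOW sets of the example grammar, written out by hand.
FOLLOW = {"S": {"$"}, "A": {"b", "$"}, "B": {"b", "$"}}

# LR(1) state S2 as (lhs, rhs, dot, lookahead) items.
STATE = [("S", ("x", "b"), 1, "$"), ("B", ("x",), 1, "$")]

def actions(state, reduce_on):
    """Map each terminal to the set of actions the state allows on it.

    `reduce_on(lhs, lookahead)` returns the columns that receive the reduce.
    Only terminals appear after the dot in this state, so no goto entries arise.
    """
    row = {}
    for lhs, rhs, dot, la in state:
        if dot < len(rhs):
            row.setdefault(rhs[dot], set()).add("shift")
        else:
            for terminal in reduce_on(lhs, la):
                row.setdefault(terminal, set()).add(f"reduce {lhs} :- {' '.join(rhs)}")
    return row

slr_row = actions(STATE, lambda lhs, la: FOLLOW[lhs])   # SLR(1): use FOLLOW(LHS)
lr1_row = actions(STATE, lambda lhs, la: {la})          # LR(1): use the item's lookahead
```

In `slr_row` the column b holds both a shift and a reduce, whereas `lr1_row` keeps only the shift on b and the reduce on $.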
More than 1 token in LR(1) lookahead sets
In previous examples, the lookahead set of each item was a single token.
However, lookahead sets can have more than one token.
For instance, in the given grammar:
S' :- S $
S  :- A B
A  :- a x
B  :- c
B  :- d
The closure for S :- . A B {$} is:
S :- . A B    {$}
A :- . a x    {c, d}
This occurs when the dot is before a sequence of 2 nonterminals, and the second has more than one starter symbol.
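The lookahead {c, d} is simply the set of starter symbols of whatever follows A in the item. A minimal sketch, assuming a grammar without lambda productions or left recursion; `first` and `closure_lookahead` are illustrative names.

```python
RULES = {"S": [("A", "B")], "A": [("a", "x")], "B": [("c",), ("d",)]}

def first(symbol):
    """Starter terminals of `symbol` (no lambda rules, no left recursion assumed)."""
    if symbol not in RULES:
        return {symbol}
    return set().union(*(first(rhs[0]) for rhs in RULES[symbol]))

def closure_lookahead(rhs, dot, lookahead):
    """Lookahead set given to the items expanded from the nonterminal after the dot."""
    rest = rhs[dot + 1:]
    return first(rest[0]) if rest else {lookahead}

# Item S :- . A B {$}: the items for A are followed by the starters of B.
la_for_a = closure_lookahead(("A", "B"), 0, "$")
# Item S :- A . B {$}: B ends the rule, so its items inherit $.
la_for_b = closure_lookahead(("A", "B"), 1, "$")
```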
Another example of building lookahead sets
Grammar:
E' :- E
E  :- T
E  :- E + T
T  :- id
T  :- ( E )
The closure of E' :- . E
1. The augmented rule can only validly be followed by $. The lookahead set is thus { $ }:
   E' :- . E          {$}
2. We then expand the items for E. This E, being at the end of the rule, can also be followed only by $:
   E' :- . E          {$}
   E  :- . T          {$}
   E  :- . E + T      {$}
3. Expanding T:
   E' :- . E          {$}
   E  :- . T          {$}
   E  :- . E + T      {$}
   T  :- . id         {$}
   T  :- . ( E )      {$}
4. The E in the 3rd item (E :- . E + T) needs to be expanded. In SLR(1) this was ignored because E was expanded in an earlier step. But note that the lookahead set here is different, because this E is followed by +:
   E' :- . E          {$}
   E  :- . T          {$}
   E  :- . E + T      {$}
   T  :- . id         {$}
   T  :- . ( E )      {$}
   E  :- . T          {+}
   E  :- . E + T      {+}
   T  :- . id         {+}
   T  :- . ( E )      {+}
5. We can simplify by merging items which are identical except for the lookahead set:
   E' :- . E          {$}
   E  :- . T          {$, +}
   E  :- . E + T      {$, +}
   T  :- . id         {$, +}
   T  :- . ( E )      {$, +}
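The merging in step 5 amounts to grouping the items by rule and dot position and taking the union of their lookaheads. A small sketch, assuming the nine items of step 4 are written as (LHS, RHS, lookahead) tuples with every dot at position 0:

```python
from collections import defaultdict

# The nine items of step 4; every dot is at the start of the RHS.
items = [
    ("E'", ("E",), "$"),
    ("E", ("T",), "$"),
    ("E", ("E", "+", "T"), "$"),
    ("T", ("id",), "$"),
    ("T", ("(", "E", ")"), "$"),
    ("E", ("T",), "+"),
    ("E", ("E", "+", "T"), "+"),
    ("T", ("id",), "+"),
    ("T", ("(", "E", ")"), "+"),
]

# Group by (LHS, RHS) and collect the lookaheads of each group into a set.
merged = defaultdict(set)
for lhs, rhs, lookahead in items:
    merged[(lhs, rhs)].add(lookahead)
```

This leaves five merged items: the augmented item keeps {$}, and the other four carry {$, +}.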
3.3.5 Lambda in LR(1)
Building the DFA for LR(1) with Lambda
When using a grammar with lambda productions, the lookahead set needs to take into account the presence of nonterminals realised by lambda.
Grammar:
1  IfStmt'  :- IfStmt $
2  IfStmt   :- if logexp then Block ElseForm
3  Block    :- stmt
4  ElseForm :- else Block
5  ElseForm :-

LR(1) DFA (figure, given here as a list of the states shown):
S0:    IfStmt' :- . IfStmt {$}
       IfStmt  :- . if logexp then Block ElseForm {$}
Sacc:  IfStmt' :- IfStmt . {$}                                (S0 on IfStmt)
S1:    IfStmt :- if . logexp then Block ElseForm {$}          (S0 on if)
S3:    IfStmt :- if logexp . then Block ElseForm {$}          (S1 on logexp)
S4:    IfStmt :- if logexp then . Block ElseForm {$}          (S3 on then)
       Block :- . stmt {else, $}
S5:    Block :- stmt . {else, $}                              (S4 on stmt)
S6:    IfStmt :- if logexp then Block . ElseForm {$}          (S4 on Block)
       ElseForm :- . else Block {$}
       ElseForm :- . {$}
S6 has outgoing arcs on else and on ElseForm.
In S4, Block can be followed by else (the start of ElseForm) or, because ElseForm can be lambda, by $.
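The lookahead {else, $} for Block comes from letting the lookahead computation "see through" a nonterminal that can be lambda. The sketch below is one way to express this, assuming the grammar has no left recursion; `first_seq` and `closure_lookahead` are illustrative names, and the empty tuple stands for the lambda production.

```python
RULES = {
    "IfStmt": [("if", "logexp", "then", "Block", "ElseForm")],
    "Block": [("stmt",)],
    "ElseForm": [("else", "Block"), ()],   # () is the lambda production
}

def first_seq(symbols):
    """Return (starter terminals of the sequence, whether it can derive lambda)."""
    starters = set()
    for sym in symbols:
        if sym not in RULES:               # terminal: the sequence cannot vanish
            starters.add(sym)
            return starters, False
        sym_nullable = False
        for rhs in RULES[sym]:
            rhs_starters, rhs_nullable = first_seq(rhs)
            starters |= rhs_starters
            sym_nullable = sym_nullable or rhs_nullable
        if not sym_nullable:
            return starters, False
        # sym can be lambda: keep looking at the next symbol of the sequence.
    return starters, True

def closure_lookahead(rest, lookahead):
    """Lookahead set for items expanded before `rest`, in an item with `lookahead`."""
    starters, can_vanish = first_seq(rest)
    return starters | ({lookahead} if can_vanish else set())

# S4: IfStmt :- if logexp then . Block ElseForm {$}; ElseForm follows Block.
block_lookahead = closure_lookahead(("ElseForm",), "$")
# S6: IfStmt :- if logexp then Block . ElseForm {$}; nothing follows ElseForm.
elseform_lookahead = closure_lookahead((), "$")
```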
3.3.6 LR(1) Summary
The LR(1) technique is the same as for SLR(1) except:
- Closures: each item now records the lookahead set for this item in this state.
- DFA:
  a) the items display the lookahead set for each item;
  b) states can only merge if the lookahead sets on each item are identical.
- Parse table: when placing reduce actions in a row, the lookahead set for the item is used, rather than the FOLLOW set for the rule's LHS.
- Use of table: exactly the same as for SLR(1).
Problems with LR(1) parsers
- Not a complete solution: not all shift-reduce or reduce-reduce conflicts are resolved by the LR(1) technique.
- A memory-hungry solution: LR(1) parse tables for real grammars might consume 100s of times more memory than SLR(1) tables.
  E.g., SLR(1): 20 K; LR(1): 2,000 K
- While for modern computers this is not a problem, in earlier days it was.
- More memory-efficient solutions were looked for: LALR parsers.
