
Compilers (Procesadores de Lenguaje)

Escuela Politécnica Superior, UAM

Course 2007-2008

Topic 3.3: LR(1) Parsing


Contents
3.3.1 Problems with SLR(1) parsers
3.3.2 Restricting lookahead sets in LR(1)
3.3.3 Building the LR(1) DFA
3.3.4 Building the LR(1) Parse Table
3.3.5 Lambda in LR(1)
3.3.6 Summary

3.3 LR(1) Parsers


3.3.1 Problems with SLR(1) parsers

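Problem case for SLR(1) parser:

- even 1 symbol of lookahead is not enough

Grammar:

0  S' → S $
1  S  → A
2  S  → x b
3  A  → a A b
4  A  → B
5  B  → x

SLR(1) DFA (each state lists its items, then its outgoing arcs):

S0:   S' → . S $
      S  → . A
      S  → . x b
      A  → . a A b
      A  → . B
      B  → . x
      arcs: S to S1, A to S2, x to S3, a to S4, B to S5

S1:   S' → S . $
      arcs: $ to Sacc

S2:   S → A .

S3:   S → x . b
      B → x .
      arcs: b to S6
      (This state allows both reduce and shift)

S4:   A → a . A b
      A → . a A b
      A → . B
      B → . x
      arcs: a to S4, A to S7, B to S5, x to S8

S5:   A → B .

S6:   S → x b .

S7:   A → a A . b
      arcs: b to S9

S8:   B → x .

S9:   A → a A b .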

Detecting SLR(1) conflicts

The grammar is not SLR(1) if both of the following are true:

- the DFA for the grammar has a double-circled node (a reduce state, i.e. one containing a completed item) with one or more outward arcs;
- the LHS of the reducing production can be followed by any of the labels of the outward arcs, i.e. one of those labels is in its FOLLOW set.
[Figure: fragment of the SLR(1) DFA. S3 = { S → x . b, B → x . } is a reduce state (for B → x .) with an outward arc labelled b to S6 = { S → x b . }.]
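
The two conditions above can be checked mechanically. The following is a minimal Python sketch, not part of the original slides: the helper names (first_sets, follow_sets, shift_reduce_conflicts) are illustrative, and it assumes a grammar without lambda productions. It computes the FOLLOW sets of the example grammar and intersects FOLLOW(B) with the labels of the arcs leaving state S3.

```python
from collections import defaultdict

# Example grammar from the slides; "S'" is the augmented start symbol.
GRAMMAR = [
    ("S'", ["S", "$"]),
    ("S", ["A"]),
    ("S", ["x", "b"]),
    ("A", ["a", "A", "b"]),
    ("A", ["B"]),
    ("B", ["x"]),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}


def first_sets(grammar):
    """FIRST sets, assuming no lambda productions (hypothetical helper)."""
    first = defaultdict(set)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            head = rhs[0]
            new = set(first[head]) if head in NONTERMINALS else {head}
            if not new <= first[lhs]:
                first[lhs] |= new
                changed = True
    return first


def follow_sets(grammar):
    """FOLLOW sets by fixed-point iteration (no lambda productions)."""
    first = first_sets(grammar)
    follow = defaultdict(set)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in grammar:
            for i, sym in enumerate(rhs):
                if sym not in NONTERMINALS:
                    continue
                if i + 1 < len(rhs):
                    nxt = rhs[i + 1]
                    new = set(first[nxt]) if nxt in NONTERMINALS else {nxt}
                else:
                    # sym ends the rule: it inherits FOLLOW of the LHS.
                    new = set(follow[lhs])
                if not new <= follow[sym]:
                    follow[sym] |= new
                    changed = True
    return follow


def shift_reduce_conflicts(reduce_lhs, outward_arcs, follow):
    """Arc labels on which an SLR(1) table would both shift and reduce."""
    return follow[reduce_lhs] & set(outward_arcs)


FOLLOW = follow_sets(GRAMMAR)
# State S3 holds the completed item B -> x . and has one outward arc, on b.
CONFLICTS = shift_reduce_conflicts("B", ["b"], FOLLOW)
print(sorted(FOLLOW["B"]), sorted(CONFLICTS))  # ['$', 'b'] ['b']
```

Because b is both an arc label of S3 and a member of FOLLOW(B), the SLR(1) table gets two actions in that cell.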

Problem case for SLR(1) parser: the parse table

State |  a    b       x    $    |  S  A  B
s0    |  s4           s3        |  1  2  5
s1    |                    acc  |
s2    |                    r1   |
s3    |       s6/r5        r5   |
s4    |  s4           s8        |     7  5
s5    |       r4           r4   |
s6    |                    r2   |
s7    |       s9                |
s8    |       r5           r5   |
s9    |       r3           r3   |

The entry s6/r5 in row s3, column b, is the shift-reduce conflict.

This should not be a real conflict

We have a conflict because it seems that if the next token is b, either a shift or
a reduce is possible.

In fact, with a following b, the reduce operation would not lead to a valid parse.

It is true that b is in the FOLLOW set of B.

However, IN THE PARTICULAR CONTEXT OF STATE S3, b cannot follow B.

S3 was reached by recognising a single terminal x (which would reduce to B).

b can only follow B in the context of a A b: an a has to be encountered first.


[Figure: fragment of the SLR(1) DFA around S0. On x, S0 goes to S3 = { S → x . b, B → x . }, which goes on b to S6 = { S → x b . }. On a, S0 goes to S4 = { A → a . A b, A → . a A b, A → . B, B → . x }.]

3.3.2 Restricting lookahead sets in LR(1)
LR(1) parsers avoid many s-r conflicts

Rather than using the general FOLLOW set for each nonterminal, LR(1) parsers
calculate which terminals can in fact follow the nonterminal GIVEN the current
state.

In producing the DFA for a grammar, each item is given with the tokens which
could follow it in that context:

S0:   S' → . S        {$}
      S  → . A        {$}
      S  → . x b      {$}
      A  → . a A b    {$}
      A  → . B        {$}
      B  → . x        {$}

Note the change from SLR(1): $ is now not on the RHS of the augmented rule.

The set of terminals which can follow a particular item in a particular state
is called the lookahead set for this item.

In this example each item is given with the single token which could follow it
in that context. The first states of the LR(1) DFA are:

S0:   S' → . S        {$}
      S  → . A        {$}
      S  → . x b      {$}
      A  → . a A b    {$}
      A  → . B        {$}
      B  → . x        {$}
      arcs: S to Sacc, A to S1, x to S2, a to S3, B to S4

Sacc: S' → S .        {$}

S1:   S → A .         {$}

S2:   S → x . b       {$}
      B → x .         {$}
      arcs: b to S5

S3:   A → a . A b     {$}
      A → . a A b     {b}
      A → . B         {b}
      B → . x         {b}
      (Expansions of A are followed by b)

S4:   A → B .         {$}

S5:   S → x b .       {$}

Now, when forming the parse table, we know exactly which terminals can
follow a nonterminal in a given state.

The true follow set of B in state S2 = { S → x . b {$}, B → x . {$} } is {$},
not {$, b}.

The conflict in this case is thus avoided: on b the parser shifts, and only on $
does it reduce by B → x.

LR(1) parsers require more states

A side effect of associating followers with items is that we cannot merge two
states unless:
1. They share the same closure (as before)
2. Each item has the same follower (new)

Thus, some of the state mergers from the previous example cannot happen. The
state reached on a from S0 and the state reached on a from that state (written
S3' here) have the same items but not the same lookaheads:

S3:   A → a . A b     {$}
      A → . a A b     {b}
      A → . B         {b}
      B → . x         {b}

S3':  A → a . A b     {b}
      A → . a A b     {b}
      A → . B         {b}
      B → . x         {b}

S3 and S3' cannot merge.
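
The merge rule can be illustrated with a short Python sketch. This is an assumption-laden illustration rather than the course's own code: items are modelled as tuples (lhs, rhs, dot position, lookahead), and the names core, S3 and S3_PRIME are hypothetical. Two states are the same LR(1) state only if their item sets, lookaheads included, are equal; dropping the lookaheads gives the comparison used before.

```python
# An LR(1) item is modelled as (lhs, rhs, dot position, lookahead terminal).
def core(state):
    """Drop the lookaheads, keeping only the dotted rules."""
    return frozenset((lhs, rhs, dot) for lhs, rhs, dot, _ in state)


# State reached from S0 on a: the kernel item carries lookahead $.
S3 = frozenset({
    ("A", ("a", "A", "b"), 1, "$"),
    ("A", ("a", "A", "b"), 0, "b"),
    ("A", ("B",), 0, "b"),
    ("B", ("x",), 0, "b"),
})

# State reached from S3 on a: the kernel item carries lookahead b.
S3_PRIME = frozenset({
    ("A", ("a", "A", "b"), 1, "b"),
    ("A", ("a", "A", "b"), 0, "b"),
    ("A", ("B",), 0, "b"),
    ("B", ("x",), 0, "b"),
})

print(core(S3) == core(S3_PRIME))  # True: same closure, merged as before
print(S3 == S3_PRIME)              # False: lookaheads differ, no merge in LR(1)
```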

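The complete LR(1) DFA for the example grammar (states which differ only in
their lookahead sets are written with a prime, e.g. S3 and S3'):

S0:   S' → . S        {$}
      S  → . A        {$}
      S  → . x b      {$}
      A  → . a A b    {$}
      A  → . B        {$}
      B  → . x        {$}
      arcs: S to Sacc, A to S1, x to S2, a to S3, B to S4

Sacc: S' → S .        {$}

S1:   S → A .         {$}

S2:   S → x . b       {$}
      B → x .         {$}
      arcs: b to S5

S3:   A → a . A b     {$}
      A → . a A b     {b}
      A → . B         {b}
      B → . x         {b}
      arcs: a to S3', A to S6, B to S4', x to S7

S4:   A → B .         {$}

S5:   S → x b .       {$}

S6:   A → a A . b     {$}
      arcs: b to S8

S8:   A → a A b .     {$}

S3':  A → a . A b     {b}
      A → . a A b     {b}
      A → . B         {b}
      B → . x         {b}
      arcs: a to S3', A to S6', B to S4', x to S7

S4':  A → B .         {b}

S6':  A → a A . b     {b}
      arcs: b to S8'

S7:   B → x .         {b}

S8':  A → a A b .     {b}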

Compare to the SLR(1) Graph
- 14 states vs. 11

[Figure: the SLR(1) DFA for the same grammar, drawn with states numbered S0 to S10, where S6 = { S' → S $ . } is the accepting state.]

3.3.4 Building the LR(1) Parse Table

State |  a     b     x     $    |  S    A   B
s0    |  s3          s2         |  acc  1   4
s1    |                    r1   |
s2    |        s5          r5   |
s3    |  s3'         s7         |       6   4'
s3'   |  s3'         s7         |       6'  4'
s4    |        e           r4   |
s4'   |        r4          e    |
s5    |                    r2   |
s6    |        s8               |
s6'   |        s8'              |
s7    |        r5          e    |
s8    |        e           r3   |
s8'   |        r3          e    |

- 4 new states, but the shift-reduce conflict is avoided.
- The lookahead set is a subset of the full FOLLOW set of the item's LHS. The
  LR(1) parser is thus more selective as to which next input symbols can cause
  a reduce action. An e marks a cell where the FOLLOW set would have placed a
  reduce but the lookahead set does not: it is now an error entry.
- Allows more discriminating reduce contexts.
- For correct input, the same number of shifts and reduces will be performed.
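
How the table is used can be sketched in Python. This is an illustrative driver, not the course's implementation: the dictionaries ACTION and GOTO transcribe the LR(1) table for the example grammar, the primed state names (3', 4', 6', 8') are an assumed naming for the states that differ only in lookahead, the "acc" row for the accepting state is an assumption added so that the driver has somewhere to stop, and input is assumed to end with $.

```python
# Productions as numbered in the slides: number -> (LHS, length of RHS).
RULES = {1: ("S", 1), 2: ("S", 2), 3: ("A", 3), 4: ("A", 1), 5: ("B", 1)}

# ACTION[state][terminal]; a missing entry is an error (an "e" or empty cell).
ACTION = {
    "0": {"a": "s3", "x": "s2"},
    "1": {"$": "r1"},
    "2": {"b": "s5", "$": "r5"},
    "3": {"a": "s3'", "x": "s7"},
    "3'": {"a": "s3'", "x": "s7"},
    "4": {"$": "r4"},
    "4'": {"b": "r4"},
    "5": {"$": "r2"},
    "6": {"b": "s8"},
    "6'": {"b": "s8'"},
    "7": {"b": "r5"},
    "8": {"$": "r3"},
    "8'": {"b": "r3"},
    "acc": {"$": "accept"},  # assumed row for state Sacc
}
GOTO = {
    "0": {"S": "acc", "A": "1", "B": "4"},
    "3": {"A": "6", "B": "4'"},
    "3'": {"A": "6'", "B": "4'"},
}


def parse(tokens):
    """Return the rule numbers in reduction order, or raise SyntaxError."""
    stack = ["0"]          # stack of state names
    reductions = []
    pos = 0
    while True:
        state, tok = stack[-1], tokens[pos]
        act = ACTION[state].get(tok)
        if act is None:
            raise SyntaxError(f"unexpected {tok!r} in state {state}")
        if act == "accept":
            return reductions
        if act[0] == "s":          # shift: push the target state, consume token
            stack.append(act[1:])
            pos += 1
        else:                      # reduce: pop |RHS| states, then follow GOTO
            rule = int(act[1:])
            lhs, length = RULES[rule]
            del stack[-length:]
            stack.append(GOTO[stack[-1]][lhs])
            reductions.append(rule)


print(parse(list("aaxbb$")))  # [5, 4, 3, 3, 1]
print(parse(list("xb$")))     # [2]
```

On the incorrect input a x $ this driver stops in state 7 without reducing by rule 5, because $ is not in the lookahead set of B → x . there.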

More than 1 token in LR(1) lookahead sets
In previous examples, the lookahead set of each item was a single token.
However, lookahead sets can have more than one token.
For instance, in the given grammar:

S' :- S $
S  :- A B
A  :- a x
B  :- c
B  :- d

The closure for S :- . A B {$} is:

S :- . A B    {$}
A :- . a x    {c, d}

This occurs when the dot is before a sequence of 2 nonterminals, and the second
has more than one starter symbol.

Another example of building lookahead sets
The closure of E' :- . E, for the grammar:

E' :- E
E  :- T
E  :- E + T
T  :- id
T  :- ( E )

1. The augmented rule can only validly be followed by $. The
lookahead set is thus { $ }:

E' :- . E        {$}

2. We thus expand items for E, and E, being at the end of the rule,
can also be followed only by $:

E' :- . E        {$}
E  :- . T        {$}
E  :- . E + T    {$}

3. Expanding T:

E' :- . E        {$}
E  :- . T        {$}
E  :- . E + T    {$}
T  :- . id       {$}
T  :- . ( E )    {$}

4. The E in the 3rd item needs to be expanded. In SLR(1) this was ignored because E
was expanded in an earlier step. But note that the lookahead set is different here,
because this E is followed by +:

E' :- . E        {$}
E  :- . T        {$}
E  :- . E + T    {$}
T  :- . id       {$}
T  :- . ( E )    {$}
E  :- . T        {+}
E  :- . E + T    {+}
T  :- . id       {+}
T  :- . ( E )    {+}

5. We can simplify by merging items which are identical except for the
lookahead set:

E' :- . E        {$}
E  :- . T        {$, +}
E  :- . E + T    {$, +}
T  :- . id       {$, +}
T  :- . ( E )    {$, +}
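
Steps 1 to 5 can be reproduced with a small Python sketch. It is offered as an illustration under stated assumptions, not as the course's algorithm: the grammar has no lambda productions, items are tuples (lhs, rhs, dot, lookahead), and the function names first, closure and merged are hypothetical. New items take as lookahead the starter symbols of whatever follows the expanded nonterminal, or the parent item's lookahead when nothing follows.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("T",), ("E", "+", "T")],
    "T": [("id",), ("(", "E", ")")],
}


def first(symbol):
    """Starter terminals of a symbol (assumes no lambda productions)."""
    if symbol not in GRAMMAR:
        return {symbol}
    result, seen, todo = set(), {symbol}, [symbol]
    while todo:
        for rhs in GRAMMAR[todo.pop()]:
            head = rhs[0]
            if head not in GRAMMAR:
                result.add(head)
            elif head not in seen:
                seen.add(head)
                todo.append(head)
    return result


def closure(kernel):
    """Close a set of (lhs, rhs, dot, lookahead) items."""
    items = set(kernel)
    todo = list(kernel)
    while todo:
        lhs, rhs, dot, look = todo.pop()
        if dot == len(rhs) or rhs[dot] not in GRAMMAR:
            continue  # dot at the end or before a terminal: nothing to expand
        rest = rhs[dot + 1:]
        lookaheads = first(rest[0]) if rest else {look}
        for prod in GRAMMAR[rhs[dot]]:
            for la in lookaheads:
                item = (rhs[dot], prod, 0, la)
                if item not in items:
                    items.add(item)
                    todo.append(item)
    return items


def merged(items):
    """Step 5: collect the lookaheads of items with the same dotted rule."""
    table = {}
    for lhs, rhs, dot, la in items:
        table.setdefault((lhs, rhs, dot), set()).add(la)
    return table


ITEMS = closure({("E'", ("E",), 0, "$")})
for (lhs, rhs, dot), las in sorted(merged(ITEMS).items()):
    body = " ".join(rhs[:dot] + (".",) + rhs[dot:])
    print(f"{lhs} :- {body}  {{{', '.join(sorted(las))}}}")
```

The unmerged closure holds the nine items of step 4; merged groups them into the five lines of step 5.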

3.3.5 Lambda in LR(1)

Building the DFA for LR(1) with Lambda

When using a grammar with lambda productions, the
lookahead set needs to take into account the presence of
nonterminals which can be realised by lambda (the empty string).

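Example grammar:

1  IfStmt'  :- IfStmt $
2  IfStmt   :- if logexp then Block ElseForm
3  Block    :- stmt
4  ElseForm :- else Block
5  ElseForm :-

LR(1) DFA (the part shown on the slide):

S0:   IfStmt' :- . IfStmt                               {$}
      IfStmt  :- . if logexp then Block ElseForm        {$}
      arcs: IfStmt to Sacc, if to S1

Sacc: IfStmt' :- IfStmt .                               {$}

S1:   IfStmt :- if . logexp then Block ElseForm         {$}
      arcs: logexp to S3

S3:   IfStmt :- if logexp . then Block ElseForm         {$}
      arcs: then to S4

S4:   IfStmt :- if logexp then . Block ElseForm         {$}
      Block  :- . stmt                                  {else, $}
      arcs: stmt to S5, Block to S6

S5:   Block :- stmt .                                   {else, $}

S6:   IfStmt   :- if logexp then Block . ElseForm       {$}
      ElseForm :- . else Block                          {$}
      ElseForm :- .                                     {$}
      arcs: else and ElseForm (target states not shown)

In S4 the lookahead set of Block is {else, $}: else is the starter symbol of
ElseForm, and $ is included because ElseForm can be realised by lambda.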
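
The lookahead {else, $} on the Block item can be computed as the starter symbols of the sequence "ElseForm followed by $". The Python sketch below shows one way to do this in the presence of lambda productions; it is a hedged illustration, with nullable_set and first_of_sequence as hypothetical helper names and the empty tuple standing for the lambda right-hand side.

```python
GRAMMAR = {
    "IfStmt": [("if", "logexp", "then", "Block", "ElseForm")],
    "Block": [("stmt",)],
    "ElseForm": [("else", "Block"), ()],  # () is the lambda production
}


def nullable_set():
    """Nonterminals that can be realised by lambda."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for lhs, prods in GRAMMAR.items():
            if lhs not in nullable and any(
                all(sym in nullable for sym in rhs) for rhs in prods
            ):
                nullable.add(lhs)
                changed = True
    return nullable


NULLABLE = nullable_set()


def first_of_sequence(symbols, lookahead, _seen=frozenset()):
    """Terminals that can start `symbols` followed by `lookahead`.

    `lookahead` may be None, meaning "nothing known to follow".
    """
    result = set()
    for sym in symbols:
        if sym not in GRAMMAR:         # a terminal must come next
            result.add(sym)
            return result
        if sym not in _seen:           # guard against left recursion
            for rhs in GRAMMAR[sym]:
                result |= first_of_sequence(rhs, None, _seen | {sym})
        if sym not in NULLABLE:        # cannot vanish: stop here
            return result
    # Every symbol could be lambda, so the lookahead itself can come next.
    if lookahead is not None:
        result.add(lookahead)
    return result


# Block in "then . Block ElseForm {$}": what can follow Block?
print(sorted(first_of_sequence(("ElseForm",), "$")))  # ['$', 'else']
# ElseForm in "Block . ElseForm {$}": nothing follows, so only $.
print(sorted(first_of_sequence((), "$")))             # ['$']
```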

3.3.6 LR(1) Summary

The LR(1) technique is the same as for SLR(1) except:

- Closures: each item now records the lookahead set for
  this item in this state.
- DFA:
  a) the items display the lookahead set for each item.
  b) states can only merge if the lookahead sets on each item
     are identical.
- Parse Table: when placing reduce actions in a row, the
  lookahead set for the item is used, rather than the FOLLOW
  set for the rule's LHS.
- Use of Table: exactly the same as for SLR(1).

Problem with LR(1) parsers

- Not a complete solution: not all shift-reduce or reduce-reduce conflicts are
  resolved by the LR(1) technique.
- A memory-hungry solution: LR(1) parse tables for real grammars might consume
  100s of times more memory than for SLR(1) parsers. E.g.,
      SLR(1):     20 K
      LR(1):   2,000 K
- While for modern computers this is not a problem, in earlier days it was.
- More memory-efficient solutions were looked for → LALR Parsers
