Вы находитесь на странице: 1из 9

University of Southern California

CSCI565 Compiler Design

Homework 2 - Solution

CSCI 565 Compiler Design


Spring 2010 Homework 2 - Solution
Problem 1 [10 points]: Predictive Top-Down Parsing
Explain why a left-recursive grammar cannot be parsed using the predictive top-down parsing algorithms. Answer: The predictive parsing method, either based on a set of recursive functions or on its nonrecursive variant using a table, rest on the premise that whenever a given terminal is seen the algorithm must choose the correct production. This means that for each terminal symbol there must be a unique production that allow the algorithm to see or guess the production correctly. Furthermore, whenever a terminal is seen the corresponding production is expanded as it corresponds to the parse tree being expanded. On a left-recursive grammar there is a possibility that the algorithm will try to expand a production without consuming any of its input. In this case, guessing right does not help as it due to the recursion the tree is continuously being expanded without any advanced in the input symbols. Thus the parsing process never terminates.

Problem 2 [30 points]: Table-based LL(1) Predictive Top-Down Parsing Consider the following CFG G = (N={S, A, B, C, D}, T={a,b,c,d}, P, S) where the set of productions P is given below: SA A BC | DBC B Bb | Cc| Da|d
a) Is this grammar suitable to be parsed using the recursive descendent parsing method? Justify and modify the grammar if needed. b) Compute the FIRST and FOLLOW set of non-terminal symbols of the grammar resulting from your answer in a) c) Construct the corresponding parsing table using the predictive parsing LL method. d) Show the stack contents, the input and the rules used during parsing for the input w = dbb Answers: a) No because it is left-recursive. You can expand B using a production with B as the left-most symbol without consuming any of the input terminal symbols. To eliminate this left recursion we add another non-terminal symbol, B and productions as follows:

1 of 9

University of Southern California

CSCI565 Compiler Design

Homework 2 - Solution

S A A BC | DBC B bB | B bB | C c| D a|d
b) FIRST(S) FIRST(A) FIRST(B) FIRST(B) FIRST(C) FIRST(D) = { a, b, c, d, } = { a, b, c, d, } = { b, } = { b, } = { c, } = { a, d } FOLLOW(S) = { $ } FOLLOW(A) = { $ } FOLLOW(B) = { c, $ } FOLLOW(B) ={ c, $ } FOLLOW(C) = { $ } FOLLOW(D) = { b, c, $ }

Non-terminals A, B, B, C and S are all nullable. c) The parsing table is as shown below:

d) The stack and input are as shown below using the predictive, table-driven parsing algorithm: STACK INPUT RULE/OUTPUT

$S $A $ CBD $ CBd $ CB $ CBb $ CB $ CBb $ CB $C $ $

dbb$ dbb$ dbb$ dbb$ bb$ bb$ b$ b$ $ $ $ $

SA A DBC Dd B bB B bB B C halt or accept

Problem 3 [20 points]: LL parsing using Mutually Recursive Functions


2 of 9

University of Southern California

CSCI565 Compiler Design

Homework 2 - Solution

Consider the following context-free grammar. (It corresponds roughly to the syntax of lists in the programming language LISP.) S is the start symbol, and the terminals are a , ( ).
S() Sa S(A) AS AA,S

(a) Show precisely why this grammar is not LL(1). (Hint: This will require computing some, but not all, of the FIRST and FOLLOW sets.) (b) Rewrite this grammar to make it suitable for recursive descent parsing. (c) On the basis of your revised grammar from part (b), write a recursive procedure S that parses this grammar by recursive descent. Your procedure may be written in C, C++, Java or pseudo-code. You may assume the existence of a global variable token that holds the next input token, a function advance() that reads the next token into token, and a function error that may be called if the input is not in the language generated by the grammar.
Answers:

(a). A grammar is LL(1) if and only if its predictive parsing table has no multiply-defined entries. Consider the right-hand sides of the first and third productions for S. The terminal ( is in FIRST(()) and also in FIRST((A)). Therefore the table entry for the row labeled S and the column labeled ( will have (at least) two entries for these two productions. So the grammar cannot be LL(1). (Note that there was no need to calculate any FOLLOW() sets after all!) (b) This requires removing left-recursion and left-factoring: S ( S Sa S ) S A ) A SA A ,SA A
(c)

Heres C/Java-like code:


void s() { if (token == () { advance(); s1(); } else if (token == a) advance(); else error(); } void s1() { if (token == )) advance(); else { a(); if (token == )) advance(); else error(); }}
3 of 9

University of Southern California

CSCI565 Compiler Design

Homework 2 - Solution

void a() { s(); a1(); }

void a1() { if (token == ,) { advance(); s(); a1(); } }

Problem 4 [40 points]: LR Parsing Algorithm Given the grammar below already augmented with the basic EOF production (0) answer the following questions: (0) (1) (2) (3) (4) (5) (6) (7) a) b) c) d) e) f) g) S Stmts Stmts Stmt Var Var E E Stmts Stmt Stmts Var = id [E id id ( E ) $ ; Stmt E ]

Construct the set of LR(0) items and the DFA capable of recognizing it. Construct the LR(0) parsing table and determine if this grammar is LR(0). Justify. Is the SLR(0) DFA for this grammar the same as the LR(0) DFA? Why? Is this grammar SLR(0)? Justify by constructing its table. Construct the set of LR(1) items and the DFA capable of recognizing it. Construct the LR(1) parsing table and determine if this grammar is LR(1). Justify. How would you derive the LALR(1) parsing table this grammar? What is the difference between this table and the table found in a) above?

Answers:
a) [5 points] Construct the set of LR(0) items and the DFA capable of recognizing it. The figure below depicts the FA that recognizes the set of valid LR(0) items for this grammar.

4 of 9

University of Southern California

CSCI565 Compiler Design

Homework 2 - Solution

b) [5 points] Construct the LR(0) parsing table and determine if this grammar is LR(0). Justify. Based on the DFA above we derive the LR parsing table below where we noted a shift/reduce conflict in state 3. In this state the presence of a [ indicates that the parse can either reduce using the production 5 or shift by advancing to state s6. Note that by reducing it would then be left in a state possible s0 where the presence of the [ would lead to an error. Clearly, this grammar is not suitable for the LR(0) parsing method.
State
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

id
s3 r(1) r(5)

;
s13 r(1) r(5)

=
r(1) r(5) s6

Terminals [ ]
r(1) s6/r(5) r(1) r(5)

(
r(1) r(5)

)
r(1) r(5)

$
s5 r(1) r(5) acc

Stmts
g1

Goto Stmt E
g2

Var
g4

s8 s8 r(6) s8 r(4) r(7) s3 r(2) r(3)

r(6)

r(6)

r(6)

r(6) s11 r(4) r(7) r(2) r(3)

s9 s9 r(6) s9 r(4) r(7) r(2) r(3)

g16 g10 r(6) r(6) g12 r(4) s13 r(7) r(2) r(3) r(4) r(7) g15 g4 r(2) r(3)

r(4) r(7) r(2) r(3)

r(4) r(7) r(2) r(3)

r(4) r(7) r(2) r(3)

5 of 9

University of Southern California

CSCI565 Compiler Design

Homework 2 - Solution

c) [5 points] Is the SLR(1) DFA for this grammar the same as the LR(0) DFA? Why? The same. The states and transitions are the same as only the procedure to build the parse table is different. For this method of construction of the parsing table we include the production reduce A for all terminals a in FOLLOW(A). The table below is the resulting parse table using the SLR table construction algorithm, also known as SLR(1) although it uses the DFA constructed using the LR(0) items. For this specific grammar the FOLLOW set is as shown below: FOLLOW(Stmts) = { $, ; } FOLLOW(Stmt) = { $, ; } FOLLOW(E) = { $, ; , ] , ) } FOLLOW(Var) = { = }
State
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

id
s3

;
s13 r(1)

Terminals [ ]

$
s5 r(1)

Stmts
g1

Goto Stmt E
g2

Var
g4

r(5) s6 s8 s8 r(6) s8

s6 acc s9 s9 r(6) s9 s11 r(6) r(6) g12 g16 g10

r(4) r(7) r(2) r(3) r(7) s13 r(7) r(7) g15 r(2) r(3) g4

Notice that because we have used the FOLLOW of Var to limit the use of the reduction action for this table we have eliminated the shit/reduce conflict in this grammar. d) [5 points] Is this grammar SLR(1)? Justify by constructing its table. As can be seen in state 3 there is no longer a shift/reduce conflict. Essentially a single look-ahead symbol is enough to distinguish a single action to take in any context. e) [10 points] Construct the set of LR(1) items and the DFA capable of recognizing them.

6 of 9

University of Southern California

CSCI565 Compiler Design

Homework 2 - Solution

As can be seen there number of states in this new DFA is much larger when compared to the DFA that recognizes the LR(0) sets of items.

7 of 9

University of Southern California

CSCI565 Compiler Design

Homework 2 - Solution

f) [5 points] Construct the LR(1) parsing table and determine if this grammar is LR(1). Justify.
State
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24

id
s7

;
s3

Terminals [ ]

$
s2 acc

Stmts
g1

Goto Stmt E
g5

Var
g6

s7 r(2) r(1) s8 r(5) s11 s14 s13 r(6) r(3) r(6) r(6) s13 s13 s15 s15 s21 s22 s23 s24 r(7) r(7) r(4) r(7) r(7) s9 s10 s16 s15 r(6) r(3) r(2) r(1)

g4

g6

g12 g20 g17

g18 g19

Clearly, and as with the SLR(1) table construction method there are no conflicts in this parse table and the grammar is therefore LR(1). g) [5 points] How would you derive the LALR(1) parsing table this grammar? What is the difference between this table and the table found in a) above? There are many states with very similar core items that differ only on the look-ahead and can thus be merged as suggested by the procedure to construct LALR parsing tables. There are many states with identical core items thus differing only in the look-ahead and can thus be merged as suggested by the procedure to construct LALR parsing tables. For instance states {s10, s15, s16} could be merged into a single state named s40. The same is true for states in the sets s41={s17, s18, s19}, s42={s21, s22, s23} and s43={s11, 13, s14} thus substantially reducing the number of states as it will be seen in the next point in this exercise. The table below reflects these merge operations resulting in a much smaller table which is LALR as there are no conflicts due to the merging of states with the same core items.

8 of 9

University of Southern California

CSCI565 Compiler Design

Homework 2 - Solution

State
0 1 2 3 4 5 6 7 8 9 40 41 42 43

id
s7

;
s3

Terminals [ ]

$
s2 acc

Stmts
g1

Goto Stmt E
g5

Var
g6

s7 r(2) r(1) s8 r(5) s43 s43 s43 r(7) r(6) r(7) r(6) s9 s40 s40 s40 s42 r(7) r(6) r(1)

g4

g6

g12 g20 g41

r(6)

After this state simplification we get the DFA below. This DFA is identical to the first DFA found using the LR(0) items but the additional information on the look-ahead token allows for a better table parsing construction method that does not have shift/reduce conflicts.

9 of 9

Вам также может понравиться