Ambiguity, LL1 Grammars and Table-driven Parsing

Problems with Grammars
Not all grammars are usable! A grammar may be ambiguous, contain unproductive non-terminals, or contain unreachable rules.

Ambiguous Grammar

G = { E → D | ( E ) | E + E | E - E | E * E | E / E ,
      D → 0 | 1 | … | 9 }

G is ambiguous if there exists a sentence S in L(G) such that there are two different parse trees for S.

Example: 1 + 2 * 3 has two parse trees. One has * at the root, with 1 + 2 as its left subtree; the other has + at the root, with 2 * 3 as its right subtree.

Multiple parse trees mean multiple meanings:
Precedence: (1+2)*3 ≠ 1+(2*3)
Associativity: (1-2)-3 ≠ 1-(2-3)
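The definition of ambiguity can be checked by brute force on short inputs. The following is a minimal sketch, not part of the original slides: it enumerates every parse tree for a digit-and-operator string under E → D | E op E (parentheses are left out to keep it short), and the names `parse_trees` and `evaluate` are made up for illustration.

```python
import operator
from functools import lru_cache

# Binary operators of the ambiguous grammar E -> D | E op E.
APPLY = {"+": operator.add, "-": operator.sub,
         "*": operator.mul, "/": operator.truediv}

def parse_trees(text):
    """Return every parse tree of `text`, one single-character token per symbol."""
    tokens = tuple(text)

    @lru_cache(maxsize=None)
    def trees(lo, hi):
        # All trees that derive tokens[lo:hi] from E.
        if hi - lo == 1:
            return (tokens[lo],) if tokens[lo].isdigit() else ()
        found = []
        for mid in range(lo + 1, hi - 1):
            if tokens[mid] in APPLY:  # try this operator as the root
                for left in trees(lo, mid):
                    for right in trees(mid + 1, hi):
                        found.append((tokens[mid], left, right))
        return tuple(found)

    return list(trees(0, len(tokens)))

def evaluate(tree):
    """Evaluate a tree built by parse_trees."""
    if isinstance(tree, str):
        return int(tree)
    op, left, right = tree
    return APPLY[op](evaluate(left), evaluate(right))

for tree in parse_trees("1+2*3"):
    print(tree, "=", evaluate(tree))
```

Two trees come back for 1+2*3, and they evaluate to different values, which is exactly the "multiple meanings" problem.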

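Fixing Precedence Ambiguity

Ambiguous grammar:
G = { E → D | ( E ) | E + E | E - E | E * E | E / E , D → 0 | 1 | … | 9 }

Observe:
Operators lower in the parse tree are executed first.
Operators executed first have higher precedence.

Fix: Introduce a new non-terminal symbol for each precedence level.

E → T | E + T | E - T
T → F | T * F | T / F
F → D | ( E )
D → 0 | 1 | … | 9

With this grammar, 1 + 2 * 3 has a single parse tree: the root E expands to E + T, the left E derives 1 (E → T → F → D → 1), and the T expands to T * F, which derives 2 * 3. The * sits lower in the tree than +, so it is executed first.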

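Adding the Power Operator

Before:
E → T | E + T | E - T
T → F | T * F | T / F
F → D | ( E )
D → 0 | 1 | … | 9

After (a new non-terminal P for the new, higher precedence level; ^ denotes the power operator):
E → T | E + T | E - T
T → P | T * P | T / P
P → F | F ^ P
F → D | ( E )
D → 0 | 1 | … | 9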

Fixing Associative Ambiguity
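Left recursion gives left associativity:
E → D | E - D
3 - 2 - 1 = (3 - 2) - 1
Parse tree: the root E expands to E - D with D = 1; the inner E expands to E - D with D = 2; the innermost E derives D = 3.

Right recursion gives right associativity (^ denotes the power operator):
E → D | D ^ E
2 ^ 3 ^ 2 = 2 ^ (3 ^ 2)
Parse tree: the root E expands to D ^ E with D = 2; the inner E expands to D ^ E with D = 3; the innermost E derives D = 2.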
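To see precedence levels and associativity working together, here is a small evaluator sketch, written under a few assumptions rather than taken from the slides. It uses one function per non-terminal (E, T, P, F) and the symbol `^` for the power operator. A hand-written recursive-descent parser cannot call itself on a left-recursive rule, so the left-recursive rules are replaced by loops that accumulate from the left, which yields the same left-associative grouping. The right-recursive rule P → F | F ^ P is coded as direct recursion.

```python
def evaluate(text):
    """Evaluate single-digit arithmetic with +, -, *, /, ^ and parentheses."""
    tokens = [c for c in text if not c.isspace()]
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def expr():  # E -> T | E + T | E - T  (loop = left associativity)
        value = term()
        while peek() in ("+", "-"):
            if advance() == "+":
                value += term()
            else:
                value -= term()
        return value

    def term():  # T -> P | T * P | T / P
        value = power()
        while peek() in ("*", "/"):
            if advance() == "*":
                value *= power()
            else:
                value /= power()
        return value

    def power():  # P -> F | F ^ P  (recursion on the right = right associativity)
        base = factor()
        if peek() == "^":
            advance()
            return base ** power()
        return base

    def factor():  # F -> D | ( E )
        tok = peek()
        if tok == "(":
            advance()
            value = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return value
        if tok is None or not tok.isdigit():
            raise SyntaxError(f"unexpected token {tok!r}")
        advance()
        return int(tok)

    result = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return result

print(evaluate("1+2*3"), evaluate("3-2-1"), evaluate("2^3^2"))
```

The three printed values correspond to 1+(2*3), (3-2)-1 and 2^(3^2).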

Unreachable Rules

G = { S → aABb , A → a | aA , B → b | bBD , C → cD , D → e }

1. Initialize the set of reachable non-terminals R with the start symbol.
2. For each round, if R includes the lhs of a production rule, add the non-terminals in the rhs to R.
3. Loop on #2 until there are no changes to R.
4. The unreachable rules are those whose lhs is a non-terminal in VN minus R.

Initialize: R = {S}
Round 1: R = {S, A, B}
Round 2: R = {S, A, B, D}
Round 3: R = {S, A, B, D}
Done, no change: VN - {S, A, B, D} = {C}, so C → cD is unreachable.

This is a least-fixed point algorithm.
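The reachability steps can be sketched in a few lines. This is an illustrative version under assumptions of my own: the grammar is a dictionary from a non-terminal to its list of right-hand sides, every symbol is a single character, and the function name `unreachable_nonterminals` is hypothetical. The loop may add symbols slightly faster than the round-by-round trace, but it stops at the same fixed point.

```python
def unreachable_nonterminals(grammar, start):
    """Return the non-terminals that cannot be reached from `start`."""
    reachable = {start}
    changed = True
    while changed:  # repeat until a full pass adds nothing
        changed = False
        for lhs in list(reachable):  # snapshot, since the set grows inside
            for rhs in grammar.get(lhs, []):
                for symbol in rhs:
                    if symbol in grammar and symbol not in reachable:
                        reachable.add(symbol)
                        changed = True
    return set(grammar) - reachable

GRAMMAR = {
    "S": ["aABb"],
    "A": ["a", "aA"],
    "B": ["b", "bBD"],
    "C": ["cD"],
    "D": ["e"],
}

print(unreachable_nonterminals(GRAMMAR, "S"))
```

Every rule whose left-hand side is in the returned set can be deleted without changing the language.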

Unproductive Non-terminals
G = { S → aABb , A → bC , B → d | dB , C → eC }

C never produces a string of all terminals: C ⇒ eC ⇒ eeC ⇒ … ⇒ e^n C
A also, because it always produces C: A ⇒ bC ⇒ beC ⇒ … ⇒ be^n C
S also, because it always produces A: S ⇒ aABb ⇒ aAdb ⇒ abCdb ⇒ …

1. Start with the set of terminals T.
2. For each round, if T covers a rhs of a production rule, add the lhs to T.
3. Loop on #2 until there are no changes to T.
4. The alphabet of terminals and non-terminals, V, minus T is the set of unproductive non-terminals.

Initialize: T = {a, b, d, e}
Round 1: T = {a, b, d, e, B}
Round 2: T = {a, b, d, e, B}
Done, no change: {a, b, d, e, A, B, C, S} - T = {A, C, S}

This is a least-fixed point algorithm.
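The productivity check follows the same fixed-point pattern. A minimal sketch, again assuming single-character symbols and a dictionary-based grammar; the function name `unproductive_nonterminals` is made up for this example.

```python
def unproductive_nonterminals(grammar, terminals):
    """Return the non-terminals that can never derive a string of terminals."""
    covered = set(terminals)  # the set T from the steps above
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            if lhs in covered:
                continue
            # lhs is productive if some rhs consists only of covered symbols
            if any(all(symbol in covered for symbol in rhs) for rhs in alternatives):
                covered.add(lhs)
                changed = True
    return set(grammar) - covered

GRAMMAR = {
    "S": ["aABb"],
    "A": ["bC"],
    "B": ["d", "dB"],
    "C": ["eC"],
}

print(unproductive_nonterminals(GRAMMAR, "abde"))
```

Only B is ever added to the covered set, so A, C and S are reported as unproductive.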

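Top-down Parsing with Backtracking

E → N | OEE
O → + | - | * | /
N → 0 | 1 | 2 | 3 | 4

Input: *+342

Starting from E, the parser tries each alternative in turn (N, then OEE; for O: +, then -, then *, …) and backtracks whenever the chosen alternative does not match the input.

Prefix expressions associate an operator with the next two operands.
E.g., *+324 = (3+2)*4 and *2+34 = 2*(3+4)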
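Backtracking can be sketched with a generator that yields every position at which a sequence of grammar symbols can finish matching. This is an illustration under my own assumptions (single-character symbols, ASCII `-` for minus, and the hypothetical names `derive` and `accepts`), not the procedure from the slides. When an alternative yields nothing, the loop simply moves on to the next one, which is the backtracking step.

```python
GRAMMAR = {
    "E": ["N", "OEE"],
    "O": ["+", "-", "*", "/"],
    "N": ["0", "1", "2", "3", "4"],
}

def derive(symbols, text, pos):
    """Yield every end position reachable by matching `symbols` against text[pos:]."""
    if not symbols:
        yield pos
        return
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:
        # Non-terminal: try each production in order.
        for alternative in GRAMMAR[head]:
            yield from derive(alternative + rest, text, pos)
    elif pos < len(text) and text[pos] == head:
        # Terminal that matches the input: consume it and continue.
        yield from derive(rest, text, pos + 1)
    # Otherwise: dead end, nothing is yielded and the caller tries its next alternative.

def accepts(text):
    """True if the whole of `text` can be derived from E."""
    return any(end == len(text) for end in derive("E", text, 0))

print(accepts("*+342"), accepts("*+34"))
```

The sketch terminates because the grammar has no left recursion; it also shows why backtracking is wasteful, since many alternatives are tried only to be discarded.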

LL(1) Parsers

Problem: A backtracking parser never knows which production to try (and is very inefficient).

Solution: An LL parser parses the input from Left to right and constructs a Leftmost derivation of the sentence. An LL(k) parser uses k tokens of look-ahead.

LL(1) parsers: Somewhat restrictive, BUT they only need the current non-terminal and the next token to make a parsing decision. LL(1) parsers require LL(1) grammars.

Simple LL(1) Grammars

All rules have the form:

A → a1α1 | a2α2 | … | anαn

where
ai (1 ≤ i ≤ n) is a terminal,
ai ≠ aj for i ≠ j, and
αi (1 ≤ i ≤ n) is a sequence of terminals and non-terminals, or is empty.

Creating Simple LL(1) Grammars
Why is this not a simple LL(1) grammar?

E → N | OEE
O → + | - | * | /
N → 0 | 1 | 2 | 3 | 4

The alternatives of E begin with the non-terminals N and O rather than with distinct terminals.

How can we change it to simple LL(1)? By making all production rules of the form:

A → a1α1 | a2α2 | … | anαn

Thus,

E → 0 | 1 | 2 | 3 | 4 | +EE | -EE | *EE | /EE

LL(1) Parsing

E → (1)0 | (2)1 | (3)2 | (4)3 | (5)4 | (6)+EE | (7)-EE | (8)*EE | (9)/EE

[Figure: parse trees built top-down with the numbered productions above; one example input ends in "Success!" and another in "Fail!".]

Simple LL(1) Parse Table
A parse table is defined as follows:

ACTION: (V ∪ {#}) × (VT ∪ {#}) → {(β, i), pop, accept, error}

where β is the right side of production number i, and # marks the end of the input string (# ∉ V).

If A ∈ (V ∪ {#}) is the symbol on top of the stack and a ∈ (VT ∪ {#}) is the current input symbol, then ACTION(A, a) =

pop        if A = a for a ∈ VT
accept     if A = # and a = #
(aα, i)    if A → aα is the ith production; this means "pop, then push aα and output i"
error      otherwise

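Simple LL(1) Parse Table Example

E → (1)0 | (2)1 | (3)2 | (4)3 | (5)+EE | (6)*EE

Rows are indexed by V ∪ {#} (top of stack), columns by VT ∪ {#} (current input symbol).

       0       1       2       3       +         *         #
E      (0,1)   (1,2)   (2,3)   (3,4)   (+EE,5)   (*EE,6)
0      pop
1              pop
2                      pop
3                              pop
+                                      pop
*                                                pop
#                                                          accept

All blank entries are error.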

Parse Table Execution: *+123

Grammar: E → (1)0 | (2)1 | (3)2 | (4)3 | (5)+EE | (6)*EE
The Input column shows the part of the input that has not been consumed yet.

Action                                   Stack    Input     Output
Initialize                               E#       *+123#
ACTION(E,*) = Replace [E,*EE], Out 6     *EE#     *+123#    6
ACTION(*,*) = pop(*,*)                   EE#      +123#     6
ACTION(E,+) = Replace [E,+EE], Out 5     +EEE#    +123#     65
ACTION(+,+) = pop(+,+)                   EEE#     123#      65
ACTION(E,1) = Replace [E,1], Out 2       1EE#     123#      652
ACTION(1,1) = pop(1,1)                   EE#      23#       652
ACTION(E,2) = Replace [E,2], Out 3       2E#      23#       6523
ACTION(2,2) = pop(2,2)                   E#       3#        6523
ACTION(E,3) = Replace [E,3], Out 4       3#       3#        65234
ACTION(3,3) = pop(3,3)                   #        #         65234
ACTION(#,#) = accept                     Done!
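The execution above is mechanical enough to write down as a short driver loop. The following is a minimal sketch for the simple LL(1) grammar E → (1)0 | (2)1 | (3)2 | (4)3 | (5)+EE | (6)*EE; the names `PRODUCTIONS`, `TABLE` and `parse` are my own, and the stack is a Python list whose last element is the top.

```python
# Numbered right-hand sides; every production has E on the left.
PRODUCTIONS = {1: "0", 2: "1", 3: "2", 4: "3", 5: "+EE", 6: "*EE"}
TERMINALS = set("0123+*")

# In a simple LL(1) grammar each rhs starts with a distinct terminal,
# so the table entry for (E, a) is the production whose rhs starts with a.
TABLE = {("E", rhs[0]): (rhs, number) for number, rhs in PRODUCTIONS.items()}

def parse(text):
    """Return the list of production numbers used, or raise SyntaxError."""
    stack = ["#", "E"]   # '#' marks the bottom of the stack
    tokens = text + "#"  # '#' marks the end of the input
    pos = 0
    output = []
    while True:
        top, current = stack[-1], tokens[pos]
        if top == "#" and current == "#":
            return output  # accept
        if top in TERMINALS and top == current:
            stack.pop()  # pop: matched a terminal, move to the next token
            pos += 1
        elif (top, current) in TABLE:
            rhs, number = TABLE[(top, current)]
            stack.pop()
            stack.extend(reversed(rhs))  # push rhs so its first symbol is on top
            output.append(number)
        else:
            raise SyntaxError(f"error at position {pos}: top={top!r}, input={current!r}")

print(parse("*+123"))
```

Each iteration looks only at the top of the stack and the current token, so no backtracking is ever needed.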

Relaxing Simple LL(1) Restrictions
Consider the following grammar:

E → (1)N | (2)OEE
O → (3)+ | (4)*
N → (5)0 | (6)1 | (7)2 | (8)3

Not simple LL(1): rules (1) and (2) begin with non-terminals.

However:
N leads only to {0, 1, 2, 3}
O leads only to {+, *}
{0, 1, 2, 3} ∩ {+, *} = ∅

So we can distinguish between rules (1) and (2):
If we see 0, 1, 2, or 3, we choose (1).
If we see + or *, we choose (2).

LL(1) Grammars

For any α, define FIRST(α) = { a | α ⇒* aβ and a ∈ VT }.

A grammar is LL(1) if, for all rules of the form A → α1 | α2 | … | αn, FIRST(αi) ∩ FIRST(αj) = ∅ for i ≠ j (i.e., the sets FIRST(α1), FIRST(α2), …, and FIRST(αn) are pairwise disjoint).
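The FIRST definition and the pairwise-disjointness test can be tried out directly. This is a sketch under the same restriction as the definition above: no empty productions, so FIRST(α) depends only on the first symbol of α. Symbols are single characters, and the names `first` and `is_ll1` are hypothetical.

```python
from itertools import combinations

def first(alpha, grammar, _seen=frozenset()):
    """FIRST(alpha) for a non-empty symbol string, assuming no empty productions."""
    head = alpha[0]
    if head not in grammar:
        return {head}  # a terminal is its own FIRST set
    if head in _seen:
        return set()   # already being expanded (left recursion): adds nothing new
    result = set()
    for rhs in grammar[head]:
        result |= first(rhs, grammar, _seen | {head})
    return result

def is_ll1(grammar):
    """True if the alternatives of every non-terminal have pairwise disjoint FIRST sets."""
    for alternatives in grammar.values():
        for left, right in combinations(alternatives, 2):
            if first(left, grammar) & first(right, grammar):
                return False
    return True

PREFIX = {"E": ["N", "OEE"], "O": ["+", "*"], "N": ["0", "1", "2", "3"]}
INFIX = {"E": ["D", "E+E"], "D": ["0", "1"]}

print(first("N", PREFIX), first("OEE", PREFIX))
print(is_ll1(PREFIX), is_ll1(INFIX))
```

The prefix grammar passes the test, while the infix grammar fails because both alternatives of E can start with a digit.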

LL(1) Parse Table

E → (1)N | (2)OEE
O → (3)+ | (4)*
N → (5)0 | (6)1 | (7)2 | (8)3

For (A, a), we select (α, i) if a ∈ FIRST(α) and α is the right hand side of rule i.

Rows are indexed by V ∪ {#} (top of stack), columns by VT ∪ {#} (current input symbol).

               +         *         0       1       2       3       #
E              (OEE,2)   (OEE,2)   (N,1)   (N,1)   (N,1)   (N,1)
O              (+,3)     (*,4)
N                                  (0,5)   (1,6)   (2,7)   (3,8)
+,*,0,1,2,3    pop when the input symbol equals the stack symbol
#                                                                  accept

Parse Table Execution Revisited: *+123

Grammar: E → (1)N | (2)OEE, O → (3)+ | (4)*, N → (5)0 | (6)1 | (7)2 | (8)3
The Input column shows the part of the input that has not been consumed yet.

Action                                   Stack    Input     Output
Initialize                               E#       *+123#
ACTION(E,*) = Replace [E,OEE], Out 2     OEE#     *+123#    2
ACTION(O,*) = Replace [O,*], Out 4       *EE#     *+123#    24
ACTION(*,*) = pop(*,*)                   EE#      +123#     24
ACTION(E,+) = Replace [E,OEE], Out 2     OEEE#    +123#     242
ACTION(O,+) = Replace [O,+], Out 3       +EEE#    +123#     2423
ACTION(+,+) = pop(+,+)                   EEE#     123#      2423
ACTION(E,1) = Replace [E,N], Out 1       NEE#     123#      24231
ACTION(N,1) = Replace [N,1], Out 6       1EE#     123#      242316
ACTION(1,1) = pop(1,1)                   EE#      23#       242316
ACTION(E,2) = Replace [E,N], Out 1       NE#      23#       2423161
ACTION(N,2) = Replace [N,2], Out 7       2E#      23#       24231617
ACTION(2,2) = pop(2,2)                   E#       3#        24231617
ACTION(E,3) = Replace [E,N], Out 1       N#       3#        242316171
ACTION(N,3) = Replace [N,3], Out 8       3#       3#        2423161718
ACTION(3,3) = pop(3,3)                   #        #         2423161718
ACTION(#,#) = accept                     Done!
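For the general LL(1) case, the table can be filled from FIRST sets and then driven by the same kind of stack loop. The sketch below assumes single-character symbols, no empty productions and no left recursion (the `first` helper would not terminate on a left-recursive grammar); the names `RULES`, `build_table` and `parse` are illustrative only.

```python
# Rule i is RULES[i - 1], numbered as in the grammar above.
RULES = [("E", "N"), ("E", "OEE"), ("O", "+"), ("O", "*"),
         ("N", "0"), ("N", "1"), ("N", "2"), ("N", "3")]
NONTERMINALS = {lhs for lhs, _ in RULES}

def first(alpha):
    """FIRST(alpha); assumes no empty productions and no left recursion."""
    head = alpha[0]
    if head not in NONTERMINALS:
        return {head}
    result = set()
    for lhs, rhs in RULES:
        if lhs == head:
            result |= first(rhs)
    return result

def build_table():
    """Map (non-terminal, terminal) to (rhs, rule number) using FIRST sets."""
    table = {}
    for number, (lhs, rhs) in enumerate(RULES, start=1):
        for terminal in first(rhs):
            if (lhs, terminal) in table:
                raise ValueError("grammar is not LL(1)")
            table[(lhs, terminal)] = (rhs, number)
    return table

def parse(text):
    """Return the sequence of rule numbers output while parsing `text`."""
    table = build_table()
    stack = ["#", "E"]   # last element is the top of the stack
    tokens = text + "#"
    pos = 0
    output = []
    while True:
        top, current = stack[-1], tokens[pos]
        if top == "#" and current == "#":
            return output  # accept
        if top not in NONTERMINALS and top != "#" and top == current:
            stack.pop()    # pop a matched terminal
            pos += 1
        elif (top, current) in table:
            rhs, number = table[(top, current)]
            stack.pop()
            stack.extend(reversed(rhs))
            output.append(number)
        else:
            raise SyntaxError(f"error at position {pos}: top={top!r}, input={current!r}")

print(parse("*+123"))
```

Compared with the simple LL(1) driver, the only change is how the table is filled; the parsing loop itself is the same.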

What does 2 4 2 3 1 6 1 7 1 8 mean?

E → (1)N | (2)OEE
O → (3)+ | (4)*
N → (5)0 | (6)1 | (7)2 | (8)3

The output defines a parse tree via a preorder traversal:

E (2) OEE
    O (4) *
    E (2) OEE
        O (3) +
        E (1) N
            N (6) 1
        E (1) N
            N (7) 2
    E (1) N
        N (8) 3
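Because the output is a preorder listing of the productions used, the parse tree can be rebuilt from the numbers alone. A minimal sketch, assuming the rule numbering above; the names `build_tree` and `leaves` and the tuple layout (symbol, rule number, children) are choices made for this example.

```python
RULES = {1: ("E", "N"), 2: ("E", "OEE"), 3: ("O", "+"), 4: ("O", "*"),
         5: ("N", "0"), 6: ("N", "1"), 7: ("N", "2"), 8: ("N", "3")}
NONTERMINALS = {"E", "O", "N"}

def build_tree(output):
    """Rebuild the parse tree from a preorder list of rule numbers."""
    numbers = iter(output)

    def expand(symbol):
        if symbol not in NONTERMINALS:
            return symbol  # terminals are leaves
        number = next(numbers, None)
        if number is None:
            raise ValueError("output ended too early")
        lhs, rhs = RULES[number]
        if lhs != symbol:
            raise ValueError(f"rule {number} does not expand {symbol}")
        # Children are expanded left to right, which is preorder.
        return (symbol, number, [expand(child) for child in rhs])

    tree = expand("E")
    if next(numbers, None) is not None:
        raise ValueError("unused rule numbers at the end of the output")
    return tree

def leaves(tree):
    """Concatenate the leaves of the tree, left to right."""
    if isinstance(tree, str):
        return tree
    return "".join(leaves(child) for child in tree[2])

tree = build_tree([2, 4, 2, 3, 1, 6, 1, 7, 1, 8])
print(leaves(tree))
```

Reading the leaves from left to right gives back the parsed sentence, which is a quick sanity check that the output sequence and the tree agree.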
