Download presentation
Presentation is loading. Please wait.
1
COS 320 Compilers David Walker
2
last time context free grammars (Appel 3.1) –terminals, non-terminals, rules –derivations & parse trees –ambiguous grammars recursive descent parsers (Appel 3.2) –parse LL(k) grammars –easy to write as ML programs –algorithms for automatic construction from a CFG
3
1. S ::= IF E THEN S ELSE S 2. | BEGIN S L 3. | PRINT E 4. L ::= END 5. | ; S L 6. E ::= NUM = NUM non-terminals: S, E, L terminals: NUM, IF, THEN, ELSE, BEGIN, END, PRINT, ;, = rules: fun S () = case !tok of IF => eat IF; E (); eat THEN; S (); eat ELSE; S () | BEGIN => eat BEGIN; S (); L () | PRINT => eat PRINT; E () and L () = case !tok of END => eat END | SEMI => eat SEMI; S (); L () and E () = eat NUM; eat EQ; eat NUM val tok = ref (getToken ()) fun advance () = tok := getToken () fun eat t = if (! tok = t) then advance () else error () datatype token = NUM | IF | THEN | ELSE | BEGIN | END | PRINT | SEMI | EQ
4
Constructing RD Parsers To construct an RD parser, we need to know what rule to apply when –we have seen a non terminal X –we see the next terminal a in input We apply rule X ::= s when –a is the first symbol that can be generated by string s, OR –s reduces to the empty string (is nullable) and a is the first symbol in any string that can follow X
5
Computing Nullable Sets Non-terminal X is Nullable only if the following constraints are satisfied (computed using iterative analysis) –base case: if (X := ) then X is Nullable –iterative/inductive case: if (X := ABC...) and A, B, C,... are all Nullable then X is Nullable
6
Computing First Sets First(X) is computed iteratively –base case: if T is a terminal symbol then First (T) := {T} Otherwise First (T) := { } –iterative/inductive case: if X is a non-terminal and (X:= ABC...) then –First (X) := First (X) U First (ABC...) where First(ABC...) = F1 U F2 U F3 U... and »F1 = First (A) »F2 = First (B), if A is Nullable »F3 = First (C), if A is Nullable & B is Nullable »...
7
Computing Follow Sets Follow(X) is computed iteratively –base case: initially, we assume nothing in particular follows X –Follow (X) := { } for all X –inductive case: if (Y := s1 X s2) for any strings s1, s2 then –Follow (X) := First (s2) U Follow (X) if (Y := s1 X s2) for any strings s1, s2 then –Follow (X) := Follow(Y) U Follow (X), if s2 is Nullable
8
building a predictive parser Z ::= X Y Z Z ::= d Y ::= c Y ::= X ::= a X ::= b Y e nullablefirstfollow Z Y X
9
building a predictive parser Z ::= X Y Z Z ::= d Y ::= c Y ::= X ::= a X ::= b Y e nullablefirstfollow Zno Yyes Xno base case
10
building a predictive parser Z ::= X Y Z Z ::= d Y ::= c Y ::= X ::= a X ::= b Y e nullablefirstfollow Zno Yyes Xno after one round of induction, we realize we have reached a fixed point
11
building a predictive parser Z ::= X Y Z Z ::= d Y ::= c Y ::= X ::= a X ::= b Y e nullablefirstfollow Zno{ } Yyes{ } Xno{ } base case
12
building a predictive parser Z ::= X Y Z Z ::= d Y ::= c Y ::= X ::= a X ::= b Y e nullablefirstfollow Znod Yyesc Xnoa,b round 1, no fixed point
13
building a predictive parser Z ::= X Y Z Z ::= d Y ::= c Y ::= X ::= a X ::= b Y e nullablefirstfollow Znod,a,b Yyesc Xnoa,b round 2, no fixed point
14
building a predictive parser Z ::= X Y Z Z ::= d Y ::= c Y ::= X ::= a X ::= b Y e nullablefirstfollow Znod,a,b Yyesc Xnoa,b round 3, no more changes ==> fixed point
15
building a predictive parser Z ::= X Y Z Z ::= d Y ::= c Y ::= X ::= a X ::= b Y e nullablefirstfollow Znod,a,b{ } Yyesc{ } Xnoa,b{ } base case
16
building a predictive parser Z ::= X Y Z Z ::= d Y ::= c Y ::= X ::= a X ::= b Y e nullablefirstfollow Znod,a,b{ } Yyesce,d,a,b Xnoa,bc,d,a,b after one round of induction, no fixed point
17
building a predictive parser Z ::= X Y Z Z ::= d Y ::= c Y ::= X ::= a X ::= b Y e nullablefirstfollow Znod,a,b{ } Yyesce,d,a,b Xnoa,bc,d,a,b after two rounds, fixed point (but notice, computing Follow(X) before Follow (Y) would have required 3 rd round)
18
Z ::= X Y Z Z ::= d Y ::= c Y ::= X ::= a X ::= b Y e nullablefirstfollow Znod,a,b{ } Yyesce,d,a,b Xnoa,bc,d,a,b Grammar: Computed Sets: Build parsing table where row X, col T tells parser which clause to execute in function X with next-token T: abcde Z Y X if T First(s) then enter (X ::= s) in row X, col T if s is Nullable and T Follow(X) enter (X ::= s) in row X, col T
19
Z ::= X Y Z Z ::= d Y ::= c Y ::= X ::= a X ::= b Y e nullablefirstfollow Znod,a,b{ } Yyesce,d,a,b Xnoa,bc,e,d,a,b Grammar: Computed Sets: Build parsing table where row X, col T tells parser which clause to execute in function X with next-token T: abcde ZZ ::= XYZ Y X if T First(s) then enter (X ::= s) in row X, col T if s is Nullable and T Follow(X) enter (X ::= s) in row X, col T
20
Z ::= X Y Z Z ::= d Y ::= c Y ::= X ::= a X ::= b Y e nullablefirstfollow Znod,a,b{ } Yyesce,d,a,b Xnoa,bc,e,d,a,b Grammar: Computed Sets: Build parsing table where row X, col T tells parser which clause to execute in function X with next-token T: abcde ZZ ::= XYZ Z ::= d Y X if T First(s) then enter (X ::= s) in row X, col T if s is Nullable and T Follow(X) enter (X ::= s) in row X, col T
21
Z ::= X Y Z Z ::= d Y ::= c Y ::= X ::= a X ::= b Y e nullablefirstfollow Znod,a,b{ } Yyesce,d,a,b Xnoa,bc,e,d,a,b Grammar: Computed Sets: Build parsing table where row X, col T tells parser which clause to execute in function X with next-token T: abcde ZZ ::= XYZ Z ::= d YY ::= c X if T First(s) then enter (X ::= s) in row X, col T if s is Nullable and T Follow(X) enter (X ::= s) in row X, col T
22
Z ::= X Y Z Z ::= d Y ::= c Y ::= X ::= a X ::= b Y e nullablefirstfollow Znod,a,b{ } Yyesce,d,a,b Xnoa,bc,e,d,a,b Grammar: Computed Sets: Build parsing table where row X, col T tells parser which clause to execute in function X with next-token T: abcde ZZ ::= XYZ Z ::= d YY ::= Y ::= cY ::= X if T First(s) then enter (X ::= s) in row X, col T if s is Nullable and T Follow(X) enter (X ::= s) in row X, col T
23
Z ::= X Y Z Z ::= d Y ::= c Y ::= X ::= a X ::= b Y e nullablefirstfollow Znod,a,b{ } Yyesce,d,a,b Xnoa,bc,e,d,a,b Grammar: Computed Sets: Build parsing table where row X, col T tells parser which clause to execute in function X with next-token T: abcde ZZ ::= XYZ Z ::= d YY ::= Y ::= cY ::= XX ::= aX ::= b Y e if T First(s) then enter (X ::= s) in row X, col T if s is Nullable and T Follow(X) enter (X ::= s) in row X, col T
24
Z ::= X Y Z Z ::= d Y ::= c Y ::= X ::= a X ::= b Y e nullablefirstfollow Znod,a,b{ } Yyesce,d,a,b Xnoa,bc,e,d,a,b Grammar: Computed Sets: What are the blanks? abcde ZZ ::= XYZ Z ::= d YY ::= Y ::= cY ::= XX ::= aX ::= b Y e
25
Z ::= X Y Z Z ::= d Y ::= c Y ::= X ::= a X ::= b Y e nullablefirstfollow Znod,a,b{ } Yyesce,d,a,b Xnoa,bc,e,d,a,b Grammar: Computed Sets: What are the blanks? --> syntax errors abcde ZZ ::= XYZ Z ::= d YY ::= Y ::= cY ::= XX ::= aX ::= b Y e
26
Z ::= X Y Z Z ::= d Y ::= c Y ::= X ::= a X ::= b Y e nullablefirstfollow Znod,a,b{ } Yyesce,d,a,b Xnoa,bc,e,d,a,b Grammar: Computed Sets: Is it possible to put 2 grammar rules in the same box? abcde ZZ ::= XYZ Z ::= d YY ::= Y ::= cY ::= XX ::= aX ::= b Y e
27
Z ::= X Y Z Z ::= d Z ::= d e Y ::= c Y ::= X ::= a X ::= b Y e nullablefirstfollow Znod,a,b{ } Yyesce,d,a,b Xnoa,bc,e,d,a,b Grammar: Computed Sets: Is it possible to put 2 grammar rules in the same box? abcde ZZ ::= XYZ Z ::= d Z ::= d e YY ::= Y ::= cY ::= XX ::= aX ::= b Y e
28
predictive parsing tables if a predictive parsing table constructed this way contains no duplicate entries, the grammar is called LL(1) –Left-to-right parse, Left-most derivation, 1 symbol lookahead if not, of the grammar is not LL(1) in LL(k) parsing table, columns include every k- length sequence of terminals: aaabbabbacca...
29
another trick Previously, we saw that grammars with left-recursion were problematic, but could be transformed into LL(1) in some cases the example non-LL(1) grammar we just saw: how do we fix it? Z ::= X Y Z Z ::= d Z ::= d e Y ::= c Y ::= X ::= a X ::= b Y e
30
another trick Previously, we saw that grammars with left-recursion were problematic, but could be transformed into LL(1) in some cases the example non-LL(1) grammar we just saw: solution here is left-factoring: Z ::= X Y Z Z ::= d Z ::= d e Y ::= c Y ::= X ::= a X ::= b Y e Z ::= X Y Z Z ::= d W Y ::= c Y ::= X ::= a X ::= b Y e W ::= W ::= e
31
summary of RD parsing CFGs are good at specifying programming language structure parsing general CFGs is expensive so we define parsers for simpler classes of CFG –LL(k), LR(k) we can build a recursive descent parser for LL(k) grammars by: –computing nullable, first and follow sets –constructing a parse table from the sets –checking for duplicate entries, which indicates failure –creating an ML program from the parse table if parser construction fails we can –rewrite the grammar (left factoring, eliminating left recursion) and try again –try to build a parser using some other method
32
summary of RD parsing CFGs are good at specifying programming language structure parsing general CFGs is expensive so we define parsers for simpler classes of CFG –LL(k), LR(k) we can build a recursive descent parser for LL(k) grammars by: –computing nullable, first and follow sets –constructing a parse table from the sets –checking for duplicate entries, which indicates failure –creating an ML program from the parse table if parser construction fails we can –rewrite the grammar (left factoring, eliminating left recursion) and try again –try to build a parser using some other method...such as using a bottom- up parsing technique
33
Bottom-up (Shift-Reduce) Parsing
34
shift-reduce parsing –aka: bottom-up parsing –aka: LR(k) Left-to-right parse, Rightmost derivation, k-token lookahead more powerful than LL(k) parsers LALR variant: –the basis for parsers for most modern programming languages –implemented in tools such as ML-Yacc
35
Shift-reduce algorithm Parser keeps track of –position in current input (what input to read next) –a stack of terminal & non-terminal symbols representing the “parse so far” Based on next input symbol & stack, parser table indicates –shift: push next input on to top of stack –reduce R: top of stack should match RHS of rule replace top of stack with LHS of rule –error –accept (we shift EOF & can reduce what remains on stack to start symbol)
36
shift-reduce parsing example A ::= S EOFL ::= L ; S L ::= S Grammar: S ::= ( L ) S ::= id = num Parsing Table
37
State of parse so far: ( id = num ; id = num ) EOF shift-reduce parsing example A ::= S EOFL ::= L ; S L ::= S Grammar: S ::= ( L ) S ::= id = num Input from lexer: yet to read Parsing Table
38
State of parse so far: ( id = num ; id = num ) EOF shift-reduce parsing example A ::= S EOFL ::= L ; S L ::= S Grammar: S ::= ( L ) S ::= id = num Input from lexer: ( yet to read Parsing Table SHIFT
39
State of parse so far: ( id = num ; id = num ) EOF shift-reduce parsing example A ::= S EOFL ::= L ; S L ::= S Grammar: S ::= ( L ) S ::= id = num Input from lexer: ( id yet to read Parsing Table SHIFT
40
State of parse so far: ( id = num ; id = num ) EOF shift-reduce parsing example A ::= S EOFL ::= L ; S L ::= S Grammar: S ::= ( L ) S ::= id = num Input from lexer: ( id = yet to read Parsing Table SHIFT
41
State of parse so far: ( id = num ; id = num ) EOF shift-reduce parsing example A ::= S EOFL ::= L ; S L ::= S Grammar: S ::= ( L ) S ::= id = num Input from lexer: ( id = num yet to read Parsing Table SHIFT
42
State of parse so far: ( id = num ; id = num ) EOF shift-reduce parsing example A ::= S EOFL ::= L ; S L ::= S Grammar: S ::= ( L ) S ::= id = num Input from lexer: ( S yet to read Parsing Table REDUCE S ::= id = num
43
State of parse so far: ( id = num ; id = num ) EOF shift-reduce parsing example A ::= S EOFL ::= L ; S L ::= S Grammar: S ::= ( L ) S ::= id = num Input from lexer: ( L yet to read Parsing Table REDUCE L ::= S
44
State of parse so far: ( id = num ; id = num ) EOF shift-reduce parsing example A ::= S EOFL ::= L ; S L ::= S Grammar: S ::= ( L ) S ::= id = num Input from lexer: ( L ; yet to read Parsing Table SHIFT
45
State of parse so far: ( id = num ; id = num ) EOF shift-reduce parsing example A ::= S EOFL ::= L ; S L ::= S Grammar: S ::= ( L ) S ::= id = num Input from lexer: ( L ; id = num yet to read Parsing Table SHIFT
46
State of parse so far: ( id = num ; id = num ) EOF shift-reduce parsing example A ::= S EOFL ::= L ; S L ::= S Grammar: S ::= ( L ) S ::= id = num Input from lexer: ( L ; S yet to read Parsing Table REDUCE S ::= id = num
47
State of parse so far: ( id = num ; id = num ) EOF shift-reduce parsing example A ::= S EOFL ::= L ; S L ::= S Grammar: S ::= ( L ) S ::= id = num Input from lexer: ( L yet to read Parsing Table REDUCE S ::= L ; S
48
State of parse so far: ( id = num ; id = num ) EOF shift-reduce parsing example A ::= S EOFL ::= L ; S L ::= S Grammar: S ::= ( L ) S ::= id = num Input from lexer: ( L ) yet to read Parsing Table SHIFT
49
State of parse so far: ( id = num ; id = num ) EOF shift-reduce parsing example A ::= S EOFL ::= L ; S L ::= S Grammar: S ::= ( L ) S ::= id = num Input from lexer: S yet to read Parsing Table REDUCE S ::= ( L )
50
State of parse so far: ( id = num ; id = num ) EOF shift-reduce parsing example A ::= S EOFL ::= L ; S L ::= S Grammar: S ::= ( L ) S ::= id = num Input from lexer: A Parsing Table SHIFT REDUCE A ::= S EOF ACCEPT
51
State of parse so far: ( id = num ; id = num ) EOF shift-reduce parsing example A ::= S EOFL ::= L ; S L ::= S Grammar: S ::= ( L ) S ::= id = num Input from lexer: A Parsing Table A successful parse! Is this grammar LL(1)?
52
Shift-reduce algorithm Parser keeps track of –position in current input (what input to read next) –a stack of terminal & non-terminal symbols representing the “parse so far” Based on next input symbol & stack, parser table indicates –shift: push next input on to top of stack –reduce R: top of stack should match RHS of rule replace top of stack with LHS of rule –error –accept (we shift EOF & can reduce what remains on stack to start symbol) Reinterpreting the entire stack on every iteration would be very slow –O(averageStackSize * input) –need optimized algorithm that only looks at top of stack (plus parsing table to figure out what to do. O(input)
53
Shift-reduce algorithm (details) The parser summarizes the current “parse state” using an integer –the integer is actually a state in a finite automaton –the current parse state can be computed by running the automaton over the current parse stack Revised algorithm: Based on next input symbol & the parse state (as opposed to the entire stack), parser table indicates –shift s: push next input on to top of stack and move automaton into state s –reduce R & goto s: top of stack should match RHS of rule replace top of stack with LHS of rule move automaton into state s build parse tree corresponding to reduction R –error –accept
54
State of parse so far: ???? ???? EOF shift-reduce parsing Grammar: Input from lexer: ???? Like LL parsing, shift-reduce parsing does not always work. What sort of grammar rules make shift-reduce parsing impossible? ????
55
State of parse so far: ???? z??? EOF shift-reduce parsing Grammar: Input from lexer: ??cd Like LL parsing, shift-reduce parsing does not always work. Shift-Reduce errors: can’t decide whether to Shift z or Reduce cd by a rule Reduce-Reduce errors: can’t decide whether to Reduce by R1 or R2 ????
56
State of parse so far: shift-reduce errors A ::= S EOF Grammar: S ::= S + S S ::= S * S S ::= id Input from lexer: ???? ???? EOF ???? notice, this is an ambiguous grammar – we are always going to need some mechanism for resolving the outstanding ambiguity before parsing
57
State of parse so far: id + id * id EOF shift-reduce errors A ::= S EOF Grammar: S ::= S + S S ::= S * S S ::= id Input from lexer: S + S reduce by rule (S ::= S + S) or shift the * ??? notice, this is an ambiguous grammar – we are always going to need some mechanism for resolving the outstanding ambiguity before parsing SS+ id parse tree so far:
58
State of parse so far: id + id * id EOF shift-reduce errors A ::= S EOF Grammar: S ::= S + S S ::= S * S S ::= id Input from lexer: S reduce by rule (S ::= S + S) notice, this is an ambiguous grammar – we are always going to need some mechanism for resolving the outstanding ambiguity before parsing S SS+ reduce: id
59
State of parse so far: id + id * id EOF shift-reduce errors A ::= S EOF Grammar: S ::= S + S S ::= S * S S ::= id Input from lexer: S * reduce by rule (S ::= S + S) notice, this is an ambiguous grammar – we are always going to need some mechanism for resolving the outstanding ambiguity before parsing S SS+ shift:
60
State of parse so far: id + id * id EOF shift-reduce errors A ::= S EOF Grammar: S ::= S + S S ::= S * S S ::= id Input from lexer: S * id reduce by rule (S ::= S + S) notice, this is an ambiguous grammar – we are always going to need some mechanism for resolving the outstanding ambiguity before parsing S SS+ shift:
61
State of parse so far: id + id * id EOF shift-reduce errors A ::= S EOF Grammar: S ::= S + S S ::= S * S S ::= id Input from lexer: S * S reduce by rule (S ::= S + S) notice, this is an ambiguous grammar – we are always going to need some mechanism for resolving the outstanding ambiguity before parsing S SS+ reduce: S id
62
State of parse so far: id + id * id EOF shift-reduce errors A ::= S EOF Grammar: S ::= S + S S ::= S * S S ::= id Input from lexer: S reduce by rule (S ::= S + S) notice, this is an ambiguous grammar – we are always going to need some mechanism for resolving the outstanding ambiguity before parsing S SS+ reduce: S id * S
63
State of parse so far: id + id * id EOF alternative parse A ::= S EOF Grammar: S ::= S + S S ::= S * S S ::= id Input from lexer: S + S reduce by rule (S ::= S + S) or shift the * notice, this is an ambiguous grammar – we are always going to need some mechanism for resolving the outstanding ambiguity before parsing S+ S id
64
State of parse so far: id + id * id EOF shift-reduce errors A ::= S EOF Grammar: S ::= S + S S ::= S * S S ::= id Input from lexer: S + S * reduce by rule (S ::= S + S) or shift the * notice, this is an ambiguous grammar – we are always going to need some mechanism for resolving the outstanding ambiguity before parsing S+ S shift: id *
65
State of parse so far: id + id * id EOF shift-reduce errors A ::= S EOF Grammar: S ::= S + S S ::= S * S S ::= id Input from lexer: S + S * id reduce by rule (S ::= S + S) or shift the * notice, this is an ambiguous grammar – we are always going to need some mechanism for resolving the outstanding ambiguity before parsing shift: S+ S id *
66
State of parse so far: id + id * id EOF shift-reduce errors A ::= S EOF Grammar: S ::= S + S S ::= S * S S ::= id Input from lexer: S + S * S reduce by rule (S ::= S + S) or shift the * notice, this is an ambiguous grammar – we are always going to need some mechanism for resolving the outstanding ambiguity before parsing reduce: S+ S id * S
67
State of parse so far: id + id * id EOF shift-reduce errors A ::= S EOF Grammar: S ::= S + S S ::= S * S S ::= id Input from lexer: S + S reduce by rule (S ::= S + S) or shift the * notice, this is an ambiguous grammar – we are always going to need some mechanism for resolving the outstanding ambiguity before parsing reduce: S+ S id * S S
68
State of parse so far: id + id * id EOF shift-reduce errors A ::= S EOF Grammar: S ::= S + S S ::= S * S S ::= id Input from lexer: S reduce by rule (S ::= S + S) or shift the * notice, this is an ambiguous grammar – we are always going to need some mechanism for resolving the outstanding ambiguity before parsing S SS+ SS* reduce:
69
State of parse so far: id + id * id EOF shift-reduce errors A ::= S EOF Grammar: S ::= S + S S ::= S * S S ::= id Input from lexer:S + S notice, this is an ambiguous grammar – we are always going to need some mechanism for resolving the outstanding ambiguity before parsing S SS+ SS* reduce by rule (S ::= S + S) :reduce shift the *: S SS+ S id * S
70
State of parse so far: shift-reduce errors A ::= S id EOF Grammar: E ::= E ; E E ::= id Input from lexer: id ; id EOF E ; S ::= E ; reduce by rule (S ::= E ;) or shift the id id ; id ; id EOF input might be this, making shifting correct
71
State of parse so far: ( id ) EOF reduce-reduce errors A ::= S EOF Grammar: S ::= ( E ) S ::= E Input from lexer: ( E ) reduce by rule ( S ::= ( E ) ) or reduce by rule ( E ::= ( E ) ) E ::= ( E ) E ::= E + E E ::= id
72
Summary Top-down Parsing –simple to understand and implement –you can code it yourself using nullable, first, follow sets –excellent for quick & dirty parsing jobs Bottom-up Parsing –more complex: uses stack & table –more powerful –Bonus: tools do the work for you ==> ML-Yacc but you need to understand how shift-reduce & reduce- reduce errors can arise
Similar presentations
© 2025 SlidePlayer.com Inc.
All rights reserved.