Lesson 5
CDT301 – Compiler Theory, Spring 2011
Teacher: Linus Källberg
Outline
  The sets FIRST and FOLLOW
  Non-recursive predictive parsing
  Handling syntax errors
THE SETS FIRST AND FOLLOW
Motivation
  Grammar problematic for predictive parsing:
    stmt → func_call | loop
    func_call → id ( args ) ;
    loop → while ( expr ) block
         | for ( expr ; expr ; expr ) block
Motivation
    stmt → func_call | loop
    func_call → id ( args ) ;
    loop → while ( expr ) block
         | for ( expr ; expr ; expr ) block
  FIRST(func_call) = { id }
  FIRST(loop) = { while, for }
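A minimal Python sketch of how these two FIRST sets let a predictive parser choose between the alternatives of stmt with one token of lookahead. The function name predict_stmt and the use of plain strings as tokens are illustrative assumptions, not part of the lesson.

```python
# FIRST sets copied from the slide; the dictionary layout is an assumption.
FIRST = {
    "func_call": {"id"},
    "loop": {"while", "for"},
}

def predict_stmt(lookahead):
    """Pick the alternative of stmt whose FIRST set contains the lookahead."""
    for alternative, first_set in FIRST.items():
        if lookahead in first_set:
            return alternative
    raise SyntaxError(f"no alternative of stmt starts with {lookahead!r}")

choice = predict_stmt("while")   # selects the loop alternative
```

Because the two FIRST sets are disjoint, at most one alternative can match any lookahead token.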
FIRST(α)
  Simple case: α starts with a terminal a:
    FIRST(α) = { a }
  Harder case: α starts with a nonterminal A
    – Must examine what A can produce
  If α ⇒* ε then ε ∈ FIRST(α)
Computing FIRST(X)
  Start with Ø
  If X is a terminal then add X and return
  If X ⇒* ε then add ε
  For all rules X → Y1 Y2 ... Yk do
    – For all Yi, where i = 1..k, do
        Add FIRST(Yi) except for ε
        If ε is not in FIRST(Yi) then break
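The steps above can be sketched in Python as follows. This is one possible implementation, not the lesson's own: it replaces the recursive formulation with iteration to a fixed point, so that recursive rules do not cause endless recursion. The grammar encoding (a dictionary from nonterminal to a list of bodies, with an empty list standing for ε) and all names are assumptions.

```python
EPS = "ε"

# Expression grammar used in the lesson's examples; [] encodes an ε-body.
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}

def compute_first(grammar):
    """Return FIRST(X) for every nonterminal X, iterating until nothing changes."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                before = len(first[head])
                nullable = True          # stays True only if every Yi can derive ε
                for sym in body:
                    if sym not in grammar:       # terminal: FIRST(a) = { a }
                        first[head].add(sym)
                        nullable = False
                        break
                    first[head] |= first[sym] - {EPS}
                    if EPS not in first[sym]:
                        nullable = False
                        break
                if nullable:
                    first[head].add(EPS)
                changed |= len(first[head]) != before
    return first

FIRST = compute_first(GRAMMAR)
```

For this grammar the result agrees with the FIRST sets listed in the example that follows, for instance FIRST["E"] is { (, id }.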
FIRST example (4.30 in the book)
  Grammar:
    E → T E'
    E' → + T E' | ε
    T → F T'
    T' → * F T' | ε
    F → ( E ) | id
  FIRST sets:
    FIRST(E) = { (, id }
    FIRST(T) = { (, id }
    FIRST(F) = { (, id }
    FIRST(E') = { +, ε }
    FIRST(T') = { *, ε }
FIRST example (4.30 in the book)
  Grammar:
    E → T E'
    E' → + T E' | ε
    T → F T'
    T' → * F T' | ε
    F → ( E ) | id
  E ⇒ T E' ⇒ F T' E' ⇒ ( E ) T' E' ⇒ …
  E ⇒ T E' ⇒ F T' E' ⇒ id T' E' ⇒ …
  FIRST(E) = { (, id } seems correct!
FIRST example (4.30 in the book)
  Grammar:
    E → T E'
    E' → + T E' | ε
    T → F T'
    T' → * F T' | ε
    F → ( E ) | id
  E' ⇒ + T E' ⇒ …
  + ∈ FIRST(E') seems correct!
  T' ⇒ * F T' ⇒ …
  * ∈ FIRST(T') seems correct!
Exercise (1)
  a) Compute FIRST(K) and FIRST(M):
       K → K, i : M
       K → i : M
       M → M, i
       M → i
  b) Compute FIRST(S), FIRST(A), and FIRST(B):
       S → 1 A : A
       S → 0 : A
       A → A B
       A → ε
       B → 0
       B → 1
FOLLOW(A)
  “What can follow A?”
  Example grammar:
    S → a A b A c
    A → d | e
  FOLLOW(A) = { b, c }
Computing FOLLOW(A)
  Start with Ø
  If A is the start symbol then add $
  For all rules B → α A β do
    – Add everything except ε from FIRST(β)
  For all rules B → α A, or B → α A β where ε ∈ FIRST(β), do
    – Add everything from FOLLOW(B)
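A Python sketch of these rules, again as a fixed-point iteration rather than the lesson's own formulation. The FIRST sets are hard-coded from the lesson's expression-grammar example to keep the block self-contained; the helper first_of_string, the grammar encoding, and all names are assumptions.

```python
EPS, END = "ε", "$"

# Expression grammar from the lesson; [] encodes an ε-body.
GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}
# FIRST sets as given in the lesson's FIRST example.
FIRST = {
    "E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
    "E'": {"+", EPS}, "T'": {"*", EPS},
}

def first_of_string(symbols):
    """FIRST of a symbol string β; the empty string yields { ε }."""
    result = set()
    for sym in symbols:
        sym_first = FIRST.get(sym, {sym})   # a terminal's FIRST set is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result

def compute_follow(grammar, start):
    """Return FOLLOW(A) for every nonterminal A, iterating until nothing changes."""
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in grammar:      # only nonterminals have FOLLOW sets
                        continue
                    before = len(follow[sym])
                    beta_first = first_of_string(body[i + 1:])
                    follow[sym] |= beta_first - {EPS}
                    if EPS in beta_first:       # β is empty or can derive ε
                        follow[sym] |= follow[head]
                    changed |= len(follow[sym]) != before
    return follow

FOLLOW = compute_follow(GRAMMAR, "E")
```

The iteration is needed because FOLLOW sets depend on each other: FOLLOW(T) includes FOLLOW(E'), which in turn includes FOLLOW(E).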
FOLLOW example (4.30 in the book)
  Grammar:
    E → T E'
    E' → + T E' | ε
    T → F T'
    T' → * F T' | ε
    F → ( E ) | id
  FOLLOW sets:
    FOLLOW(E) = { $, ) }
    FOLLOW(E') = { $, ) }
    FOLLOW(T) = { +, $, ) }
    FOLLOW(T') = { +, $, ) }
    FOLLOW(F) = { *, +, $, ) }
FOLLOW example (4.30 in the book)
  Grammar:
    E → T E'
    E' → + T E' | ε
    T → F T'
    T' → * F T' | ε
    F → ( E ) | id
  E $ ⇒ T E' $ ⇒ F T' E' $ ⇒ ( E ) T' E' $ ⇒ ( T E' ) T' E' $ ⇒ …
  FOLLOW(E) = { $, ) } seems correct!
  FOLLOW(E') = { $, ) } seems correct!
FOLLOW example (4.30 in the book)
  Grammar:
    E → T E'
    E' → + T E' | ε
    T → F T'
    T' → * F T' | ε
    F → ( E ) | id
  E $ ⇒ T E' $ ⇒ T + T E' $ ⇒ T + T $ ⇒ T + F T' $
      ⇒ T + ( E ) T' $ ⇒ T + ( T E' ) T' $ ⇒ T + ( T ) T' $ ⇒ …
  FOLLOW(T) = { +, $, ) } seems correct!
Exercise (2)
  a) Compute FOLLOW(K) and FOLLOW(M):
       K → K, i : M
       K → i : M
       M → M, i
       M → i
  b) Compute FOLLOW(S), FOLLOW(A), and FOLLOW(B):
       S → 1 A : A
       S → 0 : A
       A → A B
       A → ε
       B → 0
       B → 1
LL(1) grammars
  Not left-recursive
  Not ambiguous
  For all A → α | β:
    – FIRST(α) ∩ FIRST(β) = Ø
    – If ε ∈ FIRST(α) then FOLLOW(A) ∩ FIRST(β) = Ø
    – If ε ∈ FIRST(β) then FOLLOW(A) ∩ FIRST(α) = Ø
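The three set conditions can be checked mechanically. The Python sketch below tests every pair of alternatives of every nonterminal; it covers only the set conditions, not a separate test for left recursion or ambiguity. The FIRST and FOLLOW sets are hard-coded from the lesson's expression-grammar example, and the second grammar (BAD) is a hypothetical counterexample invented for illustration. All names are assumptions.

```python
from itertools import combinations

EPS = "ε"

GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}
FIRST = {
    "E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
    "E'": {"+", EPS}, "T'": {"*", EPS},
}
FOLLOW = {
    "E": {"$", ")"}, "E'": {"$", ")"},
    "T": {"+", "$", ")"}, "T'": {"+", "$", ")"},
    "F": {"*", "+", "$", ")"},
}

def first_of_string(symbols, first):
    """FIRST of a symbol string; the empty string yields { ε }."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})   # a terminal's FIRST set is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result

def ll1_conflicts(grammar, first, follow):
    """Return (A, α, β) triples that violate one of the three set conditions."""
    conflicts = []
    for head, bodies in grammar.items():
        for alpha, beta in combinations(bodies, 2):
            fa = first_of_string(alpha, first)
            fb = first_of_string(beta, first)
            bad = bool(fa & fb)
            bad |= EPS in fa and bool(follow[head] & fb)
            bad |= EPS in fb and bool(follow[head] & fa)
            if bad:
                conflicts.append((head, alpha, beta))
    return conflicts

conflicts = ll1_conflicts(GRAMMAR, FIRST, FOLLOW)      # no conflicts expected

# Hypothetical grammar S → a b | a c: both alternatives start with a.
BAD = {"S": [["a", "b"], ["a", "c"]]}
bad_conflicts = ll1_conflicts(BAD, {"S": {"a"}}, {"S": {"$"}})
```

The expression grammar passes all three conditions, while the hypothetical grammar fails the first one because both of its alternatives have FIRST set { a }.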
NON-RECURSIVE PREDICTIVE PARSING
Types of top-down parsers
  Predictive recursive descent parsers
    – Lab 1
  General recursive descent parsers
  Non-recursive predictive parsers
Non-recursive predictive parsers
  Keeps a stack of expected symbols
  Loops:
    – Pop a symbol X
    – If X is a terminal, match with lookahead
    – If X is a nonterminal, predict and push
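Parse table
  Encodes predictions (rows: nonterminals, columns: input symbols):
    Nonterminal | id       | +           | *           | (         | )      | $
    E           | E → T E' |             |             | E → T E'  |        |
    E'          |          | E' → + T E' |             |           | E' → ε | E' → ε
    T           | T → F T' |             |             | T → F T'  |        |
    T'          |          | T' → ε      | T' → * F T' |           | T' → ε | T' → ε
    F           | F → id   |             |             | F → ( E ) |        |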
Demonstration
  Parse the string id * id using the previous parse table
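The parsing loop of a non-recursive predictive parser can be sketched in Python and run on id * id. The dictionary encoding of the parse table, the function name parse, and the format of the recorded steps are assumptions made for illustration. The table entries are the ones determined by the FIRST and FOLLOW sets of the example grammar, with [] standing for an ε-body.

```python
END = "$"

# M[nonterminal][lookahead] -> body of the predicted production ([] means ε).
TABLE = {
    "E":  {"id": ["T", "E'"], "(": ["T", "E'"]},
    "E'": {"+": ["+", "T", "E'"], ")": [], END: []},
    "T":  {"id": ["F", "T'"], "(": ["F", "T'"]},
    "T'": {"+": [], "*": ["*", "F", "T'"], ")": [], END: []},
    "F":  {"id": ["id"], "(": ["(", "E", ")"]},
}

def parse(tokens, start="E"):
    """Return the productions used, in order, or raise SyntaxError."""
    tokens = list(tokens) + [END]
    stack = [END, start]              # the top of the stack is the end of the list
    pos, output = 0, []
    while stack:
        top, look = stack.pop(), tokens[pos]
        if top in TABLE:                          # nonterminal: predict and push
            body = TABLE[top].get(look)
            if body is None:
                raise SyntaxError(f"unexpected {look!r} while expanding {top}")
            output.append(f"{top} → {' '.join(body) or 'ε'}")
            stack.extend(reversed(body))          # leftmost symbol ends up on top
        elif top == look:                         # terminal: match the lookahead
            pos += 1
        else:
            raise SyntaxError(f"expected {top!r}, found {look!r}")
    return output

steps = parse(["id", "*", "id"])
for step in steps:
    print(step)
```

The body is pushed in reverse so that its leftmost symbol is popped first, which makes the recorded productions form a leftmost derivation of the input.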
Constructing the parse table
  For each rule A → α do
    – For each terminal a in FIRST(α) do
        Write A → α in position M[A, a]
    – If ε is in FIRST(α) then
        For each element b in FOLLOW(A) do
          – Add A → α in position M[A, b]
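A Python sketch of this construction for the lesson's expression grammar. The FIRST and FOLLOW sets are hard-coded from the earlier examples; the nested-dictionary table layout, the helper first_of_string, and the decision to raise an error on a doubly filled cell are assumptions.

```python
EPS = "ε"

GRAMMAR = {
    "E":  [["T", "E'"]],
    "E'": [["+", "T", "E'"], []],
    "T":  [["F", "T'"]],
    "T'": [["*", "F", "T'"], []],
    "F":  [["(", "E", ")"], ["id"]],
}
FIRST = {
    "E": {"(", "id"}, "T": {"(", "id"}, "F": {"(", "id"},
    "E'": {"+", EPS}, "T'": {"*", EPS},
}
FOLLOW = {
    "E": {"$", ")"}, "E'": {"$", ")"},
    "T": {"+", "$", ")"}, "T'": {"+", "$", ")"},
    "F": {"*", "+", "$", ")"},
}

def first_of_string(symbols, first):
    """FIRST of a rule body α; an empty body yields { ε }."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})   # a terminal's FIRST set is itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)
    return result

def build_table(grammar, first, follow):
    """Fill M[A][a] with the body α of the rule A → α to predict on lookahead a."""
    table = {nt: {} for nt in grammar}
    for head, bodies in grammar.items():
        for body in bodies:
            body_first = first_of_string(body, first)
            targets = body_first - {EPS}
            if EPS in body_first:               # α can vanish: use FOLLOW(A)
                targets |= follow[head]
            for terminal in targets:
                if terminal in table[head]:     # two rules compete for one cell
                    raise ValueError(f"conflict at M[{head}, {terminal}]")
                table[head][terminal] = body
    return table

TABLE = build_table(GRAMMAR, FIRST, FOLLOW)
```

A cell that would receive two rules means the grammar is not LL(1); this sketch reports that case instead of silently overwriting the entry.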
HANDLING SYNTAX ERRORS
Types of errors
  Lexical
  Syntactic
  Semantic
  Logical
Handling errors
  Point out the spot
  Tell the reason
  Try to recover and proceed compiling
  Do not generate code
Recovery strategies
  Panic mode
  Phrase-level
  Error productions
  Global correction
Panic mode
  Discard until synchronizing token
  What are good synchronizing tokens?
  Properties:
    – Simple and fast
    – Might miss errors in discarded input
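The discard step can be sketched in a few lines of Python. The slide leaves the choice of synchronizing tokens open; the set { ";", "}" } used here is only an assumption for illustration, as are the function name and the sample token list.

```python
def skip_to_sync(tokens, pos, sync_tokens):
    """Discard input from pos up to the next synchronizing token.

    Returns the position of that token (or len(tokens) if none is found)
    together with the list of discarded tokens.
    """
    start = pos
    while pos < len(tokens) and tokens[pos] not in sync_tokens:
        pos += 1
    return pos, tokens[start:pos]

# Hypothetical token stream with an error detected at index 2.
tokens = ["id", "(", "+", "*", ")", ";", "while", "(", "id", ")"]
new_pos, discarded = skip_to_sync(tokens, 2, {";", "}"})
```

Everything in discarded is skipped without being parsed, which is why further errors inside that stretch of input would go unreported.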
Phrase-level
  Try to “fix” the input
    – Replace a comma by a semicolon
    – Delete or insert a semicolon
    – …
Error productions
  Anticipate common errors
  Add productions for these
  One variant supported in Bison
Global correction
  Try to find alternative parse tree
  Minimize corrections
  Too costly
Conclusion
  The sets FIRST and FOLLOW
  Definition of LL(1) grammars
  Non-recursive predictive parsing
  Handling syntax errors
Next time
  Code generation using syntax-directed translation
  Lexical analysis