# Lecture #11: Grammar Problems


Left Recursion

A grammar is left recursive if, for some non-terminal A, there is a derivation A ⇒+ Aα. There are three types of left recursion:

- direct (A → A x)
- indirect (A → B C, B → A)
- hidden (A → B A, B → ε)

How to eliminate left recursion?

To eliminate direct left recursion, replace

- A → A α1 | A α2 | ... | A αm | β1 | β2 | ... | βn

with

- A → β1 B | β2 B | ... | βn B
- B → α1 B | α2 B | ... | αm B | ε

where B is a new non-terminal and no βi starts with A.
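The replacement rule above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `eliminate_direct`, the list-of-symbols representation of a right-hand side, and the convention of naming the new non-terminal by appending `'` are all assumptions made for the example. The empty list stands for ε.

```python
def eliminate_direct(nt, productions):
    """Remove direct left recursion from the productions of `nt`.

    `productions` is a list of right-hand sides, each a list of symbols;
    the empty list [] stands for epsilon. Returns a dict of rewritten rules.
    """
    new_nt = nt + "'"  # assumed naming scheme for the fresh non-terminal B
    # A -> A alpha_i  contributes alpha_i; every other rhs is a beta_j
    alphas = [rhs[1:] for rhs in productions if rhs and rhs[0] == nt]
    betas = [rhs for rhs in productions if not rhs or rhs[0] != nt]
    if not alphas:  # nothing to do: no direct left recursion
        return {nt: productions}
    return {
        nt: [beta + [new_nt] for beta in betas],
        new_nt: [alpha + [new_nt] for alpha in alphas] + [[]],
    }


rules = eliminate_direct("E", [["E", "+", "T"], ["T"]])
for head, bodies in rules.items():
    print(head, "->", " | ".join(" ".join(b) or "ε" for b in bodies))
```

Applied to E → E+T | T, the sketch yields E → T E' and E' → + T E' | ε.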

Consider the grammar

- S → E
- E → E+T
- E → T
- T → E-T
- T → F
- F → E*F
- F → id

There is direct recursion: E → E+T. There is indirect recursion: T → E-T, E → T.

Algorithm for eliminating indirect recursion

1. List the nonterminals in some order A1, A2, ..., An.
2. For i = 1 to n:
   - for j = 1 to i-1: if there is a production Ai → Aj γ, replace Aj with each of its right-hand sides;
   - eliminate any direct left recursion on Ai.
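The algorithm can be sketched as follows. The dictionary representation of the grammar, the function name `eliminate_left_recursion`, and the `'` suffix for new non-terminals are assumptions made for this example. The grammar used is the lecture's expression grammar S → E, E → E+T | T, T → E-T | F, F → E*F | id with the ordering S, E, T, F.

```python
def eliminate_left_recursion(grammar, order):
    """grammar: dict mapping a non-terminal to a list of right-hand sides,
    each a list of symbols ([] stands for epsilon).
    order: the chosen ordering A1..An of the non-terminals."""
    g = {a: [list(rhs) for rhs in bodies] for a, bodies in grammar.items()}
    for i, ai in enumerate(order):
        for aj in order[:i]:
            expanded = []
            for rhs in g[ai]:
                if rhs and rhs[0] == aj:
                    # Ai -> Aj gamma: substitute every current rhs of Aj
                    expanded += [delta + rhs[1:] for delta in g[aj]]
                else:
                    expanded.append(rhs)
            g[ai] = expanded
        # eliminate any direct left recursion on Ai
        alphas = [rhs[1:] for rhs in g[ai] if rhs and rhs[0] == ai]
        if alphas:
            betas = [rhs for rhs in g[ai] if not rhs or rhs[0] != ai]
            new = ai + "'"  # assumed name for the fresh non-terminal
            g[ai] = [beta + [new] for beta in betas]
            g[new] = [alpha + [new] for alpha in alphas] + [[]]
    return g


grammar = {
    "S": [["E"]],
    "E": [["E", "+", "T"], ["T"]],
    "T": [["E", "-", "T"], ["F"]],
    "F": [["E", "*", "F"], ["id"]],
}
result = eliminate_left_recursion(grammar, ["S", "E", "T", "F"])
for head, bodies in result.items():
    print(head, "->", " | ".join(" ".join(b) or "ε" for b in bodies))
```

Note that the substitution step uses the rules of Aj as they stand after Aj has already been processed, which is why the order of the two loops matters.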

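Eliminating indirect left recursion

Ordering: S, E, T, F. Starting grammar:

- S → E; E → E+T | T; T → E-T | F; F → E*F | id

Steps:

- i = S: nothing to do, the grammar is unchanged.
- i = E: eliminate the direct left recursion on E.
  - S → E; E → TE'; E' → +TE' | ε; T → E-T | F; F → E*F | id
- i = T, j = E: replace the leading E in T → E-T with its right-hand side.
  - S → E; E → TE'; E' → +TE' | ε; T → TE'-T | F; F → E*F | id
- i = T: eliminate the direct left recursion on T.
  - S → E; E → TE'; E' → +TE' | ε; T → FT'; T' → E'-TT' | ε; F → E*F | id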

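Eliminating indirect left recursion (continued)

Grammar after processing S, E and T:

- S → E; E → TE'; E' → +TE' | ε; T → FT'; T' → E'-TT' | ε; F → E*F | id

Steps:

- i = F, j = E: replace the leading E in F → E*F with its right-hand side.
  - S → E; E → TE'; E' → +TE' | ε; T → FT'; T' → E'-TT' | ε; F → TE'*F | id
- i = F, j = T: replace the leading T in F → TE'*F with its right-hand side.
  - S → E; E → TE'; E' → +TE' | ε; T → FT'; T' → E'-TT' | ε; F → FT'E'*F | id
- i = F: eliminate the direct left recursion on F.
  - S → E; E → TE'; E' → +TE' | ε; T → FT'; T' → E'-TT' | ε; F → idF'; F' → T'E'*FF' | ε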

Example

Eliminate left recursion from the following grammar:

- S → Aa | b
- A → Ac | Sd

The algorithm is guaranteed to work if the grammar has no cycles and no null productions (ε-productions).

Example: left recursion elimination

Grammar:

- A → B C | a
- B → C A | A b
- C → A B | C C | a

Choose the arrangement A, B, C.

- i = 1: nothing to do.
- i = 2, j = 1: B → C A | A b ⇒ B → C A | B C b | a b
  - eliminate immediate left recursion: B → C A BR | a b BR
  - BR → C b BR | ε
- i = 3, j = 1: C → A B | C C | a ⇒ C → B C B | a B | C C | a
- i = 3, j = 2: C → B C B | a B | C C | a ⇒ C → C A BR C B | a b BR C B | a B | C C | a
  - eliminate immediate left recursion: C → a b BR C B CR | a B CR | a CR
  - CR → A BR C B CR | C CR | ε

Grammar problems

Consider S → if E then S else S | if E then S

Which of the two productions should we use to expand non-terminal S when the next token is if? We can solve this problem by factoring out the common part of these rules. This way, we postpone the decision about which rule to choose until we have more information (namely, whether there is an else or not). This is called left factoring.

Left factoring

- A → αβ1 | αβ2 | ... | αβn | γ

becomes

- A → αB | γ
- B → β1 | β2 | ... | βn

Example

Left factor the following grammar:

- S → iEtS | iEtSeS | a
- E → b
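One round of left factoring on this example can be sketched as below. The helper names `common_prefix` and `left_factor`, the list-of-symbols representation, and the `'` suffix for the new non-terminal are assumptions made for the illustration; alternatives are grouped by their first symbol and the longest common prefix of each group plays the role of α.

```python
def common_prefix(seqs):
    """Longest common prefix of a list of symbol lists."""
    prefix = []
    for column in zip(*seqs):
        if len(set(column)) != 1:
            break
        prefix.append(column[0])
    return prefix


def left_factor(nt, productions):
    """One round of left factoring for the non-terminal `nt`."""
    groups = {}
    for rhs in productions:
        # group alternatives by their first symbol (None for epsilon)
        groups.setdefault(rhs[0] if rhs else None, []).append(rhs)
    result = {nt: []}
    count = 0
    for first, group in groups.items():
        if first is None or len(group) == 1:
            result[nt].extend(group)  # nothing to factor
            continue
        alpha = common_prefix(group)
        count += 1
        new_nt = nt + "'" * count  # assumed naming scheme
        result[nt].append(alpha + [new_nt])
        result[new_nt] = [rhs[len(alpha):] for rhs in group]
    return result


factored = left_factor("S", [list("iEtS"), list("iEtSeS"), ["a"]])
for head, bodies in factored.items():
    print(head, "->", " | ".join(" ".join(b) or "ε" for b in bodies))
```

For the grammar above this gives S → i E t S S' | a and S' → ε | e S; the rule E → b has a single alternative and is left as it is.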

Grammar problems

A symbol X ∈ V is useless if

- there is no derivation from X to any string of terminals (X is non-terminating), or
- there is no derivation from S that reaches a sentential form containing X (X is non-reachable).

Reduced grammar = a grammar that does not contain any useless symbols.

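Useless symbols

To remove useless symbols, apply two algorithms:

- First, remove all non-terminating symbols.
- Then, remove all non-reachable symbols.

The order is important! For example, consider S ⇒+ Xα where α contains a non-terminating symbol. What will happen if we apply the algorithms in the wrong order?

Concrete example: S → AB | a, A → a. Here B is non-terminating. If we remove non-reachable symbols first, nothing is removed, because A and B are both reachable through S → AB. Removing non-terminating symbols afterwards deletes B and the production S → AB, which leaves A in the grammar even though it can no longer be reached.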

Useless symbols: example

Initial grammar:

- S → AB | CA
- A → a
- B → CB | AB
- C → cB | b
- D → aD | d

Algorithm 1 (terminating symbols):

- A is in because of A → a
- C is in because of C → b
- D is in because of D → d
- S is in because A and C are in and S → CA

B is never added, so it is non-terminating.

Useless symbols: example continued

After algorithm 1:

- S → CA
- A → a
- C → b
- D → aD | d

Algorithm 2 (reachable symbols):

- S is in because it is the start symbol
- C and A are in because S is in and S → CA

Final grammar:

- S → CA
- A → a
- C → b
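The two passes can be sketched in Python as follows. The function names (`terminating`, `reachable`, `restrict`, `reduce_grammar`) and the dictionary representation are assumptions for this example; any symbol that is not a key of the dictionary is treated as a terminal.

```python
def terminating(grammar):
    """Non-terminals that derive some string of terminals (fixed point)."""
    done = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in done:
                continue
            # a body qualifies if every symbol is a terminal or already done
            if any(all(s not in grammar or s in done for s in body)
                   for body in bodies):
                done.add(head)
                changed = True
    return done


def reachable(grammar, start):
    """Non-terminals that occur in some sentential form derived from start."""
    seen, stack = {start}, [start]
    while stack:
        for body in grammar.get(stack.pop(), []):
            for s in body:
                if s in grammar and s not in seen:
                    seen.add(s)
                    stack.append(s)
    return seen


def restrict(grammar, keep):
    """Drop non-terminals outside `keep` and every production using them."""
    return {
        head: [b for b in bodies
               if all(s not in grammar or s in keep for s in b)]
        for head, bodies in grammar.items() if head in keep
    }


def reduce_grammar(grammar, start):
    g = restrict(grammar, terminating(grammar))   # algorithm 1 first
    return restrict(g, reachable(g, start))       # then algorithm 2


grammar = {
    "S": [["A", "B"], ["C", "A"]],
    "A": [["a"]],
    "B": [["C", "B"], ["A", "B"]],
    "C": [["c", "B"], ["b"]],
    "D": [["a", "D"], ["d"]],
}
reduced = reduce_grammar(grammar, "S")
for head, bodies in reduced.items():
    print(head, "->", " | ".join(" ".join(b) for b in bodies))
```

Swapping the two `restrict` calls in `reduce_grammar` reproduces the wrong-order problem: for S → AB | a, A → a (with B having no productions), A would survive although it is no longer reachable.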

Parsing

- Parsing = the process of determining whether a string of tokens can be generated by a grammar
- For any context-free grammar there is a parser that takes at most O(n³) time to parse a string of n tokens
- Linear algorithms suffice for parsing programming language source code
- Top-down parsing "constructs" a parse tree from root to leaves
- Bottom-up parsing "constructs" a parse tree from leaves to root

Predictive Parsing

- Recursive descent parsing is a top-down parsing method
- Every nonterminal has one (recursive) procedure responsible for parsing the nonterminal's syntactic category of input tokens
- When a nonterminal has multiple productions, each production is implemented in a branch of a selection statement based on input look-ahead information
- Predictive parsing is a special form of recursive descent parsing where we use one lookahead token to unambiguously determine the parse operations
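A minimal sketch of a predictive recursive descent recognizer is shown below. The grammar is an assumption chosen for illustration (it is not one of the lecture's grammars): E → T E', E' → + T E' | ε, T → F T', T' → * F T' | ε, F → id | ( E ). The class name `Parser`, the helper `accepts`, and the `$` end marker are likewise hypothetical. Each non-terminal gets one method, and each method picks its production by looking at a single lookahead token.

```python
class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]  # "$" marks the end of input
        self.pos = 0

    @property
    def look(self):
        """The single lookahead token."""
        return self.tokens[self.pos]

    def match(self, expected):
        if self.look != expected:
            raise SyntaxError(f"expected {expected!r}, found {self.look!r}")
        self.pos += 1

    def E(self):  # E -> T E'
        self.T()
        self.E_rest()

    def E_rest(self):  # E' -> + T E' | ε
        if self.look == "+":
            self.match("+")
            self.T()
            self.E_rest()
        # any other lookahead selects the ε production

    def T(self):  # T -> F T'
        self.F()
        self.T_rest()

    def T_rest(self):  # T' -> * F T' | ε
        if self.look == "*":
            self.match("*")
            self.F()
            self.T_rest()

    def F(self):  # F -> id | ( E )
        if self.look == "id":
            self.match("id")
        elif self.look == "(":
            self.match("(")
            self.E()
            self.match(")")
        else:
            raise SyntaxError(f"unexpected token {self.look!r}")


def accepts(tokens):
    """Return True if the token list is derivable from E."""
    parser = Parser(tokens)
    try:
        parser.E()
        parser.match("$")
        return True
    except SyntaxError:
        return False


print(accepts(["id", "+", "id", "*", "id"]))
print(accepts(["id", "+"]))
```

The grammar is already free of left recursion and left factored, which is what allows one token of lookahead to select a production without backtracking.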