SIMPLIFYING GRAMMARS
Definition: A useless symbol of a context-free grammar is one which does not occur in the derivation of any sentence of that grammar.
For example:
G → R T
R → R a
T → b
Here R is useless.
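Clearly a symbol is useless if either (or both) of the following holds:
a) we cannot derive any string containing it from the goal symbol
b) we cannot derive a terminal string from that symbol
Notation: a) is expressed by saying that the symbol is not reachable from the goal symbol. b) is expressed by saying that the symbol does not derive a terminal string.
To catch every useless symbol, apply test b) first, discard the productions that mention a symbol failing it, and then apply test a) to the productions that remain.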
Algorithm: To find all those symbols that are not reachable from the goal symbol.
1) Make a list of all the grammar symbols, all initially unflagged.
2) Flag the goal symbol.
3) Go through the grammar from the 1st production to the last. If A → x1 x2 … xn is one of these productions and A is flagged, then flag x1, x2, …, xn (those not already flagged).
4) Were any new symbols flagged during the iteration of step 3? If so, repeat step 3; otherwise stop.
Any symbol that has not been flagged at this stage is not reachable from the goal symbol.
EXAMPLE
Grammar 1
z → b e
a → a e | e
b → c e | a f
c → c f
d → f
d is not reachable.
Grammar 2
Z → E + T
E → E | S + F | T
F → F | F P | P
P → G
G → G | G G | F
T → T * i | i
Q → E | E + F | T | S
S → i
Q is not reachable.
Grammar 3
G → A
Q → P R
P → Q
Q, P, R are not reachable.
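The reachability algorithm can be sketched in Python as follows. This is only an illustration: the function name `unreachable_symbols` and the representation of a grammar as a list of (left-hand side, right-hand side) pairs are assumptions made here, not part of the slides. The goal symbol is passed in explicitly.

```python
def unreachable_symbols(productions, goal):
    """Return the grammar symbols that are not reachable from `goal`.

    `productions` is a list of (lhs, rhs) pairs, where rhs is a list of symbols.
    """
    # Step 1: collect every symbol that appears in the grammar.
    symbols = {goal}
    for lhs, rhs in productions:
        symbols.add(lhs)
        symbols.update(rhs)

    flagged = {goal}                      # step 2: flag the goal symbol
    changed = True
    while changed:                        # step 4: repeat while a pass flags something new
        changed = False
        for lhs, rhs in productions:      # step 3: first production to last
            if lhs in flagged:
                for sym in rhs:
                    if sym not in flagged:
                        flagged.add(sym)
                        changed = True
    return symbols - flagged


# Grammar 1 from the examples, with z as the goal symbol.
grammar1 = [
    ("z", ["b", "e"]),
    ("a", ["a", "e"]), ("a", ["e"]),
    ("b", ["c", "e"]), ("b", ["a", "f"]),
    ("c", ["c", "f"]),
    ("d", ["f"]),
]

# Grammar 3 from the examples, with G as the goal symbol.
grammar3 = [("G", ["A"]), ("Q", ["P", "R"]), ("P", ["Q"])]

print(unreachable_symbols(grammar1, "z"))   # {'d'}
```

Calling `unreachable_symbols(grammar3, "G")` returns the set containing Q, P and R, in agreement with Grammar 3.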
Algorithm: To determine which symbols do not derive a terminal string.
1) Make a fresh list of all the symbols, initially unflagged.
2) Flag all the terminals.
3) Go through the grammar from the first production to the last. If A → x1 x2 … xn is any such production, then if x1, x2, …, xn are all flagged, flag A.
4) Were any new symbols flagged in step 3? If so, go back to step 3. If not, all symbols not flagged at this stage do not derive a terminal string.
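A minimal Python sketch of this second flagging algorithm is given below. The function name `non_terminating_symbols` is hypothetical, and the sketch assumes that a terminal is any symbol that never appears on the left-hand side of a production; the slides do not say how terminals are identified.

```python
def non_terminating_symbols(productions):
    """Return the symbols that do not derive a terminal string.

    `productions` is a list of (lhs, rhs) pairs, where rhs is a list of symbols.
    Assumption: a symbol with no production of its own is a terminal.
    """
    nonterminals = {lhs for lhs, _ in productions}
    symbols = set(nonterminals)           # step 1: list all the symbols
    for _, rhs in productions:
        symbols.update(rhs)

    flagged = symbols - nonterminals      # step 2: flag all the terminals
    changed = True
    while changed:                        # step 4: repeat while a pass flags something new
        changed = False
        for lhs, rhs in productions:      # step 3: first production to last
            if lhs not in flagged and all(sym in flagged for sym in rhs):
                flagged.add(lhs)
                changed = True
    return symbols - flagged


# The introductory grammar: G -> R T, R -> R a, T -> b.
intro = [("G", ["R", "T"]), ("R", ["R", "a"]), ("T", ["b"])]
print(sorted(non_terminating_symbols(intro)))   # ['G', 'R']
```

R is reported because its only production is R → R a, which can never get rid of R. G is reported as well, since its only production mentions R.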
Definitions:
1) A ⇒* α means you can derive α from A, or α = A.
2) A symbol A is said to vanish if A ⇒* ε.
3) A production of the form X → ε is called an ε-production.
Algorithm: To determine which symbols of a grammar vanish.
1) Make a list of symbols, initially unflagged.
2) Flag all the left hand sides of ε-productions.
3) Go through the grammar from the 1st production to the last. If A → x1 x2 … xn is any such production, then if x1, x2, …, xn are all flagged, then flag A.
4) Were any new symbols flagged in step 3? If so, go back to step 3, else stop.
The flagged symbols are those which vanish.
Example: Try the algorithm on the following grammar.
Grammar G4
A → b Y D | A Y c
Y → E F | ε
D → g h i
F → N O | Y N
N → ε
O → Y N
E → Y O N Y
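One way to check a hand computation on G4 is the following Python sketch of the vanishing-symbol algorithm. The function name `vanishing_symbols` and the encoding of an ε-production as an empty right-hand side are assumptions made for this illustration.

```python
def vanishing_symbols(productions):
    """Return the set of symbols A with A =>* epsilon.

    `productions` is a list of (lhs, rhs) pairs; an empty rhs stands for epsilon.
    """
    # Step 2: flag the left hand sides of the epsilon-productions.
    flagged = {lhs for lhs, rhs in productions if not rhs}
    changed = True
    while changed:                        # step 4: repeat while a pass flags something new
        changed = False
        for lhs, rhs in productions:      # step 3: first production to last
            if lhs not in flagged and all(sym in flagged for sym in rhs):
                flagged.add(lhs)
                changed = True
    return flagged


# Grammar G4, one entry per alternative.
g4 = [
    ("A", ["b", "Y", "D"]), ("A", ["A", "Y", "c"]),
    ("Y", ["E", "F"]), ("Y", []),
    ("D", ["g", "h", "i"]),
    ("F", ["N", "O"]), ("F", ["Y", "N"]),
    ("N", []),
    ("O", ["Y", "N"]),
    ("E", ["Y", "O", "N", "Y"]),
]
print(sorted(vanishing_symbols(g4)))   # ['E', 'F', 'N', 'O', 'Y']
```

A and D are never flagged, because every one of their productions contains a terminal.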
Definitions: An ε-production is one of the form A -> ε. If A, in this case, is the goal symbol, the production is referred to as a null goal production.
Theorem: For every cfg G, there exists a cfg G', such that L(G') = L(G), and G' has no ε-productions, with the exception that if ε ∈ L(G), then G' contains a null goal production.
Proof. G' can be formed from G as follows:
1. Discard all the ε-productions.
2. For each production of G, add to the grammar all possible productions that can be formed from it by omitting from its rhs some subset of those symbols (if any) that vanish, but never adding a production with an empty rhs.
3. Remove all productions with useless symbols.
4. If the goal symbol of G vanishes, add a null goal production.
Example
G -> AVw
A -> aA | a
V -> rUcW | ε
U -> ε
W -> ε
First of all, determine which symbols vanish: U, V, W.
1) Removing the ε-productions gives:
G -> AVw
A -> aA | a
V -> rUcW
2) Considering G -> AVw in step 2 of the algorithm, we add to the grammar
G -> Aw
Considering V -> rUcW, we add
V -> rc
V -> rUc
V -> rcW
3) W, U are now useless symbols, so leaving out all productions with W, U, we get:
G -> AVw | Aw
A -> aA | a
V -> rc
4) G does not vanish, so no null goal production is added.
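The four steps of the construction can be sketched in Python and run on the example above. This is an illustrative sketch, not a definitive implementation. The name `eliminate_epsilon` is hypothetical, an ε-production is encoded as an empty tuple, and any symbol with no production in the original grammar is assumed to be a terminal.

```python
from itertools import combinations


def eliminate_epsilon(productions, goal):
    """Return productions for a grammar G' without epsilon-productions.

    `productions` is a list of (lhs, rhs) pairs with rhs a tuple of symbols;
    an empty tuple stands for epsilon.
    """
    nonterminals = {lhs for lhs, _ in productions}

    # Which symbols vanish? (same flagging idea as the earlier algorithm)
    vanish, changed = set(), True
    while changed:
        changed = False
        for lhs, rhs in productions:
            if lhs not in vanish and all(s in vanish for s in rhs):
                vanish.add(lhs)
                changed = True

    result = []
    for lhs, rhs in productions:
        if not rhs:
            continue                      # step 1: discard epsilon-productions
        # Step 2: omit every subset of the vanishing symbols in the rhs.
        spots = [i for i, s in enumerate(rhs) if s in vanish]
        for k in range(len(spots) + 1):
            for drop in combinations(spots, k):
                new_rhs = tuple(s for i, s in enumerate(rhs) if i not in drop)
                if new_rhs and (lhs, new_rhs) not in result:
                    result.append((lhs, new_rhs))

    # Step 3a: drop productions using symbols that derive no terminal string.
    productive, changed = set(), True
    while changed:
        changed = False
        for lhs, rhs in result:
            ok = all(s in productive or s not in nonterminals for s in rhs)
            if lhs not in productive and ok:
                productive.add(lhs)
                changed = True
    result = [(l, r) for l, r in result
              if all(s in productive or s not in nonterminals for s in r)]

    # Step 3b: drop productions whose lhs is not reachable from the goal.
    reachable, changed = {goal}, True
    while changed:
        changed = False
        for lhs, rhs in result:
            if lhs in reachable:
                for s in rhs:
                    if s not in reachable:
                        reachable.add(s)
                        changed = True
    result = [(l, r) for l, r in result if l in reachable]

    if goal in vanish:                    # step 4: null goal production
        result.append((goal, ()))
    return result


example = [
    ("G", ("A", "V", "w")),
    ("A", ("a", "A")), ("A", ("a",)),
    ("V", ("r", "U", "c", "W")), ("V", ()),
    ("U", ()), ("W", ()),
]
for lhs, rhs in eliminate_epsilon(example, "G"):
    print(lhs, "->", " ".join(rhs))
```

On this input the sketch prints G -> A V w, G -> A w, A -> a A, A -> a and V -> r c, which matches the grammar obtained by hand above.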
Exercise. Apply the above theorem to eliminate ε-productions from:
V -> Bc
B -> RT | NM
N -> ε
M -> KK
K -> ε