Presentation on theme: "SIMPLIFYING GRAMMARS Definition: A useless symbol of a context-free"— Presentation transcript:
1 SIMPLIFYING GRAMMARS
Definition: A useless symbol of a context-free grammar is one which does not occur in the derivation of any sentence of that grammar.
For example:
G → RT
R → Ra
T → b
Here R is useless.
2 Clearly a symbol is useless if and only if either:
a) we cannot derive any string containing it from the goal symbol, and/or
b) we cannot derive a terminal string from that symbol.
Notation: a) is expressed by saying that the symbol is not reachable from the goal symbol. b) is expressed by saying that the symbol does not derive a terminal string.
3 Algorithm: To find all those symbols that are not reachable from the goal symbol.
1) Make a list of all the grammar symbols, all initially unflagged.
2) Flag the goal symbol.
3) Go through the grammar from the 1st production to the last. If A → x1x2…xn is one of these productions and A is flagged, then flag x1, x2, …, xn (those not already flagged).
4) Were any new symbols flagged during the iteration of step 3? If so, repeat step 3 again, otherwise stop. Any symbol that has not been flagged at this stage is not reachable from the goal symbol.
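The flagging procedure above can be sketched in Python. This is a minimal sketch, not a definitive implementation: the function name `reachable_symbols`, the representation of a grammar as a list of (left-hand side, right-hand side) pairs, and the sample data `grammar_1` with goal symbol z are all assumptions made for illustration.

```python
def reachable_symbols(productions, goal):
    """Return the set of symbols reachable from `goal`.

    `productions` is a list of (lhs, rhs) pairs; rhs is a list of symbols.
    (This representation is an assumption of the sketch.)
    """
    flagged = {goal}                      # step 2: flag the goal symbol
    changed = True
    while changed:                        # step 4: repeat until nothing new is flagged
        changed = False
        for lhs, rhs in productions:      # step 3: first production to last
            if lhs in flagged:
                for symbol in rhs:
                    if symbol not in flagged:
                        flagged.add(symbol)
                        changed = True
    return flagged


# Illustrative grammar with goal symbol z (lower-case letters are its symbols).
grammar_1 = [
    ("z", ["b", "e"]),
    ("a", ["a", "e"]), ("a", ["e"]),
    ("b", ["c", "e"]), ("b", ["a", "f"]),
    ("c", ["c", "f"]),
    ("d", ["f"]),
]
all_symbols = {"z", "a", "b", "c", "d", "e", "f"}
unreachable = all_symbols - reachable_symbols(grammar_1, "z")
print(sorted(unreachable))  # ['d']
```

Any symbol left out of the returned set is not reachable from the goal symbol.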
4 EXAMPLE
Grammar 1
z → b e
a → a e | e
b → c e | a f
c → c f
d → f
d is not reachable
5 Grammar 2
Z → E + T
E → E | S + F | T
F → F | F P | P
P → G
G → G | G G | F
T → T * i | i
Q → E | E + F | T | S
S → i
Q is not reachable
6 Grammar 3
G → A
Q → P R
P → Q
Q, P, R are not reachable
7 Algorithm: To determine which symbols do not derive a terminal string.
1) Make a fresh list of all the symbols, initially unflagged.
2) Flag all the terminals.
3) Go through the grammar from the first production to the last. If A → x1x2…xn is any such production, then if x1, x2, …, xn are all flagged, flag A.
4) Were any new symbols flagged in step 3? If so, go back to step 3. If not, all symbols not flagged at this stage do not derive a terminal string.
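The same flagging idea, run from the terminals upwards, can be sketched as follows. The function name `terminal_deriving_symbols`, the list-of-pairs grammar representation and the small sample grammar are assumptions for illustration only; they do not come from the slides.

```python
def terminal_deriving_symbols(productions, terminals):
    """Return every symbol that derives a terminal string.

    `productions` is a list of (lhs, rhs) pairs; rhs is a list of symbols.
    An empty rhs counts as "all flagged", so its lhs is flagged at once.
    """
    flagged = set(terminals)              # step 2: flag all the terminals
    changed = True
    while changed:                        # step 4: repeat while something new was flagged
        changed = False
        for lhs, rhs in productions:      # step 3: first production to last
            if lhs not in flagged and all(s in flagged for s in rhs):
                flagged.add(lhs)
                changed = True
    return flagged


# Hypothetical grammar: S -> A B | a,  A -> a A,  B -> b
sample = [
    ("S", ["A", "B"]), ("S", ["a"]),
    ("A", ["a", "A"]),
    ("B", ["b"]),
]
productive = terminal_deriving_symbols(sample, {"a", "b"})
print(sorted({"S", "A", "B"} - productive))  # ['A']
```

Here A is never flagged because its only production mentions A itself, so A does not derive a terminal string.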
9 Definitions:
1) A =>* α means you can derive α from A or α = A
2) A symbol A is said to vanish if A =>* ε
3) A production of the form χ → ε is called an ε-production
Note that the textbook uses λ to denote the empty string, whereas these slides employ ε for this purpose
10 Algorithm: To determine which symbols of a grammar vanish.
1) Make a list of symbols, initially unflagged.
2) Flag all the left hand sides of ε-productions.
3) Go through the grammar from 1st production to last. If A → x1x2…xn is any such production, then if x1, x2, …, xn are all flagged, then flag A.
4) Were any new symbols flagged in step 3? If so, go back to step 3, else stop. The flagged symbols are those which vanish.
11 Example: Try the algorithm on the following grammar.
Grammar G4
A → b Y D | A Y c
Y → E F | ε
D → g h i
F → N O | Y N
N → ε
O → Y N
E → Y O N Y
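For checking a hand-worked answer, the vanishing-symbol algorithm can be sketched in Python and run on G4. The function name `vanishing_symbols` and the encoding of an ε-production as an empty right-hand-side list are assumptions of this sketch.

```python
def vanishing_symbols(productions):
    """Return the set of symbols that vanish (derive the empty string).

    `productions` is a list of (lhs, rhs) pairs; an empty rhs list stands
    for an ε-production (an encoding chosen for this sketch).
    """
    # Step 2: flag the left hand sides of ε-productions.
    flagged = {lhs for lhs, rhs in productions if len(rhs) == 0}
    changed = True
    while changed:                        # step 4: repeat while something new was flagged
        changed = False
        for lhs, rhs in productions:      # step 3: first production to last
            if lhs not in flagged and all(s in flagged for s in rhs):
                flagged.add(lhs)
                changed = True
    return flagged


g4 = [
    ("A", ["b", "Y", "D"]), ("A", ["A", "Y", "c"]),
    ("Y", ["E", "F"]), ("Y", []),
    ("D", ["g", "h", "i"]),
    ("F", ["N", "O"]), ("F", ["Y", "N"]),
    ("N", []),
    ("O", ["Y", "N"]),
    ("E", ["Y", "O", "N", "Y"]),
]
print(sorted(vanishing_symbols(g4)))  # ['E', 'F', 'N', 'O', 'Y']
```

A and D stay unflagged because each of their productions contains a terminal, and terminals are never flagged by this algorithm.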
12 Defns:
An ε-production is one of the form A → ε.
If A, in this case, is the goal symbol, the production is referred to as a null goal production.
Theorem:
For every cfg G, there exists a cfg G’, such that L(G’) = L(G), and G’ has no ε-productions, with the exception that if ε ∈ L(G), then G’ contains a null goal production.
13 Proof. G’ can be formed from G as follows:
1. Discard all the ε-productions.
2. For each production of G, add to the grammar all possible productions that can be formed from it by omitting from its rhs some subset of those symbols (if any) that vanish.
3. Remove all productions with useless symbols.
4. If the goal symbol of G vanishes, add a null goal production.
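Steps 1, 2 and 4 of this construction can be sketched in Python. This is a sketch under stated assumptions: a grammar is a dict from each non-terminal to a set of right-hand-side tuples, the empty tuple stands for ε, and the names `vanishing_symbols` and `eliminate_epsilon` are hypothetical. Step 3 (removing useless symbols) is left out, and the sketch assumes that a production whose whole right-hand side is omitted is not added, since that would reintroduce an ε-production.

```python
from itertools import product


def vanishing_symbols(grammar):
    """Symbols that derive ε; `grammar` maps lhs -> set of rhs tuples."""
    vanish = set()
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            if lhs not in vanish and any(
                all(s in vanish for s in rhs) for rhs in alternatives
            ):
                vanish.add(lhs)
                changed = True
    return vanish


def eliminate_epsilon(grammar, goal):
    """Apply steps 1, 2 and 4; removing useless symbols (step 3) is not done here."""
    vanish = vanishing_symbols(grammar)
    result = {}
    for lhs, alternatives in grammar.items():
        new_alternatives = set()
        for rhs in alternatives:
            # Each vanishing symbol may be kept or omitted; others are always kept.
            choices = [((s,), ()) if s in vanish else ((s,),) for s in rhs]
            for pick in product(*choices):
                candidate = tuple(sym for part in pick for sym in part)
                if candidate:             # steps 1 and 2: never emit an ε-production
                    new_alternatives.add(candidate)
        result[lhs] = new_alternatives
    if goal in vanish:
        result[goal].add(())              # step 4: null goal production
    return result


# Illustrative grammar: S -> ABaC, A -> BC, B -> b | ε, C -> D | ε, D -> d
g = {
    "S": {("A", "B", "a", "C")},
    "A": {("B", "C")},
    "B": {("b",), ()},
    "C": {("D",), ()},
    "D": {("d",)},
}
for lhs, alternatives in eliminate_epsilon(g, "S").items():
    print(lhs, "->", " | ".join("".join(rhs) for rhs in sorted(alternatives)))
```

Using `itertools.product` over keep/omit choices enumerates every subset of vanishing symbols in a right-hand side, which is what step 2 asks for.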
14 Example 1
G → AVw
A → aA | a
V → rUcW | ε
U → ε
W → ε
First of all, determine which symbols vanish: U, V, W.
15 1) Removing ε-productions gives:
G → AVw
A → aA | a
V → rUcW
2) Considering G → AVw in step 2 of the algorithm, we add to the grammar G → Aw.
Considering V → rUcW, we add V → rc, V → rUc, V → rcW.
3) W, U are now useless symbols, so leaving out all productions with W, U, we get:
G → AVw | Aw
A → aA | a
V → rc
16 EXAMPLE. Provide a grammar equivalent to the one below but without ε-productions.
S → ABaC
A → BC
B → b | ε
C → D | ε
D → d
Try working this out for yourself, before consulting the answer on the next slide. Note carefully that the symbol A is one of those that vanishes.
17 ANSWER
S → ABaC | ABa | AaC | Aa | BaC | Ba | aC | a
A → BC | B | C
B → b
C → D
D → d
18 Defn. A unit production of a grammar is one of the form A → B where A, B are both non-terminals.
Theorem. For any context-free grammar G, there exists a cfg G’ s.t. L(G’) = L(G) and G’ does not contain any unit productions.
19 Proof. G’ can be formed from G as follows:
1. Eliminate ε-productions from G to form G* (with possibly a null goal production).
2. If A is the left hand side of a unit production and B is any symbol that can be derived from A, and B → α is any production with B as left hand side where α is not a single non-terminal, then add to the grammar A → α. By step 1, any derivation of B from A must consist entirely of a sequence of non-terminals.
3. Do step 2 for all symbols which are the left hand side of a unit production.
20 To find all single symbols that can be derived from a symbol A, consider the derivation tree in which no symbol occurs more than once, e.g.:
[Tree diagram: root A, with further nodes B, D, E, C, F, N, M]
If, say, M => B, we do not include it, as B already occurs in the tree. Hence the depth of the tree is <= the number of unit productions.
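The search for single symbols derivable from A, and the unit-production removal that uses it, can be sketched as below. The sketch assumes ε-productions have already been eliminated, that a grammar is a dict from each non-terminal to a set of right-hand-side tuples, and that the dict keys are exactly the non-terminals. The names `unit_derivable` and `eliminate_units` are hypothetical. The `seen` set plays the role of the rule that no symbol occurs more than once in the tree.

```python
def unit_derivable(grammar, start):
    """Non-terminals derivable from `start` using unit productions only."""
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for rhs in grammar.get(current, ()):
            is_unit = len(rhs) == 1 and rhs[0] in grammar
            if is_unit and rhs[0] not in seen:   # skip symbols already in the tree
                seen.add(rhs[0])
                stack.append(rhs[0])
    seen.discard(start)
    return seen


def eliminate_units(grammar):
    """Add A -> α for every non-unit B -> α with B unit-derivable from A,
    then drop all unit productions."""
    def is_unit(rhs):
        return len(rhs) == 1 and rhs[0] in grammar

    result = {}
    for lhs, alternatives in grammar.items():
        kept = {rhs for rhs in alternatives if not is_unit(rhs)}
        for other in unit_derivable(grammar, lhs):
            kept |= {rhs for rhs in grammar[other] if not is_unit(rhs)}
        result[lhs] = kept
    return result


# Illustrative expression grammar: E -> E + T | T, T -> T * F | F, F -> ( E ) | a
expr = {
    "E": {("E", "+", "T"), ("T",)},
    "T": {("T", "*", "F"), ("F",)},
    "F": {("(", "E", ")"), ("a",)},
}
for lhs, alternatives in eliminate_units(expr).items():
    print(lhs, "->", " | ".join(" ".join(rhs) for rhs in sorted(alternatives)))
```

Useless symbols that may appear after this step are not removed by the sketch; that would be a separate pass.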
22 EXAMPLE Consider the grammar:
E → E + T | T
T → T * F | F
F → ( E ) | a
Since E => T and T → T * F, we add to the grammar E → T * F,
and since E => F and F → ( E ) | a, we add E → ( E ) and E → a.
Also, since T => F, we add T → ( E ) | a.
23 Discarding all unit productions then gives us:
E → E + T | T * F | ( E ) | a
T → T * F | ( E ) | a
F → ( E ) | a
24 EXAMPLE 3. “Remove” unit productions from:
S → Aa | B
B → A | bb
A → a | bc | B
ANSWER
S → Aa | bb | a | bc since S => B and S => A
B → bb | a | bc since B => A
A → a | bc | bb since A => B
But B is a useless symbol, so discard the productions involving B.
25 EXAMPLE 4. “Remove” unit productions from:
S → Aa | bb | a | bc | B
B → bb | a | bc
A → a | bc | bb
26 ANSWER
S → Aa | bb | bc | a
B → bb | a | bc
A → a | bc | bb
Again, B is a useless symbol, and so the productions involving it should be discarded.
27 Defn. A nice context free grammar is one:
a) without useless symbols,
b) without ε-productions, except possibly for a null goal production, and
c) without unit productions.
Notation. cfg stands for context free grammar, and ncfg stands for nice context free grammar.
Corollary. For every cfg G, there exists a ncfg G’, such that L(G’) = L(G).