# Chomsky Normal Form of CFGs: Definition, Purpose, Method of Construction

## Presentation transcript

1 Chomsky Normal Form of CFGs: Definition, Purpose, Method of Construction

2 Chomsky Normal Form: Purpose

- A construct used to establish properties of context-free languages (CFLs).
  - Every CFL without ε can be generated by a CFG in Chomsky normal form.
  - To show that a language without ε is a CFL, it is sufficient to show that it has a CFG in Chomsky normal form.
- Typical approach to closure properties.

3 Chomsky Normal Form: Definition

A context-free grammar (CFG) in which all productions are of the form A -> BC or A -> a, where A, B, and C are variables and a is a terminal.
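The definition can be checked mechanically. The sketch below is only an illustration: it assumes a grammar is stored as a Python dict mapping each variable to a list of right sides (tuples of symbols), and that a symbol counts as a variable exactly when it is a key of that dict. The name `is_cnf` and this representation are assumptions made here, not part of the slides.

```python
def is_cnf(grammar):
    """Return True if every production has the form A -> BC or A -> a.

    `grammar` maps each variable to a list of right sides (tuples of symbols).
    A symbol is treated as a variable iff it is a key of `grammar`.
    """
    for bodies in grammar.values():
        for body in bodies:
            if len(body) == 1 and body[0] not in grammar:
                continue  # A -> a: a single terminal
            if len(body) == 2 and all(sym in grammar for sym in body):
                continue  # A -> BC: exactly two variables
            return False  # anything else (unit, epsilon, mixed, long) is not CNF
    return True
```

For example, `is_cnf({"S": [("A", "B")], "A": [("a",)], "B": [("b",)]})` holds, while a grammar containing the unit production S -> A is rejected.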

4 Chomsky Normal Form: method of construction

- Eliminate "useless" symbols
  - Variables or terminals that do not appear in any derivation of a terminal string from the start symbol
- Eliminate ε-productions
  - A -> ε
- Eliminate unit productions
  - A -> B for variables A and B

5 Chomsky Normal Form: method of construction - 2

- For each elimination task, a method will be defined recursively by an inductive proof.
- The order in which the tasks are performed is important.

6 Generating and Reachable Symbols

- X is generating if X =>* w for some terminal string w.
- If X is a terminal, then it can generate itself in zero steps.
- X is reachable if S =>* αXβ for some α and β (S is the start symbol).
- Any symbol that is not both generating and reachable is useless.

7 Induction to find generating variables

- Basis: If there is a production A -> w, where w is a terminal string, then A is generating.
- Induction: If there is a production A -> α, where α consists only of terminals and variables known to derive a terminal string, then A derives a terminal string; hence A is generating.

8 Algorithm to eliminate non-generating variables

1. Discover all variables that derive terminal strings.
2. For all other variables, remove all productions in which they appear, on either the LHS or the RHS of ->.

9 Example: finding generating variables

S -> AB | C, A -> aA | a, B -> bB, C -> c

- Basis: A and C are generating due to productions A -> a and C -> c.
- Induction: S is generating due to production S -> C.
- Eliminate B -> bB and S -> AB.
- Result: S -> C, A -> aA | a, C -> c
- Still have unreachable variables.
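The basis and induction above amount to a fixed-point computation. The following is a minimal sketch, assuming a grammar is a dict from variables to lists of right sides (tuples of symbols) and that every variable appears as a key, even if it has no productions. The function names are hypothetical.

```python
def generating_variables(grammar):
    """Return the set of variables that derive some terminal string."""
    generating = set()
    changed = True
    while changed:  # repeat the induction step until nothing new is found
        changed = False
        for head, bodies in grammar.items():
            if head in generating:
                continue
            for body in bodies:
                # Terminals (non-keys) are generating by definition.
                if all(s not in grammar or s in generating for s in body):
                    generating.add(head)
                    changed = True
                    break
    return generating


def remove_non_generating(grammar):
    """Drop every production that mentions a non-generating variable."""
    gen = generating_variables(grammar)
    return {
        head: [b for b in bodies
               if all(s not in grammar or s in gen for s in b)]
        for head, bodies in grammar.items()
        if head in gen
    }


# The grammar from the example: S -> AB | C, A -> aA | a, B -> bB, C -> c
example = {
    "S": [("A", "B"), ("C",)],
    "A": [("a", "A"), ("a",)],
    "B": [("b", "B")],
    "C": [("c",)],
}
cleaned = remove_non_generating(example)
```

On the example grammar this finds S, A, and C generating and leaves S -> C, A -> aA | a, C -> c, matching the result on the slide.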

10 Finding reachable symbols

- Basis: Obviously, the start symbol is reachable.
- Induction: If we can reach A, and there is a production A -> α, then we can reach all symbols of α.
- In the result from the previous slide:
  - S -> C, A -> aA | a, C -> c
- Only S and C are reachable.
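Reachability is a graph search from the start symbol. This sketch assumes the same dict-of-tuples grammar representation as an illustration; `reachable_symbols` is a hypothetical name.

```python
def reachable_symbols(grammar, start):
    """Return every symbol (variable or terminal) reachable from `start`."""
    reachable = {start}          # basis: the start symbol
    worklist = [start]
    while worklist:
        head = worklist.pop()
        # Terminals have no productions, so .get() yields nothing for them.
        for body in grammar.get(head, []):
            for sym in body:
                if sym not in reachable:   # induction: symbols of a reached body
                    reachable.add(sym)
                    worklist.append(sym)
    return reachable


# Result of the previous step: S -> C, A -> aA | a, C -> c
grammar_after_step = {
    "S": [("C",)],
    "A": [("a", "A"), ("a",)],
    "C": [("c",)],
}
found = reachable_symbols(grammar_after_step, "S")
```

Here `found` contains S, C, and the terminal c. Among the variables only S and C are reachable, so A and its productions can be dropped.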

11 Epsilon Productions

- Theorem: If L is a CFL that does not contain the empty string, then it has a CFG with no ε-productions that can be put in Chomsky normal form.
- A -> ε is clearly an ε-production.
- To eliminate all ε-productions, we must first discover the nullable variables, i.e. variables A such that A =>* ε.

12 Inductive definition of nullable symbols

- Basis: If there is a production A -> ε, then A is nullable.
- Induction: If there is a production A -> α, and all symbols of α are nullable, then A is nullable.

13 Example: Nullable Symbols

S -> AB, A -> aA | ε, B -> bB | A

- A is nullable because of A -> ε.
- B is nullable because of B -> A.
- S is nullable because of S -> AB.
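The inductive definition of nullable symbols can be sketched as another fixed-point loop. The representation is an assumption made for illustration: an ε right side is written as the empty tuple `()`, so the basis case falls out of the induction because `all()` over an empty body is true.

```python
def nullable_variables(grammar):
    """Return the set of variables A with A =>* epsilon."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in nullable:
                continue
            # A body is nullable when all of its symbols are nullable;
            # the empty body () stands for epsilon and is trivially nullable.
            if any(all(s in nullable for s in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable


# S -> AB, A -> aA | epsilon, B -> bB | A
example = {
    "S": [("A", "B")],
    "A": [("a", "A"), ()],
    "B": [("b", "B"), ("A",)],
}
result = nullable_variables(example)
```

On the example grammar, `result` is the set containing A, B, and S, as in the slide.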

14 Algorithm to eliminate ε-productions

- Identify all nullable symbols.
- Consider each production A -> X_1 … X_n that contains nullable symbols.
- Suppose A -> X_1 … X_n contains m …

15 Eliminating ε-productions

- The new CFG with no ε-productions consists of all families of productions derived from productions with nullable symbols,
- plus all productions from the original CFG that did not contain nullable symbols.

16 Example: Eliminating ε-productions

S -> ABC, A -> aA | ε, B -> bB | ε, C -> ε

- A, B, C, and S are all nullable.
- Productions S -> ABC | AB | AC | BC | A | B | C come from S -> ABC.
- Productions A -> aA | a come from A -> aA.
- Productions B -> bB | b come from B -> bB.
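The families of productions in this example can be generated mechanically. The sketch below uses the standard textbook construction, which is assumed here rather than taken from the slides: for each right side, emit every variant obtained by keeping or dropping each nullable symbol, and discard the empty variant. The function name and the dict-of-tuples grammar representation (with `()` for ε) are illustrative choices.

```python
from itertools import product


def eliminate_epsilon(grammar, nullable):
    """Return a grammar without epsilon-productions.

    `nullable` is the set of nullable variables, computed beforehand.
    """
    new_grammar = {}
    for head, bodies in grammar.items():
        variants = []
        for body in bodies:
            # A nullable symbol may be kept or dropped; others are always kept.
            choices = [((s,), ()) if s in nullable else ((s,),) for s in body]
            for combo in product(*choices):
                variant = tuple(sym for part in combo for sym in part)
                if variant and variant not in variants:  # never emit epsilon
                    variants.append(variant)
        new_grammar[head] = variants
    return new_grammar


# S -> ABC, A -> aA | epsilon, B -> bB | epsilon, C -> epsilon
example = {
    "S": [("A", "B", "C")],
    "A": [("a", "A"), ()],
    "B": [("b", "B"), ()],
    "C": [()],
}
result = eliminate_epsilon(example, {"S", "A", "B", "C"})
```

For this grammar, `result["S"]` holds the seven right sides ABC, AB, AC, BC, A, B, C, and `result["C"]` is empty because the only production of C was C -> ε.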

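17 Eliminating ε-productions, continued

S -> ABC, A -> aA | ε, B -> bB | ε, C -> ε

- No contribution from the original CFG: every original production either contains a nullable symbol or is an ε-production.
- C is not generating, since its only production C -> ε has been removed.
- Eliminate C from the productions of the new CFG:
  - S -> ABC | AB | AC | BC | A | B | C
  - A -> aA | a
  - B -> bB | b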

18 Define Unit Productions

- A unit production is a production whose right side consists of exactly one variable.
- A -> a is not a unit production if a is a terminal.
- Elimination by expansion is the most common approach.

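19 Eliminate by expansion

- In the CFG defined by
  - E -> T | E+T
  - T -> F | T*F
  - F -> I | (E)
  - I -> a | Ia
- E -> T is eliminated by expanding T: E -> F | T*F | E+T
- E -> F is eliminated by expanding F: E -> I | (E) | T*F | E+T
- E -> I is eliminated by expanding I: E -> a | Ia | (E) | T*F | E+T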

20 Eliminate by expansion

- Will not work on cycles of unit productions:
  - A -> B
  - B -> C
  - C -> A
- Alternative: find all pairs (A,B) such that A =>* B by a sequence of unit productions.
- Works in all cases.

21 Alternative to expansion in eliminating unit productions

- Basic idea: If A =>* B by a series of unit productions, and B -> α is a non-unit production, then add the production A -> α and drop the unit productions.
- An example follows on the next slide.

22 Example of basic idea

- In the CFG defined by
  - E -> T | E+T
  - T -> F | T*F
  - F -> I | (E)
  - I -> a | Ia
- E =>* I by the series of unit productions E -> T, T -> F, F -> I.
- I -> a is a non-unit production.
- Replace by E -> a.
- E -> a | Ia | (E) | T*F | E+T (same as the expansion method)

23 Pair search defined by induction

- Find all pairs (A,B) such that A =>* B by a sequence of unit productions only.
- Basis: A =>* A, therefore (A,A).
- Induction: If we have found (A,B), and B -> C is a unit production, then add (A,C).

24 Example of pair search

- In the CFG defined by
  - E -> T | E+T
  - T -> F | T*F
  - F -> I | (E)
  - I -> a | Ia
- Obviously (E,T), (T,F), (F,I), in addition to the basis pairs (E,E), (T,T), (F,F), (I,I).
- (T,I), (E,F), and (E,I) also.
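The pair search and the "basic idea" from the earlier slides can be combined into one routine. This is a sketch under the same assumed representation (dict from variables to lists of tuples); `unit_pairs` and `eliminate_unit_productions` are hypothetical names. Because found pairs are kept in a set, the search also terminates on cycles such as A -> B, B -> C, C -> A.

```python
def unit_pairs(grammar):
    """Return all (A, B) with A =>* B using unit productions only."""
    pairs = {(a, a) for a in grammar}            # basis: (A, A)
    worklist = list(pairs)
    while worklist:
        a, b = worklist.pop()
        for body in grammar[b]:
            if len(body) == 1 and body[0] in grammar:   # B -> C is a unit production
                pair = (a, body[0])
                if pair not in pairs:                    # induction: add (A, C)
                    pairs.add(pair)
                    worklist.append(pair)
    return pairs


def eliminate_unit_productions(grammar):
    """For each pair (A, B), give A every non-unit production of B."""
    new_grammar = {a: [] for a in grammar}
    for a, b in sorted(unit_pairs(grammar)):
        for body in grammar[b]:
            is_unit = len(body) == 1 and body[0] in grammar
            if not is_unit and body not in new_grammar[a]:
                new_grammar[a].append(body)
    return new_grammar


# E -> T | E+T, T -> F | T*F, F -> I | (E), I -> a | Ia
expr = {
    "E": [("T",), ("E", "+", "T")],
    "T": [("F",), ("T", "*", "F")],
    "F": [("I",), ("(", "E", ")")],
    "I": [("a",), ("I", "a")],
}
pairs = unit_pairs(expr)
no_units = eliminate_unit_productions(expr)
```

On the expression grammar this yields ten pairs (the four basis pairs plus (E,T), (T,F), (F,I), (E,F), (T,I), (E,I)), and the productions for E become E -> a | Ia | (E) | T*F | E+T, the same result as the expansion method.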

25 Cleaning up a Grammar

- Theorem: If L is a CFL, then there is a CFG for L – { ε } that has:
  1. No useless symbols.
  2. No ε-productions.
  3. No unit productions.
- That is, every right side of a production is either a single terminal or has length ≥ 2.

26 Clean-up continued

- Proof: Start with a CFG for L.
- Perform the following steps in order:
  1. Eliminate ε-productions. (Must be first: it can create unit productions and useless variables.)
  2. Eliminate unit productions.
  3. Eliminate variables that derive no terminal string.
  4. Eliminate variables not reached from the start symbol.

27 Chomsky Normal Form

- A CFG is said to be in Chomsky Normal Form (CNF) if every production is of one of these two forms:
  1. A -> BC (right side is two variables).
  2. A -> a (right side is a single terminal).
- Theorem: If L is a CFL, then L – { ε } has a CFG in CNF.

28 Proof by construction

- Step 1: "Clean" the grammar, so every production has a right side that is either a single terminal or of length ≥ 2.
- Step 2: For each right side that is not a single terminal, make the right side all variables.
  - For each terminal a, create a new variable A_a and the production A_a -> a (not a unit production).
  - Replace a by A_a in the right sides of productions.

29 Example: Step 2

- Consider the production A -> BcDe.
- We need variables A_c and A_e, with productions A_c -> c and A_e -> e.
  - Note: you create at most one variable for each terminal, and use it everywhere it is needed.
- Replace A -> BcDe by A -> B A_c D A_e.
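Step 2 can be sketched as follows, assuming the grammar has already been cleaned (so every right side of length 1 is a single terminal) and that the generated names of the form `A_c` do not clash with existing variables. Both the naming scheme and the function name `lift_terminals` are assumptions for illustration.

```python
def lift_terminals(grammar):
    """Replace terminals in right sides of length >= 2 by new variables."""
    new_grammar = {head: [] for head in grammar}
    terminal_var = {}                      # terminal -> its one dedicated variable
    for head, bodies in grammar.items():
        for body in bodies:
            if len(body) == 1:             # A -> a stays as it is
                new_grammar[head].append(body)
                continue
            replaced = []
            for sym in body:
                if sym in grammar:         # already a variable
                    replaced.append(sym)
                else:                      # terminal: reuse or create A_<terminal>
                    terminal_var.setdefault(sym, f"A_{sym}")
                    replaced.append(terminal_var[sym])
            new_grammar[head].append(tuple(replaced))
    for terminal, var in terminal_var.items():
        new_grammar[var] = [(terminal,)]   # A_a -> a
    return new_grammar


# A -> BcDe, with placeholder productions for B and D
example = {"A": [("B", "c", "D", "e")], "B": [("b",)], "D": [("d",)]}
result = lift_terminals(example)
```

For the slide's production, `result["A"]` becomes the single right side B A_c D A_e, and the new productions A_c -> c and A_e -> e are added once each.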

30 CNF construction: final step

- Step 3: Break right sides longer than 2 into a chain of productions with right sides of two variables.
- Example: A -> BCDE is replaced by A -> BF, F -> CG, and G -> DE.
  - F and G must be used nowhere else.
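The chain construction in Step 3 can be sketched like this. Fresh variables are named `X1`, `X2`, … here instead of F and G; that naming, the function name `binarize`, and the dict-of-tuples representation are assumptions. The inner loop skips any name already in use so that each new variable appears nowhere else.

```python
from itertools import count


def binarize(grammar):
    """Break right sides longer than 2 into chains of two-symbol right sides."""
    new_grammar = {head: [] for head in grammar}
    ids = count(1)
    for head, bodies in grammar.items():
        for body in bodies:
            current, rest = head, body
            while len(rest) > 2:
                fresh = f"X{next(ids)}"
                while fresh in new_grammar:        # never reuse an existing name
                    fresh = f"X{next(ids)}"
                new_grammar[fresh] = []
                # current -> (first symbol) (fresh variable for the remainder)
                new_grammar[current].append((rest[0], fresh))
                current, rest = fresh, rest[1:]
            new_grammar[current].append(rest)      # last link, or a short body
    return new_grammar


# A -> BCDE, with placeholder productions for B, C, D, E
example = {
    "A": [("B", "C", "D", "E")],
    "B": [("b",)], "C": [("c",)], "D": [("d",)], "E": [("e",)],
}
result = binarize(example)
```

With this input, A -> BCDE becomes A -> B X1, X1 -> C X2, X2 -> D E, which is the slide's chain with X1 and X2 playing the roles of F and G.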

31 Example (text p. 266)

S -> AB, A -> aAA | ε, B -> bBB | ε

32 Assignment 11, Due 11-19-14 Exercise 7.1.2 text p 275 and 277
