Closure Properties Lemma: Let A1 and A2 be two context-free languages; then the union A1 ∪ A2 is context-free as well. Proof: Assume that the two grammars are G1 = (V1, Σ, R1, S1) and G2 = (V2, Σ, R2, S2), with V1 and V2 disjoint (rename variables if necessary). Construct a third grammar G3 = (V3, Σ, R3, S3) with V3 = V1 ∪ V2 ∪ { S3 } (a new start variable) and R3 = R1 ∪ R2 ∪ { S3 -> S1 | S2 }. It follows that L(G3) = L(G1) ∪ L(G2).
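A minimal Python sketch of this construction (the dict-of-productions representation and the names union_cfg, g1, g2, S3 are illustrative assumptions, not from the slides; right-hand sides are tuples of symbols and the empty tuple stands for ε):

```python
def union_cfg(g1, start1, g2, start2, new_start="S3"):
    """Build a grammar for L(G1) ∪ L(G2) by adding S3 -> S1 | S2.
    Assumes the variable sets of g1 and g2 are disjoint (rename first if not)."""
    g3 = {}
    g3.update(g1)
    g3.update(g2)
    g3[new_start] = [(start1,), (start2,)]
    return g3

# Example: G1 generates a^n b^n, G2 generates c^m d^m.
g1 = {"S1": [("a", "S1", "b"), ()]}   # () denotes the empty string ε
g2 = {"S2": [("c", "S2", "d"), ()]}
g3 = union_cfg(g1, "S1", g2, "S2")    # L(G3) = L(G1) ∪ L(G2)
```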

Intersection, Complement? Let again A1 and A2 be two context-free languages. One can prove that, in general, the intersection A1 ∩ A2 and the complement Ā1 = Σ* \ A1 are not necessarily context-free languages.

Intersection, Complement? Proof for complement: The language L = { x#y | x, y ∈ {a, b}*, x ≠ y } IS context-free. The complement of this language is L' = { w | w has no # symbol } ∪ { w | w has two or more # symbols } ∪ { w#w | w ∈ {a,b}* }. We can show that L' is NOT context-free: intersecting L' with the regular language {a,b}*#{a,b}* gives { w#w | w ∈ {a,b}* }, which is not context-free, while (as noted below) CFL's are closed under intersection with regular languages.

Context-free languages are NOT closed under intersection. Proof by counterexample: Recall that in an earlier slide in this lecture, we showed that L = { a^n b^n c^n | n ≥ 0 } is NOT context-free. Let A = { a^n b^n c^m | n, m ≥ 0 } and B = { a^n b^m c^m | n, m ≥ 0 }. It is easy to see that both A and B are context-free (design CFG's), yet A ∩ B = L. This shows that CFL's are not closed under intersection.
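For concreteness, one possible pair of grammars (an illustration, not from the original slides): A = { a^n b^n c^m } is generated by S -> T C, T -> a T b | ε, C -> c C | ε, and B = { a^n b^m c^m } is generated by S -> D U, D -> a D | ε, U -> b U c | ε.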

Intersection with regular languages: If L is a CFL and R is a regular language, then L ∩ R is context-free.

Proof of Theorem 7.27 (sketch): run a PDA for L and a DFA for R in parallel. The states of the product machine are pairs of a PDA state and a DFA state, the stack is the PDA's stack, and the product accepts exactly when both components accept; the result is again a PDA, so L ∩ R is context-free.

Normal Forms for CFG's: Eliminating Useless Variables, Removing Epsilon, Removing Unit Productions, Chomsky Normal Form.

Variables That Derive Nothing. Consider: S -> AB, A -> aA | a, B -> AB. Although A derives all strings of a's, B derives no terminal strings (can you prove this fact?). Thus, S derives nothing, and the language is empty.

Testing Whether a Variable Derives Some Terminal String. Basis: If there is a production A -> w, where w has no variables, then A derives a terminal string. Induction: If there is a production A -> α, where α consists only of terminals and variables known to derive a terminal string, then A derives a terminal string.

Testing (2). Eventually, we can find no more variables. An easy induction on the order in which variables are discovered shows that each one truly derives a terminal string. Conversely, any variable that derives a terminal string will be discovered by this algorithm.
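A minimal Python sketch of this fixpoint computation (the dict-of-productions representation and the name generating_variables are illustrative assumptions; right-hand sides are tuples of symbols):

```python
def generating_variables(grammar, terminals):
    """Variables that derive some terminal string: iterate the basis/induction
    rules above until no new variable can be added."""
    gen = set()
    changed = True
    while changed:
        changed = False
        for var, bodies in grammar.items():
            if var in gen:
                continue
            # A body whose symbols are all terminals or already-discovered
            # variables witnesses that var derives a terminal string.
            if any(all(s in terminals or s in gen for s in body) for body in bodies):
                gen.add(var)
                changed = True
    return gen
```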

Proof of Converse. The proof is an induction on the height of the least-height parse tree by which a variable A derives a terminal string. Basis: Height = 1. The tree is a root A whose children are terminals a1 … an, i.e., there is a production A -> a1…an. Then the basis of the algorithm tells us that A will be discovered.

Induction for Converse. Assume the IH for parse trees of height < h, and suppose A derives a terminal string via a parse tree of height h whose root uses a production A -> X1…Xn, where each Xi derives a terminal string wi by a shorter tree. By the IH, those Xi's that are variables are discovered. Thus, A will also be discovered, because it has a right side of terminals and/or discovered variables.

Algorithm to Eliminate Variables That Derive Nothing: 1. Discover all variables that derive terminal strings. 2. For all other variables, remove all productions in which they appear either on the left or the right.

Example: Eliminate Variables. S -> AB | C, A -> aA | a, B -> bB, C -> c. Basis: A and C are identified because of A -> a and C -> c. Induction: S is identified because of S -> C. Nothing else can be identified. Result: S -> C, A -> aA | a, C -> c.
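Continuing the sketch above (reusing generating_variables; the helper name eliminate_nongenerating is illustrative), the elimination step drops every production that mentions a non-generating variable, and on this example it reproduces the result shown:

```python
def eliminate_nongenerating(grammar, terminals):
    gen = generating_variables(grammar, terminals)
    return {var: [body for body in bodies
                  if all(s in terminals or s in gen for s in body)]
            for var, bodies in grammar.items() if var in gen}

g = {"S": [("A", "B"), ("C",)],
     "A": [("a", "A"), ("a",)],
     "B": [("b", "B")],
     "C": [("c",)]}
print(eliminate_nongenerating(g, {"a", "b", "c"}))
# {'S': [('C',)], 'A': [('a', 'A'), ('a',)], 'C': [('c',)]}  -- B is gone
```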

Unreachable Symbols. Another way a terminal or variable deserves to be eliminated is if it cannot appear in any derivation from the start symbol. Basis: We can reach S (the start symbol). Induction: If we can reach A, and there is a production A -> α, then we can reach all symbols of α.

Unreachable Symbols (2). Easy inductions in both directions show that when we can discover no more symbols, we have all and only the symbols that appear in derivations from S. Algorithm: Remove from the grammar all symbols not reachable from S and all productions that involve these symbols.
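A matching sketch for the reachability computation (same illustrative grammar representation; terminals simply have no entry in the dict):

```python
def reachable_symbols(grammar, start):
    """All symbols (variables and terminals) reachable from the start symbol."""
    reached = {start}
    frontier = [start]
    while frontier:
        sym = frontier.pop()
        for body in grammar.get(sym, []):   # terminals have no productions
            for s in body:
                if s not in reached:
                    reached.add(s)
                    frontier.append(s)
    return reached

def eliminate_unreachable(grammar, start):
    keep = reachable_symbols(grammar, start)
    # If a variable is reachable, every symbol in its bodies is reachable too.
    return {var: list(bodies) for var, bodies in grammar.items() if var in keep}
```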

Eliminating Useless Symbols. A symbol is useful if it appears in some derivation of some terminal string from the start symbol. Otherwise, it is useless. Eliminate all useless symbols by: 1. Eliminate symbols that derive no terminal string. 2. Eliminate unreachable symbols.

Example: Useless Symbols (2). S -> AB, A -> C, C -> c, B -> bB. Here B derives no terminal string, so step (1) removes B and the production S -> AB, after which A, C, and c become unreachable and are removed by step (2). If we eliminated unreachable symbols first, we would find everything is reachable, and A, C, and c would never get eliminated.

Why It Works. After step (1), every symbol remaining derives some terminal string. After step (2), the only symbols remaining are reachable from S. In addition, they still derive a terminal string, because such a derivation can only involve symbols reachable from S.

Epsilon Productions. We can almost avoid using productions of the form A -> ε (called ε-productions). The problem is that ε cannot be in the language of any grammar that has no ε-productions. Theorem: If L is a CFL, then L - {ε} has a CFG with no ε-productions.

Nullable Symbols. To eliminate ε-productions, we first need to discover the nullable variables, i.e., the variables A such that A =>* ε. Basis: If there is a production A -> ε, then A is nullable. Induction: If there is a production A -> α, and all symbols of α are nullable, then A is nullable.

Example: Nullable Symbols. S -> AB, A -> aA | ε, B -> bB | A. Basis: A is nullable because of A -> ε. Induction: B is nullable because of B -> A. Then, S is nullable because of S -> AB.
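A minimal sketch of the nullable-variable fixpoint in the same illustrative representation (the empty tuple stands for ε; the name nullable_variables is an assumption, not from the slides):

```python
def nullable_variables(grammar):
    """Variables A with A =>* ε, computed by the basis/induction rules above."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for var, bodies in grammar.items():
            if var in nullable:
                continue
            # An empty body (A -> ε) or a body of all-nullable symbols qualifies.
            if any(all(s in nullable for s in body) for body in bodies):
                nullable.add(var)
                changed = True
    return nullable

g = {"S": [("A", "B")], "A": [("a", "A"), ()], "B": [("b", "B"), ("A",)]}
print(nullable_variables(g))   # {'A', 'B', 'S'}
```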

Eliminating ε-Productions. Key idea: turn each production A -> X1…Xn into a family of productions. For each subset of nullable X's, there is one production with those eliminated from the right side "in advance." Exception: if all X's are nullable, do not make a production with ε as the right side.

Example: Eliminating ε-Productions. S -> ABC, A -> aA | ε, B -> bB | ε, C -> ε. A, B, C, and S are all nullable. New grammar: S -> ABC | AB | AC | BC | A | B | C, A -> aA | a, B -> bB | b. Note: C is now useless. Eliminate its productions.
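A sketch of this construction in Python (reusing the nullable_variables helper from the previous block; eliminate_epsilon is an illustrative name), which reproduces the new grammar above:

```python
from itertools import combinations

def eliminate_epsilon(grammar):
    """Replace each production by all variants with some nullable symbols dropped,
    never emitting an ε right-hand side."""
    nullable = nullable_variables(grammar)
    new = {}
    for var, bodies in grammar.items():
        out = set()
        for body in bodies:
            positions = [i for i, s in enumerate(body) if s in nullable]
            for k in range(len(positions) + 1):
                for drop in combinations(positions, k):
                    variant = tuple(s for i, s in enumerate(body) if i not in drop)
                    if variant:                 # skip ε right-hand sides
                        out.add(variant)
        new[var] = sorted(out)
    return new

g = {"S": [("A", "B", "C")], "A": [("a", "A"), ()],
     "B": [("b", "B"), ()], "C": [()]}
print(eliminate_epsilon(g))
# S -> ABC | AB | AC | BC | A | B | C,  A -> aA | a,  B -> bB | b,
# and C has no productions left (it is now useless).
```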

Why It Works. Prove that for all variables A: 1. If w ≠ ε and A =>* w in the old grammar, then A =>* w in the new grammar. 2. If A =>* w in the new grammar, then w ≠ ε and A =>* w in the old grammar. Then, letting A be the start symbol proves that L(new) = L(old) - {ε}. Part (1) is an induction on the number of steps by which A derives w in the old grammar.

Proof of 1: Basis. If the old derivation is one step, then A -> w must be a production. Since w ≠ ε, this production also appears in the new grammar. Thus, A => w in the new grammar.

Proof of 1: Induction. Let A =>* w be an n-step derivation in the old grammar, and assume the IH for derivations of fewer than n steps. Let the first step be A => X1…Xk in the old grammar. Then w can be broken into w = w1…wk, where Xi =>* wi in the old grammar, for all i, in fewer than n steps.

Induction, Continued. By the IH, if wi ≠ ε, then Xi =>* wi in the new grammar. Also, the new grammar has a production with A on the left and just those Xi's on the right for which wi ≠ ε. Note: they cannot all be ε, because w ≠ ε. Follow a use of this production by the derivations Xi =>* wi in the new grammar to show that A derives w in the new grammar.

Proof of Converse. We also need to show part (2): if w is derived from A in the new grammar, then w ≠ ε and w is also derived in the old grammar. The proof is an induction on the number of steps in the derivation. We leave it for reading in the text.

Unit Productions. A unit production is one whose right side consists of exactly one variable. These productions can be eliminated. Key idea: If A =>* B by a series of unit productions, and B -> α is a non-unit production, then add the production A -> α. Then, drop all unit productions.

Unit Productions (2). Find all pairs (A, B) such that A =>* B by a sequence of unit productions only. Basis: Surely (A, A). Induction: If we have found (A, B), and B -> C is a unit production, then add (A, C).
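A minimal sketch of the unit-pair computation and the elimination step (same illustrative representation; unit_pairs and eliminate_unit are assumed names):

```python
def unit_pairs(grammar):
    """All pairs (A, B) with A =>* B using unit productions only."""
    variables = set(grammar)
    pairs = {(a, a) for a in variables}                       # basis
    changed = True
    while changed:
        changed = False
        for (a, b) in list(pairs):
            for body in grammar.get(b, []):
                if len(body) == 1 and body[0] in variables:   # unit production B -> C
                    if (a, body[0]) not in pairs:
                        pairs.add((a, body[0]))
                        changed = True
    return pairs

def eliminate_unit(grammar):
    """For each unit pair (A, B), give A every non-unit body of B; drop unit productions."""
    variables = set(grammar)
    new = {a: set() for a in grammar}
    for (a, b) in unit_pairs(grammar):
        for body in grammar.get(b, []):
            if not (len(body) == 1 and body[0] in variables):
                new[a].add(body)
    return {a: sorted(bodies) for a, bodies in new.items()}
```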

Cleaning Up a Grammar. Theorem: If L is a CFL, then there is a CFG for L - {ε} that has: 1. No useless symbols. 2. No ε-productions. 3. No unit productions. That is, every right side is either a single terminal or has length at least 2.

Cleaning Up (2). Proof: Start with a CFG for L. Perform the following steps in order: 1. Eliminate ε-productions (this step must come first; it can create unit productions or useless variables). 2. Eliminate unit productions. 3. Eliminate variables that derive no terminal string. 4. Eliminate variables not reached from the start symbol.

Chomsky Normal Form. A CFG is said to be in Chomsky Normal Form (CNF) if every production is of one of these two forms: 1. A -> BC (right side is two variables). 2. A -> a (right side is a single terminal). Theorem: If L is a CFL, then L - {ε} has a CFG in CNF.

Proof of CNF Theorem. Step 1: "Clean" the grammar, so every production right side is either a single terminal or of length at least 2. Step 2: For each right side that is not a single terminal, make the right side all variables. For each terminal a, create a new variable Aa and a production Aa -> a. Replace a by Aa in right sides of length at least 2.

Example: Step 2. Consider the production A -> BcDe. We need variables Ac and Ae, with productions Ac -> c and Ae -> e. Note: you create at most one variable for each terminal, and use it everywhere it is needed. Replace A -> BcDe by A -> B Ac D Ae.

CNF Proof, Continued. Step 3: Break right sides longer than 2 into a chain of productions with right sides of two variables. Example: A -> BCDE is replaced by A -> BF, F -> CG, and G -> DE. F and G must be used nowhere else.

Example of Step 3, Continued. Recall A -> BCDE is replaced by A -> BF, F -> CG, and G -> DE. In the new grammar, A => BF => BCG => BCDE. More importantly: once we choose to replace A by BF, we must continue to BCG and BCDE, because F and G have only one production each.
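A sketch of Steps 2 and 3 together, for an already "cleaned" grammar (same illustrative representation; the fresh-variable names T_a and X1, X2, … are assumptions of this sketch, not the slides' notation):

```python
def to_cnf_bodies(grammar, terminals):
    """Replace terminals inside long bodies by fresh variables (Step 2), then
    break bodies of length > 2 into chains of two-variable bodies (Step 3)."""
    new = {}
    term_var = {}                                   # terminal a -> variable T_a
    counter = 0

    def var_for(a):
        if a not in term_var:
            term_var[a] = "T_" + a
            new.setdefault("T_" + a, []).append((a,))
        return term_var[a]

    for var, bodies in grammar.items():
        for body in bodies:
            if len(body) == 1:                      # A -> a stays as is
                new.setdefault(var, []).append(body)
                continue
            # Step 2: make the right side all variables.
            syms = [var_for(s) if s in terminals else s for s in body]
            # Step 3: break into a chain of length-2 right sides.
            left = var
            while len(syms) > 2:
                counter += 1
                fresh = "X" + str(counter)
                new.setdefault(left, []).append((syms[0], fresh))
                left, syms = fresh, syms[1:]
            new.setdefault(left, []).append(tuple(syms))
    return new

# B and D are variables of some larger grammar (no productions given here).
g = {"A": [("B", "c", "D", "e")]}
print(to_cnf_bodies(g, {"c", "e"}))
# {'T_c': [('c',)], 'T_e': [('e',)], 'A': [('B', 'X1')],
#  'X1': [('T_c', 'X2')], 'X2': [('D', 'T_e')]}
```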

CNF Proof, Concluded. We must prove that Steps 2 and 3 produce new grammars whose languages are the same as that of the previous grammar. The proofs are of a familiar type and involve inductions on the lengths of derivations.

CKY algorithm for recognizing CFL's

Decision problems for CFL's. 1) Membership problem: Input: a CFG G and a string w. Output: 'yes' if w is in L(G), 'no' otherwise. (When the answer is yes, we may also want the output to include a parse tree for w.) The Cocke-Kasami-Younger (CKY) algorithm is an algorithm for the membership problem. (Its time complexity is O(n^3).)
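A minimal sketch of CKY membership testing for a grammar in CNF (same illustrative representation; the function and variable names are assumptions of this sketch):

```python
def cyk_member(grammar, start, w):
    """CYK membership test for a CNF grammar: table[i][l-1] holds the variables
    that derive the substring of w of length l starting at position i.
    Runs in O(n^3) table steps times the grammar size."""
    n = len(w)
    if n == 0:
        return False                          # a CNF grammar cannot derive ε
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, a in enumerate(w):                 # substrings of length 1
        for var, bodies in grammar.items():
            if (a,) in bodies:
                table[i][0].add(var)
    for length in range(2, n + 1):            # longer substrings
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split - 1]
                right = table[i + split][length - split - 1]
                for var, bodies in grammar.items():
                    for body in bodies:
                        if len(body) == 2 and body[0] in left and body[1] in right:
                            table[i][length - 1].add(var)
    return start in table[0][n - 1]

# CNF grammar for { a^n b^n | n >= 1 }.
g = {"S": [("A", "T"), ("A", "B")], "T": [("S", "B")],
     "A": [("a",)], "B": [("b",)]}
print(cyk_member(g, "S", "aabb"))   # True
print(cyk_member(g, "S", "abb"))    # False
```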

Decision problems for CFL's. 2) Emptiness problem: Input: a CFG G. Output: 'yes' if L(G) is empty, 'no' otherwise. When discussing Chomsky Normal Form, one of the first steps was to remove all the useless symbols from the grammar. If the start symbol survives this step, i.e., it derives some terminal string, then L(G) is not empty. Thus the emptiness problem is also decidable.
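Using the generating_variables sketch from earlier, the emptiness test reduces to a single membership check (cfg_is_empty is an illustrative name):

```python
def cfg_is_empty(grammar, start, terminals):
    """L(G) is empty iff the start symbol derives no terminal string."""
    return start not in generating_variables(grammar, terminals)
```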