CSE 3813 Introduction to Formal Languages and Automata Chapter 8 Properties of Context-free Languages

These class notes are based on material from our textbook, An Introduction to Formal Languages and Automata, 4th ed., by Peter Linz, published by Jones and Bartlett Publishers, Inc., Sudbury, MA. They are intended for classroom use only and are not a substitute for reading the textbook.

The pumping lemma for context-free languages Suppose you have a CFG G in which the variable A is used in two different rules, to derive two different strings, e.g.,
(1) S → vAz
(2) A → wAy
(3) A → x
We can use these rules, applying rule 2 recursively, to generate the following string:
S ⇒ vAz ⇒ vwAyz ⇒ vwwAyyz ⇒ vwwwAyyyz ⇒ ... ⇒ vw^n A y^n z ⇒ vw^n x y^n z.

The pumping lemma for CFLs Of course, we can apply rule 3 at any point along the way to bring the process to a halt. Thus, the following strings are all legitimate strings in the language: vwxyz, vwwxyyz, vwwwxyyyz, etc. In fact, with rules 2 and 3 in the grammar, there is no way to prevent the language from containing an infinite number of strings of the form vw^n x y^n z.
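
The derivation above can be replayed mechanically. The following is a minimal Python sketch; the function name `pump_derivation` and the sample substrings are illustrative assumptions, not part of the notes. It applies rule 1 once, rule 2 exactly n times, and rule 3 once.

```python
def pump_derivation(v: str, w: str, x: str, y: str, z: str, n: int) -> str:
    """Return the string produced by S -> vAz, then A -> wAy n times, then A -> x."""
    # The sentential form is always left + "A" + right; only the two sides are tracked.
    left, right = v, z                      # rule 1: S -> vAz
    for _ in range(n):                      # rule 2: A -> wAy
        left, right = left + w, y + right
    return left + x + right                 # rule 3: A -> x


if __name__ == "__main__":
    for n in range(4):
        print(n, pump_derivation("v", "w", "x", "y", "z", n))
```

Every value of n gives a different string of the form vw^n x y^n z, which is why the language must be infinite once rules 2 and 3 are both present.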

The pumping lemma for CFLs Remember the definition of Chomsky Normal Form grammars: A CFG is in Chomsky Normal Form if every production is of one of these two types: A → BC or A → a. Remember also that we can put any CFG into CNF (omitting the null string, if it belongs to the original language).

The pumping lemma for CFLs If a grammar is in CNF, then its derivation tree will be binary; that is, every node will have at most two children. Why? There are only 3 possibilities: (1) The node represents the first type of rule above, in which a single variable produces two variables. (2) The node represents the second type of rule above, in which a single variable produces a single terminal. (3) The node is a terminal node and so has no children.

The pumping lemma for CFLs A path in a binary tree is either empty, or consists of a node, one of its descendants, and all of the nodes in between. The length of a path is the number of nodes it contains (for this class, we will use this definition; however, most of the time length and height are measured in number of edges, not number of nodes). The height of a binary tree is the length of its longest path.

The pumping lemma for CFLs You could create a very tall binary tree by having all branches be unary. You can create the shortest possible binary tree by having all of its branches be binary, except possibly for some or all of the branches at the bottom level of the tree.

The pumping lemma for CFLs What is the smallest height possible in a binary tree of 7 nodes? How many leaf nodes does it have? height = 3 num. leaves = 4

The pumping lemma for CFLs What is the smallest height possible in a binary tree of 15 nodes? How many leaf nodes does it have? height = 4 num. leaves = 8

The pumping lemma for CFLs What is the smallest height possible in a binary tree of 31 nodes? How many leaf nodes does it have? height = 5 num. leaves = 16

The pumping lemma for CFLs What is the smallest height possible in a binary tree of 2^n - 1 nodes? How many leaf nodes does it have? height = n num. leaves = 2^(n-1)

The pumping lemma for CFLs Note the pattern here: In a completely filled binary tree with 2^n - 1 nodes, half of the nodes (rounding up) will be leaves. That is, 2^n / 2 nodes will be leaf nodes. And we can rewrite 2^n / 2 as 2^(n-1). This leads us to the following lemma:

The pumping lemma for CFLs Lemma: For any h ≥ 1, a binary tree which has more than 2^(h-1) leaf nodes must have a height greater than h. Example: If a binary tree has 17 leaf nodes, can it have a height of 5? No; a complete binary tree of height 5 has only 16 leaf nodes. A binary tree with 17 leaves must have a height greater than 5.
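
The lemma can be checked numerically for small cases. The sketch below uses the node-counting definition of height from these notes; the function names `max_leaves` and `min_height` are hypothetical helpers introduced only for illustration.

```python
def max_leaves(height: int) -> int:
    """Largest number of leaves in a binary tree whose longest path has `height` nodes."""
    if height < 1:
        raise ValueError("height counts nodes, so it must be at least 1")
    if height == 1:
        return 1                      # a single node is both root and leaf
    # The root has at most two subtrees, each of height at most height - 1.
    return 2 * max_leaves(height - 1)


def min_height(leaves: int) -> int:
    """Smallest height a binary tree needs in order to have `leaves` leaf nodes."""
    height = 1
    while max_leaves(height) < leaves:
        height += 1
    return height


if __name__ == "__main__":
    print(max_leaves(5))    # most leaves a tree of height 5 can have
    print(min_height(17))   # height forced by 17 leaves
```

Since `max_leaves(h)` equals 2^(h-1), any tree with more leaves than that cannot fit in height h, which is exactly the statement of the lemma.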

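The pumping lemma for CFLs Here is the point of all this: In a CNF derivation tree, every node on a root-to-leaf path except the final leaf is a variable. If the derivation tree for a given string in the language has height h, its longest path therefore contains h - 1 variable nodes. If there are fewer than h - 1 variables in the grammar, then by the pigeonhole principle at least one variable must recur on the same path in the derivation of this string.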

The pumping lemma for CFLs For a variable to recur farther down in the same path, it must be either self-recursive (e.g., A → aA) or path-recursive (e.g., A → aB and B → bA). In either case, this variable may be pumped an unrestricted number of times.

Theorem 8.1 Let L be a CFL. Then there is an integer m so that for any w ∈ L satisfying |w| ≥ m, there are strings u, v, x, y, and z satisfying
w = uvxyz,
|vy| > 0,
|vxy| ≤ m,
and for any i ≥ 0, uv^i x y^i z ∈ L.

The pumping lemma for CFLs We can use the pumping lemma for context-free languages to prove that a given language is not context-free. We do this by assuming that the language is context-free; this means that there must be an m satisfying the conditions given above. If we find that this causes a contradiction, then we know the language can’t be a CFL.

Proof Given the language L = {a^i b^i c^i | i ≥ 1}, assume that L is context-free. Let w = a^m b^m c^m, so that |w| = 3m ≥ m. According to Theorem 8.1, |vy| > 0. Thus, v and y together must contain at least one symbol. According to Theorem 8.1, |vxy| ≤ m. Thus, the string vxy can contain at most two distinct types of symbols.

Proof The string vxy can’t contain all three symbols, a, b, and c. (Why? Because |vxy| ≤ m, and the a’s and c’s in w are separated by m b’s.) The string uv^2 x y^2 z contains additional occurrences of the symbols in v and y, but no additional occurrences of the symbol that is missing from vxy. Therefore, uv^2 x y^2 z cannot contain equal numbers of all three symbols. But the pumping lemma says that uv^2 x y^2 z must be a legitimate string in L. This is a contradiction. Consequently, L cannot be a context-free language.
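
The case analysis in this proof can be confirmed by brute force for a small pumping length. The sketch below is only a finite sanity check under stated assumptions, not a replacement for the proof: it fixes one value of m, enumerates every decomposition w = uvxyz with |vy| > 0 and |vxy| ≤ m, and tests the exponents 0 and 2. The names `decompositions`, `survives_pumping`, `in_abc`, and `in_ab` are hypothetical helpers.

```python
def decompositions(w: str, m: int):
    """Yield every (u, v, x, y, z) with w = uvxyz, |vy| > 0 and |vxy| <= m."""
    n = len(w)
    for i in range(n + 1):                        # v starts at i
        for end in range(i, min(i + m, n) + 1):   # vxy ends at `end`
            for j in range(i, end + 1):           # v ends at j
                for k in range(j, end + 1):       # y starts at k
                    if (j - i) + (end - k) > 0:
                        yield w[:i], w[i:j], w[j:k], w[k:end], w[end:]


def survives_pumping(w: str, m: int, in_language, exponents=(0, 2)) -> bool:
    """True if some decomposition keeps u v^i x y^i z in the language for all tested i."""
    return any(
        all(in_language(u + v * i + x + y * i + z) for i in exponents)
        for u, v, x, y, z in decompositions(w, m)
    )


def in_abc(s: str) -> bool:
    """Membership in {a^i b^i c^i | i >= 1}."""
    n = len(s) // 3
    return n >= 1 and s == "a" * n + "b" * n + "c" * n


def in_ab(s: str) -> bool:
    """Membership in the context-free language {a^i b^i | i >= 0}, for comparison."""
    n = len(s) // 2
    return s == "a" * n + "b" * n


if __name__ == "__main__":
    m = 3
    print(survives_pumping("a" * m + "b" * m + "c" * m, m, in_abc))  # no decomposition works
    print(survives_pumping("a" * m + "b" * m, m, in_ab))             # v = a, y = b works
```

For a^m b^m c^m no decomposition survives, matching the contradiction derived above, while the comparison language a^i b^i admits the decomposition with v = a and y = b straddling the midpoint.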

Example Given the language L = {a^i b^i c^i | i ≥ 1}, how would you try to process this language using a push-down automaton? We can ensure that we have an equal number of a’s and b’s by pushing the a’s onto the stack one at a time, then popping them off and matching them up with the b’s one by one.

Example However, once we have done that, we don’t have anything left to match the c’s with, so we can’t guarantee that we have the same number of c’s as a’s and b’s. We can’t solve this problem by pushing a’s or b’s back onto the stack. This is due to the limitations of the type of memory we have in a PDA.
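
The stack discipline described in this example can be sketched directly. The following Python function is an illustrative assumption (it imitates a PDA with an explicit list used as a stack, and is not taken from the textbook); it recognizes {a^i b^i | i ≥ 1} and makes the limitation visible: when the last b has been matched, the stack is empty, so no record of the count remains for checking c’s.

```python
def accepts_anbn(s: str) -> bool:
    """Stack-based recognizer for {a^i b^i | i >= 1}."""
    stack = []
    pos = 0
    # Phase 1: push one marker per a.
    while pos < len(s) and s[pos] == "a":
        stack.append("a")
        pos += 1
    # Phase 2: pop one marker per b; stop if the stack runs out.
    while pos < len(s) and s[pos] == "b" and stack:
        stack.pop()
        pos += 1
    # Accept only if all input was consumed and every a was matched.
    # At this point the stack is empty: the count of a's has been used up.
    return len(s) > 0 and pos == len(s) and not stack


if __name__ == "__main__":
    for word in ["ab", "aabb", "aab", "abb", "aabbcc"]:
        print(word, accepts_anbn(word))
```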

Pumping lemma (again) The pumping lemma for regular languages states: every sufficiently long string in a regular language contains a short substring that can be pumped. The pumping lemma for context-free languages states: every sufficiently long string in a context-free language contains two short (and close-together) substrings that can be pumped (the same number of times).

Formal statement (again) Let L be a context-free language. Then there exists some positive integer m such that any string w ∈ L of length |w| ≥ m can be decomposed into substrings u, v, x, y, z such that
w = uvxyz,
|vxy| ≤ m,
|v| > 0 or |y| > 0, and
uv^k x y^k z ∈ L for all k ≥ 0.

Informal statement Every context-free language has a “pumping length” such that every string in the language that is at least this long can be pumped to yield another string in the language. The string can be divided into five parts such that the second and fourth parts can be repeated together, or “pumped,” any number of times, and the resulting string remains in the language.

What is m? In the pumping lemma for regular languages, the “pumping length” m reflects the number of states of the finite automaton. In the pumping lemma for context-free languages, what does m reflect? Roughly, it is the length of the longest string that can be generated by a parse tree in which the same nonterminal never occurs twice on the same path through the tree.

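In a sufficiently large parse tree, some nonterminal must repeat along some path from the root. This follows from the pigeonhole principle. [Figure: parse tree with root S in which the nonterminal A occurs twice on one path; the yield is divided into u, v, x, y, z.]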

Proof Idea The repetition of some nonterminal along a path through the parse tree allows us to replace the subtree under the last occurrence of the nonterminal with the subtree under an earlier occurrence of the nonterminal and still get a valid parse tree. This corresponds to pumping v and y. Note that the parse tree of the previous slide corresponds to the following derivation: S ⇒* uAz ⇒* uvAyz ⇒* uvxyz.

Important to remember You can use a pumping lemma to prove that a language is not context-free (or regular). You cannot use a pumping lemma to prove that a language is context-free (or regular).

Exercise The language L = {ww | w ∈ {a, b}*} is not context-free. Pick a string in L. Try a^m b^m a^m b^m. Then note that you must consider three cases. It must be the case that vxy is a substring of the prefix a^m b^m, or the “middle” b^m a^m, or the suffix a^m b^m. Intuitively, why can’t a PDA accept this language, although it can accept the language {ww^R | w ∈ {a, b}*}?

Pumping Lemma for Linear Languages Let L be an infinite linear language. Then there exists some positive integer m such that any w ∈ L with |w| ≥ m can be decomposed as w = uvxyz with
|uvyz| ≤ m,
|vy| ≥ 1,
such that uv^i x y^i z ∈ L for all i = 0, 1, 2, ...

Pumping Lemma for Linear Languages Note that the conclusion of this theorem (Theorem 8.2) is different from that of Theorem 8.1: in Theorem 8.1 we have |vxy| ≤ m, and in Theorem 8.2 we have |uvyz| ≤ m. This implies that the strings v and y to be pumped must now be within m symbols of the left and right ends of w, respectively. The middle string x can be of arbitrary length. Theorem 8.2 helps establish the fact that the family of linear languages is a proper subset of the family of context-free languages.

Closure properties for context-free languages The family of context-free languages is closed under the operations of union, concatenation, and Kleene closure, but not under the operations of intersection and complementation.

Definition A context-free grammar (CFG) is a 4-tuple G = (V, T, S, P) where V and T are disjoint sets, S ∈ V, and P is a finite set of rules of the form A → x, where A ∈ V and x ∈ (V ∪ T)*.
V = non-terminals or variables
T = terminals
S = start symbol
P = productions or grammar rules

Closure properties of CFLs CFLs are closed under union, concatenation, and Kleene closure. Proof by construction: Let G_1 = (V_1, T_1, S_1, P_1) and G_2 = (V_2, T_2, S_2, P_2), with L_1 = L(G_1) and L_2 = L(G_2).

Union We create grammar G_u = (V_u, T_1 ∪ T_2, S_u, P_u) generating L_1 ∪ L_2.
1. Rename the elements of V_2 if necessary so that V_1 ∩ V_2 = ∅.
2. Create a new start symbol S_u, not already in V_1 or V_2.
3. Set V_u = V_1 ∪ V_2 ∪ {S_u}.
4. Set P_u = P_1 ∪ P_2 ∪ {S_u → S_1 | S_2}.
Construction completed.

Concatenation We create grammar G_c = (V_c, T_1 ∪ T_2, S_c, P_c) generating L_1 L_2.
1. Rename the elements of V_2 if necessary so that V_1 ∩ V_2 = ∅.
2. Create a new start symbol S_c, not already in V_1 or V_2.
3. Set V_c = V_1 ∪ V_2 ∪ {S_c}.
4. Set P_c = P_1 ∪ P_2 ∪ {S_c → S_1 S_2}.
Construction completed.

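Closure under Kleene star Let G_1 be any context-free grammar with the starting symbol S. If S does not appear on the right-hand side of any production of G_1, adding the rules S → λ and S → SS creates a new context-free grammar G_2 such that L(G_2) is the result of applying the Kleene star operator to L(G_1). If S does appear on a right-hand side, the new rules can also be applied inside existing derivations and may generate strings outside L(G_1)*; for example, with S → aSb | ab they yield aababb. Introducing a fresh start symbol avoids this problem.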

Kleene Closure We create grammar G* = (V*, T_1, S, P*) generating L_1*.
1. Create a new start symbol S, not already in V_1.
2. Set V* = V_1 ∪ {S}.
3. Set P* = P_1 ∪ {S → S_1 S | λ}.
Construction completed. (See text for justification.)
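
The three closure constructions can be sketched in Python. Everything below is an illustrative assumption rather than the textbook's notation: a grammar is a pair (start, rules), where `rules` maps each variable to a list of right-hand sides (tuples of symbols), any symbol that is not a key of `rules` is a terminal, and terminals are assumed to be single characters that never clash with the generated names `S_u`, `S_c`, `S_star`. Renaming is done by appending `_1` or `_2` to every variable, which guarantees disjoint variable sets. The helper `strings_up_to` is a bounded search used only to inspect the results; it may not terminate on grammars in which a variable that derives λ can be duplicated without bound.

```python
from collections import deque


def rename(rules, tag):
    """Append `tag` to every variable so that two grammars cannot share variables."""
    ren = lambda s: s + tag if s in rules else s
    return {ren(head): [tuple(ren(s) for s in body) for body in bodies]
            for head, bodies in rules.items()}


def union(g1, g2):
    (s1, r1), (s2, r2) = g1, g2
    rules = {**rename(r1, "_1"), **rename(r2, "_2")}
    rules["S_u"] = [(s1 + "_1",), (s2 + "_2",)]        # S_u -> S_1 | S_2
    return "S_u", rules


def concat(g1, g2):
    (s1, r1), (s2, r2) = g1, g2
    rules = {**rename(r1, "_1"), **rename(r2, "_2")}
    rules["S_c"] = [(s1 + "_1", s2 + "_2")]            # S_c -> S_1 S_2
    return "S_c", rules


def star(g1):
    s1, r1 = g1
    rules = rename(r1, "_1")
    rules["S_star"] = [(s1 + "_1", "S_star"), ()]      # S -> S_1 S | lambda
    return "S_star", rules


def min_lengths(rules):
    """Length of the shortest terminal string each variable can derive (fixpoint)."""
    best = {v: float("inf") for v in rules}
    changed = True
    while changed:
        changed = False
        for head, bodies in rules.items():
            for body in bodies:
                cost = sum(best.get(s, 1) for s in body)   # a terminal costs 1
                if cost < best[head]:
                    best[head], changed = cost, True
    return best


def strings_up_to(grammar, max_len):
    """All terminal strings of length <= max_len found by leftmost derivations."""
    start, rules = grammar
    best = min_lengths(rules)
    seen, found, queue = {(start,)}, set(), deque([(start,)])
    while queue:
        form = queue.popleft()
        idx = next((i for i, s in enumerate(form) if s in rules), None)
        if idx is None:                                    # no variables left
            found.add("".join(form))
            continue
        for body in rules[form[idx]]:
            new = form[:idx] + body + form[idx + 1:]
            # Prune forms that can no longer yield a string short enough.
            if sum(best.get(s, 1) for s in new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return found


if __name__ == "__main__":
    anbn = ("S", {"S": [("a", "S", "b"), ("a", "b")]})     # {a^n b^n | n >= 1}
    cs = ("S", {"S": [("c", "S"), ("c",)]})                # {c^n | n >= 1}
    print(sorted(strings_up_to(union(anbn, cs), 4)))
    print(sorted(strings_up_to(concat(anbn, cs), 4)))
    print(sorted(strings_up_to(star(anbn), 4)))
```

Note that `star` follows the fresh-start-symbol construction: the original start symbol of the input grammar never receives the λ rule, so derivations inside G_1 are unaffected.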

Not closed under intersection The context-free languages are not closed under Intersection. However, the intersection of a context-free language with a regular language is always a context-free language. The context-free languages are not closed under Complementation

Corollary: Are Regular Languages context free? Yes. Why? We can express any regular language in the form of a CFG. The regular languages are a proper subset of the context-free languages.

Are Regular Languages context free? Proof: According to your textbook, the set of regular languages is the smallest set that contains the languages ∅, {λ}, and {a} (for every a ∈ Σ) and is closed under the operations of union, concatenation, and Kleene*. We just demonstrated that the operations of union, concatenation, and Kleene* on CFGs produce CFGs, so all we need to do is show that the languages ∅, {λ}, and {a} have CFGs.

Are Regular Languages context free? The empty language can be written S → S. The language consisting of the null string can be written S → λ. The language consisting of a single character can be written S → a. QED

Decision properties of context-free languages For a CFG G, we can decide membership (is a given string w in L(G)?), emptiness (is L(G) empty?), and infiniteness (is L(G) infinite?). But there is no algorithm for deciding whether two CFGs generate the same language!
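
Two of these decision procedures can be sketched for grammars in Chomsky Normal Form. The notes do not say which algorithms the course uses; the code below assumes the standard CYK table-filling method for membership and a fixpoint over "generating" variables for emptiness. The grammar encoding (a set of (A, a) pairs for rules A → a and a set of (A, B, C) triples for rules A → BC) and the function names are illustrative assumptions.

```python
def cyk(word, start, unit_rules, pair_rules):
    """Membership test for a CNF grammar using the CYK algorithm."""
    n = len(word)
    if n == 0:
        return False      # a CNF grammar of the form used here never derives lambda
    # table[i][length] = set of variables that derive word[i:i + length]
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][1] = {A for A, a in unit_rules if a == ch}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                for A, B, C in pair_rules:
                    if B in table[i][split] and C in table[i + split][length - split]:
                        table[i][length].add(A)
    return start in table[0][n]


def is_nonempty(start, unit_rules, pair_rules):
    """Emptiness test: the language is nonempty iff the start variable is generating."""
    generating = {A for A, _ in unit_rules}
    changed = True
    while changed:
        changed = False
        for A, B, C in pair_rules:
            if A not in generating and B in generating and C in generating:
                generating.add(A)
                changed = True
    return start in generating


if __name__ == "__main__":
    # CNF grammar for {a^n b^n | n >= 1}: S -> AB | AT, T -> SB, A -> a, B -> b
    units = {("A", "a"), ("B", "b")}
    pairs = {("S", "A", "B"), ("S", "A", "T"), ("T", "S", "B")}
    for w in ["ab", "aabb", "aab", "abab"]:
        print(w, cyk(w, "S", units, pairs))
    print(is_nonempty("S", units, pairs))
```

CYK runs in time proportional to the cube of the input length times the number of rules, which is enough to show that membership is decidable.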