Definition: Let G = (V, T, P, S) be a CFL

Slides:



Advertisements
Similar presentations
The Pumping Lemma for CFL’s
Advertisements

Theory of Computation CS3102 – Spring 2014 A tale of computers, math, problem solving, life, love and tragic death Nathan Brunelle Department of Computer.
Closure Properties of CFL's
CS 3240: Languages and Computation Properties of Context-Free Languages.
Lecture 15UofH - COSC Dr. Verma 1 COSC 3340: Introduction to Theory of Computation University of Houston Dr. Verma Lecture 15.
Transparency No. P2C5-1 Formal Language and Automata Theory Part II Chapter 5 The Pumping Lemma and Closure properties for Context-free Languages.
CS5371 Theory of Computation Lecture 9: Automata Theory VII (Pumping Lemma, Non-CFL)
Normal forms for Context-Free Grammars
Foundations of (Theoretical) Computer Science Chapter 2 Lecture Notes (Section 2.3: Non-Context-Free Languages) David Martin With.
Transparency No. P2C5-1 Formal Language and Automata Theory Part II Chapter 5 The Pumping Lemma and Closure properties for Context-free Languages.
1 Background Information for the Pumping Lemma for Context-Free Languages Definition: Let G = (V, T, P, S) be a CFL. If every production in P is of the.
Prof. Busch - LSU1 Pumping Lemma for Context-free Languages.
Today Chapter 2: (Pushdown automata) Non-CF languages CFL pumping lemma Closure properties of CFL.
FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY
INHERENT LIMITATIONS OF COMPUTER PROGRAMS CSci 4011.
INHERENT LIMITATIONS OF COMPUTER PROGAMS CSci 4011.
Chapter 12: Context-Free Languages and Pushdown Automata
CSE 3813 Introduction to Formal Languages and Automata Chapter 8 Properties of Context-free Languages These class notes are based on material from our.
The Pumping Lemma for Context Free Grammars. Chomsky Normal Form Chomsky Normal Form (CNF) is a simple and useful form of a CFG Every rule of a CNF grammar.
Context-Free and Noncontext-Free Languages Chapter 13 1.
1 L= { w c w R : w  {a, b}* } is accepted by the PDA below. Use a construction like the one for intersection for regular languages to design a PDA that.
Section 12.4 Context-Free Language Topics
Pushdown Automata Chapters Generators vs. Recognizers For Regular Languages: –regular expressions are generators –FAs are recognizers For Context-free.
Test # 2 Friday, November 2 nd Covers all of Chapter 2 Test # 2 will be 20% of final score (same as Test #1) Final Exam will be 40% of the final score.
Closure Properties Lemma: Let A 1 and A 2 be two CF languages, then the union A 1  A 2 is context free as well. Proof: Assume that the two grammars are.
Non-CF Languages The language L = { a n b n c n | n  0 } does not appear to be context-free. Informal: A PDA can compare #a’s with #b’s. But by the time.
Context Free Pumping Lemma. CFL Pumping Lemma A CFL pump consists of two non-overlapping substrings that can be pumped simultaneously while staying in.
Pumping Lemma for CFLs. Theorem 7.17: Let G be a CFG in CNF and w a string in L(G). Suppose we have a parse tree for w. If the length of the longest path.
1 CD5560 FABER Formal Languages, Automata and Models of Computation Lecture 9 Mälardalen University 2006.
Costas Busch - LSU1 Pumping Lemma for Context-free Languages.
Chapter 8 Properties of Context-free Languages These class notes are based on material from our textbook, An Introduction to Formal Languages and Automata,
Transparency No. P2C5-1 Formal Language and Automata Theory Part II Chapter 5 The Pumping Lemma and Closure properties for Context-free Languages.
Donghyun (David) Kim Department of Mathematics and Physics North Carolina Central University 1 Chapter 2 Context-Free Languages Some slides are in courtesy.
 2004 SDU Lecture8 NON-Context-free languages.  2004 SDU 2 Are all languages context free? Ans: No. # of PDAs on  < # of languages on  Pumping lemma:
Bottom-up parsing Pumping Theorem for CFLs MA/CSSE 474 Theory of Computation.
Context-Free and Noncontext-Free Languages Chapter 13.
Closed book, closed notes
Lecture 15 Pumping Lemma.
7. Properties of Context-Free Languages
Lecture 22 Pumping Lemma for Context Free Languages
PDAs Accept Context-Free Languages
Context-Free Grammars Pushdown Store Automata Properties of the CFLs
Lecture 14 Grammars – Parse Trees– Normal Forms
Jaya Krishna, M.Tech, Assistant Professor
More on Context Free Grammars
Context-Free Languages
Context Free Pumping Lemma Some languages are not context free!
Properties of Regular Languages
Chapter Fourteen: The Context-Free Frontier
7. Properties of Context-Free Languages
Finite Automata and Formal Languages
COSC 3340: Introduction to Theory of Computation
Properties of Regular Languages
The Pumping Lemma for CFL’s
Pumping Lemma for Context-free Languages
Properties of Context-Free Languages
Department of Computer Science & Engineering
The Pumping Lemma for CFL’s
Bottom-up parsing Pumping Theorem for CFLs
Midterm #2 — Review problems
COSC 3340: Introduction to Theory of Computation
CS21 Decidability and Tractability
Chapter 2 Context-Free Language - 02
The Pumping Lemma for CFL’s
Applications of Regular Closure
Automata, Grammars and Languages
Pumping Theorem for CFLs
The Pumping Lemma for CFL’s
Presentation transcript:

Background Information for the Pumping Lemma for Context-Free Languages Definition: Let G = (V, T, P, S) be a CFL. If every production in P is of the form A –> BC or A –> a where A, B and C are all in V and a is in T, then G is in Chomsky Normal Form (CNF). Example: (not quite!) S –> AB | BA | aSb A –> a B –> b Theorem: Let L be a CFL. Then L – {ε} is a CFL. Theorem: Let L be a CFL not containing {ε}. Then there exists a CNF grammar G such that L = L(G).

Proof: By induction on h(T) (exercise: T is a binary tree). CNF: A –> BC A –> a Definition: Let T be a tree. Then the height of T, denoted h(T), is defined as follows: If T consists of a single vertex then h(T) = 0 If T consists of a root r and subtrees T1, T2, … Tk, then h(T) = maxi{h(Ti)} + 1 Lemma: Let G be a CFG in CNF. In addition, let w be a string of terminals where A=>*w and w has a derivation tree T. If T has height h(T)1, then |w|  2h(T)-1. Proof: By induction on h(T) (exercise: T is a binary tree). Corollary: Let G be a CFG in CNF, and let w be a string in L(G). If |w|  2k, where k  0, then any derivation tree for w using G has height at least k+1. Proof: Follows from the lemma. What is rule #10 used for – emptying the stack. What is rule #9 used for – emptying the stack. Why do rules #3 and #6 have options – you don’t know where the middle of the string is. Why don’t rules #4 and #5 have similar options – suppose rule #4 had option (q2, λ). Then 0100 would be accepted.

Lemma: Let G be a CFG in CNF Lemma: Let G be a CFG in CNF. In addition, let w be a string of terminals where A=>*w and w has a derivation tree T. If T has height h(T)1, then |w|  2h(T)-1. Internal nodes are non-terminals  h(T)=5 a a a |w| = 6 in this case, maximum possible = 24 last branch is always single A->a a |w|  2h(T)-1 What is rule #10 used for – emptying the stack. What is rule #9 used for – emptying the stack. Why do rules #3 and #6 have options – you don’t know where the middle of the string is. Why don’t rules #4 and #5 have similar options – suppose rule #4 had option (q2, λ). Then 0100 would be accepted. Leaves are terminals constituting the string a a

Pumping Lemma for Context-Free Languages Let G = (V, T, P, S) be a CFG in CNF, and let n = 2|V|. If z is a string in L(G) and |z|  n, then there exist substrings u, v, w, x and y in T* such that z=uvwxy and: |vx|  1 (i.e., |v| + |x|  1, or, non-null) |vwx|  n (the loop in generating this substring) uviwxiy are in L(G), for all i  0 Note: u or y could be of any length, may be ε vwx is in the middle, of size >0 Note the difference with Regular Language pumping lemma

By definition such a tree contains a path of length at least k+1. Proof: Since |z|  n = 2k, where k = |V|, it follows from the corollary that any derivation tree for z has height at least k+1. By definition such a tree contains a path of length at least k+1. Consider the longest such path in the tree T: S t yield of this tree T is string z Such a path has: Length of path t is |t|  k+1 (i.e., number of edges in the path t is  k+1) At least k+2 nodes on the path t 1 terminal, at the end of the path t At least k+1 non-terminals What is rule #10 used for – emptying the stack. What is rule #9 used for – emptying the stack. Why do rules #3 and #6 have options – you don’t know where the middle of the string is. Why don’t rules #4 and #5 have similar options – suppose rule #4 had option (q2, λ). Then 0100 would be accepted.

Second occurrence of non-terminal A A First occurrence A t Since there are only k non-terminals in the grammar, and since k+1 or more non-terminals appear on this long path, it follows that some non-terminal (and perhaps many) appears at least twice on this path. Consider the first non-terminal (from bottom) that is repeated, when traversing the path from the leaf to the root. S Second occurrence of non-terminal A A First occurrence A t This path, and the non-terminal A will be used to break up the string z. What is rule #10 used for – emptying the stack. What is rule #9 used for – emptying the stack. Why do rules #3 and #6 have options – you don’t know where the middle of the string is. Why don’t rules #4 and #5 have similar options – suppose rule #4 had option (q2, λ). Then 0100 would be accepted.

In this case u = cd and y =f f a Where are v, w, and x? Generic Description: S A u v w x y Example: E F C D A F c d G G f F A g In this case u = cd and y =f f a Where are v, w, and x? What is rule #10 used for – emptying the stack. What is rule #9 used for – emptying the stack. Why do rules #3 and #6 have options – you don’t know where the middle of the string is. Why don’t rules #4 and #5 have similar options – suppose rule #4 had option (q2, λ). Then 0100 would be accepted.

Cut out the subtree rooted at A: S u y S =>* uAy (1) Example: E F C D A F c d f S =>* cdAf What is rule #10 used for – emptying the stack. What is rule #9 used for – emptying the stack. Why do rules #3 and #6 have options – you don’t know where the middle of the string is. Why don’t rules #4 and #5 have similar options – suppose rule #4 had option (q2, λ). Then 0100 would be accepted.

Consider the subtree rooted at A: A A G G A F A g v x f a Cut out the subtree rooted at the first occurrence of A: G G F A g v x f A =>* vAx (2) A =>* fAg What is rule #10 used for – emptying the stack. What is rule #9 used for – emptying the stack. Why do rules #3 and #6 have options – you don’t know where the middle of the string is. Why don’t rules #4 and #5 have similar options – suppose rule #4 had option (q2, λ). Then 0100 would be accepted.

Consider the smallest subtree rooted at A: A A a w A =>* w (3) A =>* a Collectively (1), (2) and (3) give us: S =>* uAy (1) =>* uvAxy (2) =>* uvwxy (3) =>* z since z = uvwxy Lowest appearance in parse tree  What is rule #10 used for – emptying the stack. What is rule #9 used for – emptying the stack. Why do rules #3 and #6 have options – you don’t know where the middle of the string is. Why don’t rules #4 and #5 have similar options – suppose rule #4 had option (q2, λ). Then 0100 would be accepted.

In addition, (2) also tells us: S =>* uAy (1) =>* uvAxy (2) =>* uv (vAx) xy // by using the rules that make A =>* vAx =>* uv2Ax2y (2) =>* uv2wx2y (3) More generally: S =>* uviwxiy for all i ≥1, And also: =>* uwy (3) // by A =>* w here, i=0 Hence: S =>* uviwxiy for all i ≥ 0 What is rule #10 used for – emptying the stack. What is rule #9 used for – emptying the stack. Why do rules #3 and #6 have options – you don’t know where the middle of the string is. Why don’t rules #4 and #5 have similar options – suppose rule #4 had option (q2, λ). Then 0100 would be accepted.

Consider the statement of the Pumping Lemma: What is n? n = 2k, where k is the number of non-terminals in the grammar. Why is |v| + |x|  1? A v w x Since the height of this subtree is  2, the first production is A->V1V2. Since no non-terminal derives the empty string (in CNF), either V1 or V2 must derive a non-empty v or x. More specifically, if w is generated by V1, then x contains at least one symbol, and if w is generated by V2, then v contains at least one symbol. At least, A->AV, or A->VA, and V->a What is rule #10 used for – emptying the stack. What is rule #9 used for – emptying the stack. Why do rules #3 and #6 have options – you don’t know where the middle of the string is. Why don’t rules #4 and #5 have similar options – suppose rule #4 had option (q2, λ). Then 0100 would be accepted.

Why is |vwx|  n? Observations: A v w x Remember, n = 2k, k #non-terminals Observations: The repeated variable was the first repeated variable on the path from the bottom, and therefore (by the pigeon-hole principle) the path from the leaf to the second occurrence of the non-terminal has length at most k+1. Since the path was the largest in the entire tree, this path is the longest in the subtree rooted at the second occurrence of the non-terminal. Therefore the subtree has height (k+1). From the lemma, the yield of the subtree has length  2k=n.  A v w x What is rule #10 used for – emptying the stack. What is rule #9 used for – emptying the stack. Why do rules #3 and #6 have options – you don’t know where the middle of the string is. Why don’t rules #4 and #5 have similar options – suppose rule #4 had option (q2, λ). Then 0100 would be accepted.

Use of CFL Pumping Lemma

Closure Properties for Context-Free Languages Theorem: The CFLs are closed with respect to the union, concatenation and Kleene star operations. Proof: (details left as an exercise) Let L1 and L2 be CFLs. By definition there exist CFGs G1 and G2 such that L1 = L(G1) and L2 = L(G2). For union, show how to construct a grammar G3 such that L(G3) = L(G1) U L(G2). For concatenation, show how to construct a grammar G3 such that L(G3) = L(G1)L(G2). For Kleene star, show how to construct a grammar G3 such that L(G3) = L(G1)*. 

Theorem: The CFLs are not closed with respect to intersection. Proof: (counter example) Let L1 = {aibicj | i,j  0} and L2 = {aibjcj | i,j  0} Note that both of the above languages are CFLs. If the CFLs were closed with respect to intersection then would have to be a CFL. But this is equal to: {aibici | i  0} which is not a CFL.  What is rule #10 used for – emptying the stack. What is rule #9 used for – emptying the stack. Why do rules #3 and #6 have options – you don’t know where the middle of the string is. Why don’t rules #4 and #5 have similar options – suppose rule #4 had option (q2, λ). Then 0100 would be accepted.

Theorem: The CFLs are NOT closed with respect to complementation. Lemma: Let L1 and L2 be subsets of Σ*. Then . Proof: (by contradiction) Suppose that the CFLs were closed with respect to complementation, and let L1 and L2 be CFLs. Then: would be a CFL But by the lemma: a contradiction.

Proof: (exercise – sort of)  Theorem: Let L be a CFL and let R be a regular language. Then is a CFL. Proof: (exercise – sort of)  Question: Is regular? Answer: Not always. Let L = {aibi | i >= 0} and R = {aibj | i,j >= 0}, then which is not regular. What is rule #10 used for – emptying the stack. What is rule #9 used for – emptying the stack. Why do rules #3 and #6 have options – you don’t know where the middle of the string is. Why don’t rules #4 and #5 have similar options – suppose rule #4 had option (q2, λ). Then 0100 would be accepted.