Pumping Theorem for CFLs

MA/CSSE 474 Theory of Computation. Bottom-up parsing. Pumping Theorem for CFLs. Exam next Friday!

Recap: Going One Way. Lemma: Each context-free language is accepted by some PDA. Proof (by construction): The idea: let the stack do the work. Two approaches: top-down and bottom-up.

Top-down vs. Bottom-up
Approach                        Top-down        Bottom-up
Read the input string           left-to-right   left-to-right
Derivation                      leftmost        rightmost
Order of derivation discovery   forward         backward

Bottom-Up PDA. The outline of M is: M = ({p, q}, Σ, V, Δ, p, {q}), where Δ contains: ● The shift transitions: ((p, c, ε), (p, c)), for each c ∈ Σ. ● The reduce transitions: ((p, ε, (s1s2…sn)^R), (p, X)), for each rule X → s1s2…sn in G. Each one undoes an application of this rule. ● The finish-up transition: ((p, ε, S), (q, ε)). A top-down parser discovers a leftmost derivation of the input string (if any). A bottom-up parser discovers a rightmost derivation (in reverse order).
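The construction above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `bottom_up_pda`, the tuple encoding of transitions, and the use of `None` for an ε input are assumptions made here, not notation from the slides. The grammar is the expression grammar used in the bottom-up example in these slides.

```python
def bottom_up_pda(terminals, rules, start):
    """Build the transition relation of the two-state bottom-up PDA.

    A transition is ((state, input, pop), (state, push)). `input` is None
    for an epsilon move; `pop` and `push` are tuples of stack symbols with
    the top of the stack first, and () stands for epsilon.
    """
    delta = []
    # Shift: read terminal c and push it.
    for c in terminals:
        delta.append((("p", c, ()), ("p", (c,))))
    # Reduce: pop the reversed right-hand side, push the left-hand side.
    for lhs, rhs in rules:
        delta.append((("p", None, tuple(reversed(rhs))), ("p", (lhs,))))
    # Finish up: pop the start symbol and move to the accepting state q.
    delta.append((("p", None, (start,)), ("q", ())))
    return delta


# Hypothetical encoding of the expression grammar from the example slide.
RULES = [
    ("E", ("E", "+", "T")),
    ("E", ("T",)),
    ("T", ("T", "*", "F")),
    ("T", ("F",)),
    ("F", ("(", "E", ")")),
    ("F", ("id",)),
]
DELTA = bottom_up_pda(["id", "(", ")", "+", "*"], RULES, "E")

for transition in DELTA:
    print(transition)
```

Each grammar rule contributes exactly one reduce transition and each terminal one shift transition, so the machine has |Σ| + |R| + 1 transitions in total.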

Bottom-Up PDA Example: id + id * id. The idea: let the stack keep track of what has been found. Discover a rightmost derivation in reverse order. Start with the string of terminals and attempt to "pull it back" (reduce) to S.
Grammar rules:
(1) E → E + T
(2) E → T
(3) T → T * F
(4) T → F
(5) F → (E)
(6) F → id
Reduce transitions:
(1) (p, ε, T + E), (p, E)
(2) (p, ε, T), (p, E)
(3) (p, ε, F * T), (p, T)
(4) (p, ε, F), (p, T)
(5) (p, ε, )E( ), (p, F)
(6) (p, ε, id), (p, F)
Shift transitions:
(7) (p, id, ε), (p, id)
(8) (p, (, ε), (p, ()
(9) (p, ), ε), (p, ))
(10) (p, +, ε), (p, +)
(11) (p, *, ε), (p, *)
When the right side of a production is on the top of the stack, we can replace it by the left side of that production… or not! That's where the nondeterminism comes in: choice between shift and reduce; choice between two reductions. A bottom-up parser is sometimes called a shift-reduce parser. Show how it works on id + id * id.

Hidden during class, revealed later: Solution to bottom-up example. A bottom-up parser is sometimes called a shift-reduce parser. Show how it works on id + id * id.
State   Stack     Remaining input   Transition to use
p       ε         id + id * id      7
p       id        + id * id         6
p       F         + id * id         4
p       T         + id * id         2
p       E         + id * id         10
p       +E        id * id           7
p       id+E      * id              6
p       F+E       * id              4
p       T+E       * id              11
p       *T+E      id                7
p       id*T+E    ε                 6
p       F*T+E     ε                 3
p       T+E       ε                 1
p       E         ε                 0 (finish-up)
q       ε         ε
Note that the top of the stack is on the left. This is what I should have done in the class for sections 1 and 2 (and I did do it for section 3).
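The trace above can be checked mechanically. The sketch below replays a given sequence of transition numbers rather than searching for one, so it sidesteps the nondeterminism. The names `TRANSITIONS`, `replay`, `accepts`, and `SLIDE_TRACE` are hypothetical, transition 0 is assumed to be the finish-up transition with S = E, and the stack is a Python list whose first element is the top, matching the "top on the left" convention.

```python
# Transition number -> (input to read or None, symbols to pop, symbols to push, next state).
# Stack symbols are listed top first.
TRANSITIONS = {
    0: (None, ["E"], [], "q"),               # finish-up (assumed: S = E)
    1: (None, ["T", "+", "E"], ["E"], "p"),  # reduce E -> E + T
    2: (None, ["T"], ["E"], "p"),            # reduce E -> T
    3: (None, ["F", "*", "T"], ["T"], "p"),  # reduce T -> T * F
    4: (None, ["F"], ["T"], "p"),            # reduce T -> F
    5: (None, [")", "E", "("], ["F"], "p"),  # reduce F -> (E)
    6: (None, ["id"], ["F"], "p"),           # reduce F -> id
    7: ("id", [], ["id"], "p"),              # shift id
    8: ("(", [], ["("], "p"),                # shift (
    9: (")", [], [")"], "p"),                # shift )
    10: ("+", [], ["+"], "p"),               # shift +
    11: ("*", [], ["*"], "p"),               # shift *
}


def replay(tokens, choices):
    """Apply the chosen transitions in order; return (state, stack, remaining input)."""
    state, stack, rest = "p", [], list(tokens)
    for number in choices:
        read, pop, push, next_state = TRANSITIONS[number]
        if state != "p":
            raise ValueError(f"no transitions leave state {state}")
        if read is not None:
            if not rest or rest[0] != read:
                raise ValueError(f"transition {number} cannot read {read!r}")
            rest = rest[1:]
        if stack[:len(pop)] != pop:
            raise ValueError(f"transition {number} cannot pop {pop}")
        stack = push + stack[len(pop):]
        state = next_state
    return state, stack, rest


def accepts(tokens, choices):
    """True if the run ends in q with an empty stack and no input left."""
    state, stack, rest = replay(tokens, choices)
    return state == "q" and not stack and not rest


SLIDE_TRACE = [7, 6, 4, 2, 10, 7, 6, 4, 11, 7, 6, 3, 1, 0]
print(accepts("id + id * id".split(), SLIDE_TRACE))  # prints True
```

Replacing the shift of `*` (transition 11) with an early reduction by transition 2 leaves `E + E` on the stack, which no reduce transition matches. That is one concrete instance of the shift/reduce choice the slides mention.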

Acceptance by PDA → derived from CFG. Much more complex than the other direction. Nonterminals in the grammar that we build from the PDA M are based on a combination of M's states and stack symbols. It gets very messy. Takes 9½ dense pages in the textbook (265-274). I think we can use our limited course time better.

How Many Context-Free Languages Are There? (we had a slide just like this for regular languages) Theorem: For any finite input alphabet Σ, there is a countably infinite number of CFLs over Σ. Proof: ● Upper bound: we can lexicographically enumerate all the CFGs. ● Lower bound: Each of {a}, {aa}, {aaa}, … is a CFL. The number of languages over Σ is uncountable. Thus there are more languages than there are context-free languages. So there must be some languages that are not context-free.

Languages That Are and Are Not Context-Free. a*b* is regular. AnBn = {a^n b^n : n ≥ 0} is context-free but not regular. AnBnCn = {a^n b^n c^n : n ≥ 0} is not context-free. We will show this soon. Is every regular language also context-free? Yes. A regular grammar is a CFG. An FSM is a PDA that does not use its stack.

Showing that L is Context-Free. Techniques for showing that a language L is context-free: 1. Exhibit a CFG for L. 2. Exhibit a PDA for L. 3. Use the closure properties of context-free languages. Unfortunately, these are weaker than they are for regular languages. CFLs are closed under: union, reverse, concatenation, Kleene star, and intersection of a CFL with a regular language. CFLs are NOT closed under: intersection, complement, set difference.

CFL Pumping Theorem

Show that L is Not Context-Free Recall the basis for the pumping theorem for regular languages: A DFSM M. If a string is longer than the number of M's states… Why would it be hard to use a PDA to show that long strings from a CFL can be pumped? For a CFL, we will use the grammar instead of the PDA, but the idea is the same: If the string is long enough, something is repeated, and can be pumped.

Some Tree Geometry Basics. The height h of a tree is the length of the longest path from the root to any leaf. The branching factor b of a tree is the largest number of children associated with any node in the tree. Theorem: The length of the yield (concatenation of leaf nodes) of any tree T with height h and branching factor b is ≤ b^h. Shown in CSSE 230. We are not talking about binary trees here; we are talking about trees where different nodes may have different numbers of children.
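A small sketch can make the bound concrete. The tree encoding below is an assumption made for illustration: a leaf is a string, and an interior node is a pair (label, list of children) with at least one child. The example tree is a hypothetical parse tree whose nodes have either two or three children.

```python
def height(tree):
    """Length of the longest root-to-leaf path; a lone leaf has height 0."""
    if isinstance(tree, str):
        return 0
    _label, children = tree
    return 1 + max(height(child) for child in children)


def branching_factor(tree):
    """Largest number of children of any node in the tree."""
    if isinstance(tree, str):
        return 0
    _label, children = tree
    return max([len(children)] + [branching_factor(child) for child in children])


def yield_of(tree):
    """Concatenation of the leaf labels, left to right."""
    if isinstance(tree, str):
        return tree
    _label, children = tree
    return "".join(yield_of(child) for child in children)


# Hypothetical example tree: two nodes have three children, one has two.
TREE = ("S", ["a", ("S", ["a", ("S", ["a", "b"]), "b"]), "b"])

h, b, w = height(TREE), branching_factor(TREE), yield_of(TREE)
print(h, b, w, len(w) <= b ** h)
```

Here h = 3 and b = 3, so the bound allows a yield of up to 27 symbols, while the actual yield has only 6. The bound is tight only when every interior node has exactly b children and every leaf sits at depth h.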

A Review of Parse Trees. A parse tree (a.k.a. derivation tree) derived from a grammar G = (V, Σ, R, S) is a rooted, ordered tree in which: ● Every leaf node is labeled with an element of Σ ∪ {ε}, ● The root node is labeled S, ● Every interior node is labeled with an element of N (i.e., V − Σ), ● If m is a non-leaf node labeled X and the children of m (left-to-right on the tree) are labeled x1, x2, …, xn, then the rule X → x1 x2 … xn is in R. Comment on last statement: If this were not true, it would not be a parse tree for a string in L(G).

From Grammars to Trees. Given a context-free grammar G: ● Let n be the number of nonterminal symbols in G. ● Let b be the branching factor of G (the largest number of symbols on the right side of a rule in R). Suppose that a tree T is generated by G and no nonterminal appears more than once on any path from the root. The maximum height of T is: n. The maximum length of T's yield is: b^n.

The Context-Free Pumping Theorem. We use parse trees, not machines, as the basis for our argument. Let L = L(G), and let w ∈ L. Let T be a parse tree for w that has the smallest possible number of nodes among all trees based on a derivation of w from G. Suppose L(G) contains a string w such that |w| is greater than b^n. Recall: if no nonterminal appears twice on a path, then the length of the string is ≤ b^n. So some nonterminal X must appear at least twice on some path of T. [Figure: parse tree of w with two nested occurrences of X, X[1] above X[2].] Pick a path with at least two Xs and let X[1] and X[2] be the two lowest occurrences of X on it. X[1] is the lowest place in the tree for which this happens, i.e., there is no other X in the derivation of x from X[2]. Then w = uvxyz, where: x is the yield from X[2]; v and y are the rest of the yield from X[1]; u and z are the rest of w. The Pumping Theorem is sometimes called the "uvxyz theorem".

The Context-Free Pumping Theorem. There is another derivation in G: S ⇒* uXz ⇒* uxz, in which, at X[1], the nonrecursive rule that leads to x is used instead of the recursive one that leads to vXy. So uxz is also in L(G). [Figure: the derivation of w, modified so that at X[1] we derive x instead of vXy.]

The Context-Free Pumping Theorem. There are infinitely many derivations in G, such as: S ⇒* uXz ⇒* uvXyz ⇒* uvvXyyz ⇒* uvvxyyz. Those derivations produce the strings: uv^2xy^2z, uv^3xy^3z, uv^4xy^4z, … So all of those strings are also in L(G).
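To see pumping on a concrete string, here is a minimal sketch. It assumes the grammar S → aSb | ε for AnBn, which the slides do not give explicitly, and the helper names `pump` and `in_anbn` are hypothetical. For w = aabb the derivation is S ⇒ aSb ⇒ aaSbb ⇒ aabb, and taking the two lowest occurrences of S as X[1] and X[2] gives u = a, v = a, x = ε, y = b, z = b.

```python
def pump(u, v, x, y, z, q):
    """Return u v^q x y^q z."""
    return u + v * q + x + y * q + z


def in_anbn(s):
    """Membership test for AnBn = {a^n b^n : n >= 0}."""
    n = len(s) // 2
    return s == "a" * n + "b" * n


# Decomposition of w = aabb read off the parse tree (assumed grammar S -> aSb | epsilon).
PARTS = ("a", "a", "", "b", "b")

for q in range(5):
    pumped = pump(*PARTS, q)
    print(q, pumped, in_anbn(pumped))
```

With q = 0 the recursive rule at X[1] is skipped, which corresponds to the string uxz; q = 1 gives back w itself, and larger q repeats the recursive step.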

The Context-Free Pumping Theorem. If rule1 is X → Xa, we could have v = ε. If rule1 is X → aX, we could have y = ε. But it is not possible that both v and y are ε. If they were, then the derivation S ⇒* uXz ⇒* uxz would also yield w and it would create a parse tree with fewer nodes. But that contradicts the assumption that we started with a parse tree for w with the smallest possible number of nodes.

The Context-Free Pumping Theorem. The height of the subtree rooted at X[1] is at most: n + 1. So |vxy| ≤ b^(n+1).

The Context-Free Pumping Theorem. If L is a context-free language, then ∃k ≥ 1 (∀ strings w ∈ L, where |w| ≥ k (∃u, v, x, y, z (w = uvxyz, vy ≠ ε, |vxy| ≤ k, and ∀q ≥ 0 (uv^q x y^q z is in L)))). Write it in contrapositive form. Try to do this before going on. Now we want to write it in contrapositive form, so we can use it to show a language is NOT context-free: If ∀k ≥ 1 (∃ string w ∈ L, where |w| ≥ k (∀u, v, x, y, z (if w = uvxyz, vy ≠ ε, and |vxy| ≤ k, then ∃q ≥ 0 (uv^q x y^q z is not in L)))), then L is not context-free.

Pumping Theorem contrapositive. We want to write it in contrapositive form, so we can use it to show a language is NOT context-free. Original: If L is a context-free language, then ∃k ≥ 1 (∀ strings w ∈ L, where |w| ≥ k (∃u, v, x, y, z (w = uvxyz, vy ≠ ε, |vxy| ≤ k, and ∀q ≥ 0 (uv^q x y^q z is in L)))). Contrapositive: If ∀k ≥ 1 (∃ string w ∈ L, where |w| ≥ k (∀u, v, x, y, z (if w = uvxyz, vy ≠ ε, and |vxy| ≤ k, then ∃q ≥ 0 (uv^q x y^q z is not in L)))), then L is not a CFL.

Regular vs. CF Pumping Theorems. Similarities: ● We don't get to choose k. ● We choose w, the string to be pumped, based on k. ● We don't get to choose how w is broken up (into xyz or uvxyz). ● We choose a value for q that shows that w isn't pumpable. ● We may apply closure theorems before we start. Things that are different in the CFL Pumping Theorem: ● Two regions, v and y, must be pumped in tandem. ● We don't know anything about where v and y will fall in the string w. All we know is that they are reasonably "close together", i.e., |vxy| ≤ k. ● Either v or y may be empty, but not both.

An Example of Pumping: AnBnCn. AnBnCn = {a^n b^n c^n : n ≥ 0}. Choose w = a^k b^k c^k (we don't get to choose the k). Label the regions of w as 1 | 2 | 3 (all a's, all b's, all c's). If either v or y spans two regions, then let q = 2 (i.e., pump in once). The resulting string will have letters out of order and thus not be in AnBnCn. Other possibilities for (v region, y region): (1, 1): q = 2 gives us more a's than b's or c's. (2, 2) and (3, 3) are similar. (1, 2): q = 2 gives more a's and b's than c's. (2, 3) is similar. (1, 3): impossible because |vxy| must be ≤ k.
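The case analysis above can be sanity-checked by brute force for small k. The sketch below enumerates every split w = uvxyz with vy ≠ ε and |vxy| ≤ k and checks that some q breaks membership. The names `unpumpable` and `in_anbncn` and the choice to try only q = 0 and q = 2 are assumptions made here. Passing for a few values of k is evidence, not a proof, since the theorem quantifies over all k.

```python
def in_anbncn(s):
    """Membership test for AnBnCn = {a^n b^n c^n : n >= 0}."""
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n


def unpumpable(w, k, member, qs=(0, 2)):
    """True if every legal split of w is defeated by some q in qs."""
    n = len(w)
    for i in range(n + 1):
        # vxy = w[i:m] with 1 <= |vxy| <= k
        for m in range(i + 1, min(i + k, n) + 1):
            for j in range(i, m + 1):
                for l in range(j, m + 1):
                    u, v, x, y, z = w[:i], w[i:j], w[j:l], w[l:m], w[m:]
                    if not (v or y):
                        continue  # vy must be nonempty
                    if all(member(u + v * q + x + y * q + z) for q in qs):
                        return False  # this split survives every q tried
    return True


for k in range(1, 6):
    w = "a" * k + "b" * k + "c" * k
    print(k, unpumpable(w, k, in_anbncn))
```

Running the same check on a context-free language such as AnBn with w = a^k b^k returns False, because the split with v = a and y = b around the a/b boundary pumps without leaving the language.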

An Example of Pumping: {a^(n^2) : n ≥ 0}. L = {a^(n^2) : n ≥ 0}. The elements of L:
n   w
1   a^1
2   a^4
3   a^9
4   a^16
5   a^25
6   a^36
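One standard way to finish the argument for this language is sketched below. The slide does not spell it out, so the choice of w = a^(k^2) and of q = 2 are assumptions made here. The idea is that pumping once adds between 1 and k symbols, which is too few to reach the next perfect square.

```latex
\begin{align*}
w &= a^{k^2} = uvxyz, \qquad vy \neq \varepsilon, \qquad |vxy| \le k
  \quad\Longrightarrow\quad 1 \le |vy| \le k, \\
|uv^2xy^2z| &= k^2 + |vy|, \\
k^2 &< k^2 + |vy| \;\le\; k^2 + k \;<\; k^2 + 2k + 1 = (k+1)^2 .
\end{align*}
```

The length of uv^2xy^2z lies strictly between two consecutive squares, so it is not a perfect square and the pumped string is not in L. Since this holds for every legal split, the contrapositive form of the theorem applies.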

Nested and Cross-Serial Dependencies. PalEven = {ww^R : w ∈ {a, b}*}. Example: a a b b a a. The dependencies are nested. Context-free. WcW = {wcw : w ∈ {a, b}*}. Example: a a b c a a b. Cross-serial dependencies. Not context-free.

Work with one or two other students on these: WcW = {wcw : w ∈ {a, b}*}; {(ab)^n a^n b^n : n > 0}; {x#y : x, y ∈ {0, 1}* and x ≠ y}. Answers are on the next two hidden slides.