
1 Context-Free and Noncontext-Free Languages Chapter 13

2 Languages That Are and Are Not Context-Free a*b* is regular. AnBn = {a^n b^n : n ≥ 0} is context-free but not regular. AnBnCn = {a^n b^n c^n : n ≥ 0} is not context-free.

3 Of Languages and Machines

4 The Regular and the CF Languages Theorem: The regular languages are a proper subset of the context-free languages. Proof: In two parts: Every regular language is CF. There exists at least one language that is CF but not regular.

5 The Regular and the CF Languages Lemma: Every regular language is CF. Proof: Every FSM is (trivially) a PDA: Given an FSM M = (K, Σ, Δ, s, A), whose transitions are elements of Δ of the form (p, c, q) (old state, input, new state), construct a PDA M' = (K, Σ, ∅, Δ', s, A). Each (p, c, q) becomes ((p, c, ε), (q, ε)) (old state, input, don’t look at the stack; new state, push nothing onto the stack). In other words, we just don’t use the stack.
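A minimal sketch of this construction in Python (the tuple encodings are assumptions made for illustration, not from the slides): an FSM transition is a (p, c, q) triple, a PDA transition is a ((p, c, pop), (q, push)) pair, and the empty string stands for ε.

```python
# Sketch of the FSM-to-PDA construction, under the assumed encodings above.
def fsm_to_pda(K, Sigma, Delta, s, A):
    """Wrap an FSM as a PDA that never inspects or changes its stack."""
    pda_delta = {((p, c, ""), (q, "")) for (p, c, q) in Delta}
    return (K, Sigma, set(), pda_delta, s, A)   # empty stack alphabet

# Example: a two-state FSM for a*b* becomes an equivalent stack-ignoring PDA.
fsm_delta = {("q0", "a", "q0"), ("q0", "b", "q1"), ("q1", "b", "q1")}
pda = fsm_to_pda({"q0", "q1"}, {"a", "b"}, fsm_delta, "q0", {"q0", "q1"})
```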

6 There Exists at Least One Language that is CF but Not Regular Lemma: There exists at least one language that is CF but not regular. Proof: {a^n b^n : n ≥ 0} is context-free but not regular. So the regular languages are a proper subset of the context-free languages.

7 How Many Context-Free Languages Are There? Theorem: There is a countably infinite number of CFLs. Proof: ● Upper bound: we can lexicographically enumerate all the CFGs. ● Lower bound: {a}, {aa}, {aaa}, … are all CFLs.

8 How Many Context-Free Languages Are There? There is an uncountable number of languages. Thus there are more languages than there are context-free languages. So there must exist some languages that are not context-free. Example: {a^n b^n c^n : n ≥ 0}

9 Showing that L is Context-Free Techniques for showing that a language L is context-free: 1. Exhibit a context-free grammar for L. 2. Exhibit a PDA for L. 3. Use the closure properties of context-free languages. Unfortunately, these are weaker than they are for regular languages.

10 Showing that L is Not Context-Free Remember the pumping argument for regular languages:

11 A Review of Parse Trees A parse tree, derived by a grammar G = (V, Σ, R, S), is a rooted, ordered tree in which: ● Every leaf node is labeled with an element of Σ ∪ {ε}, ● The root node is labeled S, ● Every other node is labeled with some element of V - Σ, ● If m is a nonleaf node labeled X and the children of m are labeled x1, x2, …, xn, then the rule X → x1 x2 … xn is in R.

12 Some Tree Basics The height of a tree is the length of the longest path from the root to any leaf. The branching factor of a tree is the largest number of daughter nodes associated with any node in the tree. Theorem: The length of the yield (the number of leaf nodes) of any tree T with height h and branching factor b is ≤ b^h.

13 From Grammars to Trees Given a context-free grammar G: ● Let n be the number of nonterminal symbols in G. ● Let b be the branching factor of G. Suppose that T is generated by G and no nonterminal appears more than once on any path. Then: The maximum height of T is n. The maximum length of T’s yield is b^n.

14 The Context-Free Pumping Theorem This time we use parse trees, not automata, as the basis for our argument. If w is “long”, then any parse tree for it must contain a path on which some nonterminal X appears at least twice, so that w = uvxyz, where x is the yield of the subtree rooted at the lower X and vxy is the yield of the subtree rooted at the upper X. Call the upper occurrence of X [1], the (recursive) rule applied at [1] rule 1, and the rule applied at the lower occurrence rule 2. Choose one such tree such that there’s no other with fewer nodes.

15 The Context-Free Pumping Theorem There is another derivation in G: S ⇒* uXz ⇒* uxz, in which, at the point labeled [1], the nonrecursive rule 2 is used. So uxz is also in L(G).

16 The Context-Free Pumping Theorem There are infinitely many derivations in G, such as: S ⇒* uXz ⇒* uvXyz ⇒* uvvXyyz ⇒* uvvxyyz Those derivations produce the strings: uv^2xy^2z, uv^3xy^3z, … So all of those strings are also in L(G).

17 The Context-Free Pumping Theorem If rule 1 = X → Xa, we could get v = ε. If rule 1 = X → aX, we could get y = ε. But it is not possible that both v and y are ε. If they were, then the derivation S ⇒* uXz ⇒* uxz would also yield w and it would create a parse tree with fewer nodes. But that contradicts the assumption that we started with a tree with the smallest possible number of nodes.

18 The Context-Free Pumping Theorem The height of the subtree rooted at [1] is at most:

19 The Context-Free Pumping Theorem The height of the subtree rooted at [1] is at most: n + 1 So |vxy| ≤ b^(n+1).

20 The Context-Free Pumping Theorem If L is a context-free language, then ∃k ≥ 1 (∀ strings w ∈ L, where |w| ≥ k (∃u, v, x, y, z (w = uvxyz, vy ≠ ε, |vxy| ≤ k and ∀q ≥ 0 (uv^qxy^qz is in L)))).
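As a small illustration (not from the slides), the pumping operation itself is just string arithmetic on a chosen decomposition of w:

```python
# Build u v^q x y^q z for a chosen decomposition of w and a pump count q.
def pump(u: str, v: str, x: str, y: str, z: str, q: int) -> str:
    return u + v * q + x + y * q + z

# For w = aaabbb in {a^n b^n : n >= 0}, the split u=aa, v=a, x='', y=b, z=bb
# satisfies the theorem: every pumped version stays in the language.
assert pump("aa", "a", "", "b", "bb", 0) == "aabb"
assert pump("aa", "a", "", "b", "bb", 3) == "aaaaabbbbb"
```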

21 What Is k? k serves two roles: ● How long must w be to guarantee it is pumpable? ● What’s the bound on |vxy|? Let n be the number of nonterminals in G. Let b be the branching factor of G.

22 How Long Must w Be? If height(T) > n, then some nonterminal occurs more than once on some path. So T is pumpable. If height(T) ≤ n, then |uvxyz| ≤ b^n. So if |uvxyz| > b^n, w = uvxyz must be pumpable.

23 What’s the Bound on |vxy|? Assume that we are considering the bottom-most two instances of a repeated nonterminal. Then the yield of the upper one has length at most b^(n+1). Assuming b ≥ 2, b^(n+1) > b^n. So let k = b^(n+1).

24 The Context-Free Pumping Theorem If L is a context-free language, then ∃k ≥ 1, such that ∀ strings w ∈ L, where |w| ≥ k, ∃u, v, x, y, z, such that: w = uvxyz, and vy ≠ ε, and |vxy| ≤ k, and ∀q ≥ 0, uv^qxy^qz is in L. Proof: L is generated by some CFG G = (V, Σ, R, S) with n nonterminal symbols and branching factor b. Let k be b^(n+1). The longest string that can be generated by G with no repeated nonterminals in the resulting parse tree has length b^n. Assuming that b ≥ 2, it must be the case that b^(n+1) > b^n. So let w be any string in L(G) where |w| ≥ k. Let T be any smallest parse tree for w. T must have height at least n + 1. Choose some path in T of length at least n + 1. Let X be the bottom-most repeated nonterminal along that path. Then w can be rewritten as uvxyz. The tree rooted at [1] has height at most n + 1. Thus its yield, vxy, has length less than or equal to b^(n+1), which is k. vy ≠ ε, since if vy were ε then there would be a smaller parse tree for w, and we chose T so that that wasn’t so. uxz must be in L because rule 2 could have been used immediately at [1]. For any q ≥ 1, uv^qxy^qz must be in L because rule 1 could have been used q times before finally using rule 2.
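A small sketch of how the constant k falls out of a concrete grammar; the dictionary encoding of the rules below is an assumption made for illustration:

```python
# Compute k = b^(n+1) for a grammar given as {nonterminal: [rhs tuples]}.
def pumping_constant(rules):
    n = len(rules)                                   # number of nonterminals
    b = max(len(rhs) for rhss in rules.values() for rhs in rhss)  # branching factor
    return b ** (n + 1)

# Balanced parentheses: S -> (S) | SS | epsilon.  Here n = 1, b = 3, so k = 9.
balanced = {"S": [("(", "S", ")"), ("S", "S"), ()]}
print(pumping_constant(balanced))                    # 9
```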

25 Regular vs CF Pumping Theorems Similarities: ● We choose w, the string to be pumped. ● We choose a value for q that shows that w isn’t pumpable. ● We may apply closure theorems before we start. Differences: ● Two regions, v and y, must be pumped in tandem. ● We don’t know anything about where in the strings v and y will fall. All we know is that they are reasonably “close together”, i.e., |vxy| ≤ k. ● Either v or y could be empty, although not both.

26 An Example of Pumping: AnBnCn AnBnCn = {a^n b^n c^n : n ≥ 0}

27 An Example of Pumping: AnBnCn AnBnCn = {a^n b^n c^n : n ≥ 0} Choose w = a^k b^k c^k, with the a’s, b’s, and c’s forming regions 1, 2, and 3.

28 An Example of Pumping: AnBnCn AnBnCn = {a^n b^n c^n : n ≥ 0} Choose w = a^k b^k c^k, with the a’s, b’s, and c’s forming regions 1, 2, and 3. If either v or y spans regions, then let q = 2 (i.e., pump in once). The resulting string will have letters out of order and thus not be in AnBnCn. If both v and y each contain only one distinct character, then set q to 2. Additional copies of at most two different characters are added, leaving the third unchanged. There are no longer equal numbers of the three letters, so the resulting string is not in AnBnCn.
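This case analysis can be double-checked by brute force for a small test size (the k below is just a test value, not the theorem’s constant); the sketch is an illustration, not part of the slides:

```python
def in_AnBnCn(s):
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n

def some_split_survives(w, k, q=2):
    """True if some w = uvxyz with |vxy| <= k and vy != '' keeps
    u v^q x y^q z inside AnBnCn, i.e. the pumping attack would fail."""
    for i in range(len(w) + 1):                        # start position of v
        for lv in range(k + 1):
            for lx in range(k - lv + 1):
                for ly in range(k - lv - lx + 1):
                    if lv + ly == 0 or i + lv + lx + ly > len(w):
                        continue
                    u, v = w[:i], w[i:i + lv]
                    x = w[i + lv:i + lv + lx]
                    y = w[i + lv + lx:i + lv + lx + ly]
                    z = w[i + lv + lx + ly:]
                    if in_AnBnCn(u + v * q + x + y * q + z):
                        return True
    return False

k = 4
assert not some_split_survives("a" * k + "b" * k + "c" * k, k)
```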

29 An Example of Pumping: {a^(n^2) : n ≥ 0} L = {a^(n^2) : n ≥ 0}. The elements of L: n = 0: ε; n = 1: a^1; n = 2: a^4; n = 3: a^9; n = 4: a^16; n = 5: a^25; n = 6: a^36.

30 An Example of Pumping: {a^(n^2) : n ≥ 0} L = {a^(n^2) : n ≥ 0}. If n = k^2, then n^2 = k^4. Let w = a^(k^4).

31 An Example of Pumping: {a^(n^2) : n ≥ 0} L = {a^(n^2) : n ≥ 0}. If n = k^2, then n^2 = k^4. Let w = a^(k^4). vy = a^p, for some nonzero p. Set q to 2. The resulting string, s, is a^(k^4 + p). It must be in L. But it isn’t, because it is too short: w has (k^2)^2 = k^4 a’s, while the next longer string in L has (k^2 + 1)^2 = k^4 + 2k^2 + 1 a’s. For s to be in L, p = |vy| would have to be at least 2k^2 + 1. But |vxy| ≤ k, so p can’t be that large. Thus s is not in L and L is not context-free.
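A quick numeric sanity check of that gap argument (the range of k below is just for testing): adding any p with 1 ≤ p ≤ k to k^4 never reaches the next perfect square, which is 2k^2 + 1 away.

```python
from math import isqrt

def is_square(m):
    return isqrt(m) ** 2 == m

# For w = a^(k^4) and |vy| = p <= k, the q = 2 string has k^4 + p a's.
for k in range(2, 50):
    assert all(not is_square(k ** 4 + p) for p in range(1, k + 1))
```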

32 Another Example of Pumping L = {a^n b^m a^n : n, m ≥ 0 and n ≥ m}. Let w = a^k b^k a^k.

33 Another Example of Pumping L = {a^n b^m a^n : n, m ≥ 0 and n ≥ m}. Let w = a^k b^k a^k: aaa…aaa bbb…bbb aaa…aaa, regions | 1 | 2 | 3 |.

34 Nested and Cross-Serial Dependencies PalEven = {ww^R : w ∈ {a, b}*}, e.g. a a b b a a: the dependencies are nested. WcW = {wcw : w ∈ {a, b}*}, e.g. a a b c a a b: cross-serial dependencies.

35 WcW = {wcw : w ∈ {a, b}*}

36 WcW = {wcw : w ∈ {a, b}*} Let w = a^k b^k c a^k b^k, with regions 1 through 5: a^k | b^k | c | a^k | b^k. Call the part before c the left side and the part after c the right side. ● If v or y overlaps region 3, set q to 0. The resulting string will no longer contain a c. ● If both v and y occur before region 3 or they both occur after region 3, then set q to 2. One side will be longer than the other. ● If either v or y overlaps region 1, then set q to 2. In order to make the right side match, something would have to be pumped into region 4. That violates |vxy| ≤ k. ● If either v or y overlaps region 2, then set q to 2. In order to make the right side match, something would have to be pumped into region 5. That violates |vxy| ≤ k.

37 Variable Declaration and Use WcW = {wcw : w ∈ {a, b}*}. string winniethepooh; winniethepooh = “bearofverylittlebrain”;

38 Closure Theorems for Context-Free Languages The context-free languages are closed under: ● Union ● Concatenation ● Kleene star ● Reverse ● Letter substitution

39 Closure Under Union Let G1 = (V1, Σ1, R1, S1), and G2 = (V2, Σ2, R2, S2). Assume that G1 and G2 have disjoint sets of nonterminals, not including S. Let L = L(G1) ∪ L(G2). We can show that L is CF by exhibiting a CFG for it:

40 Closure Under Union Let G1 = (V1, Σ1, R1, S1), and G2 = (V2, Σ2, R2, S2). Assume that G1 and G2 have disjoint sets of nonterminals, not including S. Let L = L(G1) ∪ L(G2). We can show that L is CF by exhibiting a CFG for it: G = (V1 ∪ V2 ∪ {S}, Σ1 ∪ Σ2, R1 ∪ R2 ∪ {S → S1, S → S2}, S)

41 Closure Under Concatenation Let G1 = (V1, Σ1, R1, S1), and G2 = (V2, Σ2, R2, S2). Assume that G1 and G2 have disjoint sets of nonterminals, not including S. Let L = L(G1)L(G2). We can show that L is CF by exhibiting a CFG for it:

42 Closure Under Concatenation Let G1 = (V1, Σ1, R1, S1), and G2 = (V2, Σ2, R2, S2). Assume that G1 and G2 have disjoint sets of nonterminals, not including S. Let L = L(G1)L(G2). We can show that L is CF by exhibiting a CFG for it: G = (V1 ∪ V2 ∪ {S}, Σ1 ∪ Σ2, R1 ∪ R2 ∪ {S → S1 S2}, S)

43 Closure Under Kleene Star Let G1 = (V1, Σ1, R1, S1). Assume that G1 does not have the nonterminal S. Let L = L(G1)*. We can show that L is CF by exhibiting a CFG for it:

44 Closure Under Kleene Star Let G1 = (V1, Σ1, R1, S1). Assume that G1 does not have the nonterminal S. Let L = L(G1)*. We can show that L is CF by exhibiting a CFG for it: G = (V1 ∪ {S}, Σ1, R1 ∪ {S → ε, S → S S1}, S)
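The three constructions above (union, concatenation, Kleene star) can be written down directly. The encoding below is an assumption made for illustration: a grammar is (nonterminals, terminals, rules, start), rules map each nonterminal to a list of right-hand-side tuples, the nonterminal sets are disjoint, and "S" is assumed fresh.

```python
def union(G1, G2):
    (V1, Sig1, R1, s1), (V2, Sig2, R2, s2) = G1, G2
    return (V1 | V2 | {"S"}, Sig1 | Sig2,
            {**R1, **R2, "S": [(s1,), (s2,)]}, "S")      # S -> S1 | S2

def concat(G1, G2):
    (V1, Sig1, R1, s1), (V2, Sig2, R2, s2) = G1, G2
    return (V1 | V2 | {"S"}, Sig1 | Sig2,
            {**R1, **R2, "S": [(s1, s2)]}, "S")          # S -> S1 S2

def star(G1):
    V1, Sig1, R1, s1 = G1
    return (V1 | {"S"}, Sig1,
            {**R1, "S": [(), ("S", s1)]}, "S")           # S -> eps | S S1

# Example: G1 generates {a^n b^n : n >= 0}, G2 generates c*.
G1 = ({"X"}, {"a", "b"}, {"X": [("a", "X", "b"), ()]}, "X")
G2 = ({"Y"}, {"c"}, {"Y": [("c", "Y"), ()]}, "Y")
G_union, G_concat, G_star = union(G1, G2), concat(G1, G2), star(G1)
```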

45 Closure Under Reverse L^R = {w ∈ Σ* : w = x^R for some x ∈ L}. Let G = (V, Σ, R, S) be in Chomsky normal form. Every rule in G is of the form X → BC or X → a, where X, B, and C are elements of V - Σ and a ∈ Σ. ● X → a: L(X) = {a}. {a}^R = {a}. ● X → BC: L(X) = L(B)L(C). (L(B)L(C))^R = L(C)^R L(B)^R. Construct, from G, a new grammar G', such that L(G') = L^R: G' = (V, Σ, R', S), where R' is constructed as follows: ● For every rule in G of the form X → BC, add to R' the rule X → CB. ● For every rule in G of the form X → a, add to R' the rule X → a.
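A sketch of this reversal for a grammar already in Chomsky normal form, with an assumed dict encoding (each right-hand side is either a (B, C) pair of nonterminals or a one-element tuple holding a terminal):

```python
def reverse_cnf(rules):
    """Swap B and C in every X -> B C rule; leave X -> a rules alone."""
    rev = {}
    for X, rhss in rules.items():
        for rhs in rhss:
            rev.setdefault(X, []).append((rhs[1], rhs[0]) if len(rhs) == 2 else rhs)
    return rev

# A CNF grammar for {a^n b^n : n >= 1}: S -> AT | AB, T -> SB, A -> a, B -> b.
G = {"S": [("A", "T"), ("A", "B")], "T": [("S", "B")],
     "A": [("a",)], "B": [("b",)]}
print(reverse_cnf(G)["S"])   # [('T', 'A'), ('B', 'A')]; the new grammar generates b^n a^n
```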

46 What About Intersection and Complement? Closure under complement implies closure under intersection, since: L1 ∩ L2 = ¬(¬L1 ∪ ¬L2) But are the CFLs closed under either complement or intersection? We proved closure for regular languages two different ways: 1. Given a DFSM for L, construct a DFSM for ¬L by swapping accepting and rejecting states. If a class is closed under complement and union, it must be closed under intersection. 2. Given automata for L1 and L2, construct an automaton for L1 ∩ L2 by simulating the parallel operation of the two original machines, using states that are the Cartesian product of the sets of states of the two original machines. Does either work here?

47 Closure Under Intersection The context-free languages are not closed under intersection: The proof is by counterexample. Let: L1 = {a^n b^n c^m : n, m ≥ 0} /* equal a’s and b’s. L2 = {a^m b^n c^n : n, m ≥ 0} /* equal b’s and c’s. Both L1 and L2 are context-free, since there exist straightforward context-free grammars for them. But now consider: L = L1 ∩ L2 =

48 Closure Under Intersection The context-free languages are not closed under intersection: The proof is by counterexample. Let: L1 = {a^n b^n c^m : n, m ≥ 0} /* equal a’s and b’s. L2 = {a^m b^n c^n : n, m ≥ 0} /* equal b’s and c’s. Both L1 and L2 are context-free, since there exist straightforward context-free grammars for them. But now consider: L = L1 ∩ L2 = {a^n b^n c^n : n ≥ 0}

49 Closure Under Complement L1 ∩ L2 = ¬(¬L1 ∪ ¬L2) The context-free languages are closed under union, so if they were closed under complement, they would be closed under intersection (which they are not).

50 Closure Under Complement: An Example ¬AnBnCn is context-free: But ¬(¬AnBnCn) = AnBnCn is not context-free.

51 Closure Under Difference Are the context-free languages closed under difference?

52 Closure Under Difference Are the context-free languages closed under difference? ¬L = Σ* - L. Σ* is context-free. So, if the context-free languages were closed under difference, the complement of any context-free language would necessarily be context-free. But we just showed that that is not so.

53 The Intersection of a Context-Free Language and a Regular Language is Context-Free L = L(M1), a PDA M1 = (K1, Σ, Γ1, Δ1, s1, A1). R = L(M2), a deterministic FSM M2 = (K2, Σ, δ, s2, A2). We construct a new PDA, M3, that accepts L ∩ R by simulating the parallel execution of M1 and M2. M3 = (K1 × K2, Σ, Γ1, Δ, (s1, s2), A1 × A2). Insert into Δ: For each rule ((q1, a, β), (p1, γ)) in Δ1 and each rule (q2, a, p2) in δ, add (((q1, q2), a, β), ((p1, p2), γ)). For each rule ((q1, ε, β), (p1, γ)) in Δ1 and each state q2 in K2, add (((q1, q2), ε, β), ((p1, q2), γ)). This works because we can get away with only one stack.
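A sketch of this product construction with assumed transition encodings (a PDA transition is ((q1, a, pop), (p1, push)) with "" standing for ε; an FSM transition is a (q2, a, p2) triple):

```python
def intersect_pda_fsm(pda, fsm):
    K1, Sigma, Gamma, D1, s1, A1 = pda
    K2, _, D2, s2, A2 = fsm
    delta = set()
    for ((q1, a, pop), (p1, push)) in D1:
        if a == "":                           # epsilon move: the FSM stays put
            delta |= {(((q1, q2), "", pop), ((p1, q2), push)) for q2 in K2}
        else:                                 # both machines consume a
            delta |= {(((q1, q2), a, pop), ((p1, p2), push))
                      for (q2, b, p2) in D2 if b == a}
    states = {(p, q) for p in K1 for q in K2}
    accepting = {(p, q) for p in A1 for q in A2}
    return (states, Sigma, Gamma, delta, (s1, s2), accepting)
```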

54 The Difference between a Context-Free Language and a Regular Language is Context-Free Theorem: The difference (L1 – L2) between a context-free language L1 and a regular language L2 is context-free. Proof: L1 – L2 = L1 ∩ ¬L2. If L2 is regular, then so is ¬L2. If L1 is context-free, so is L1 ∩ ¬L2.

55 An Example: A Finite Number of Exceptions Let: L = {a^n b^n : n ≥ 0 and n ≠ 1776}. Alternatively: L = {a^n b^n : n ≥ 0} – {a^1776 b^1776}. {a^n b^n : n ≥ 0} is context-free. {a^1776 b^1776} is regular. So L is context-free.

56 Don’t Try to Use Closure Backwards One Closure Theorem: If L1 and L2 are context-free, then so is L3 = L1 ∪ L2. But what if L3 and L1 are context-free? What can we say about L2? L3 = L1 ∪ L2.

57 Don’t Try to Use Closure Backwards One Closure Theorem: If L1 and L2 are context-free, then so is L3 = L1 ∪ L2. But what if L3 and L1 are context-free? What can we say about L2? L3 = L1 ∪ L2. Example: a^n b^n c* = a^n b^n c* ∪ a^n b^n c^n.

58 Using the Closure Theorems with the Pumping Theorem Let WW = {ww : w ∈ {a, b}*}. Let’s try pumping: Choose w = (ab)^2k. (Don’t get confused about the two uses of w.) Each half of the chosen string is (ab)^k. But this pumps fine, for example with v = ab and y = ab: every pumped string is still an even power of ab, so it is still in WW.

59 Exploiting Regions WW = {ww : w ∈ {a, b}*}. Choose the string a^k b a^k b (the first half is a^k b and so is the second half). But this also pumps fine: for example, take v to be the last a of the first a region, x the b between them, and y the first a of the second a region; then every pumped string has the form a^(k-1+q) b a^(k-1+q) b, which is still in WW.

60 Make All Regions “Long” WW = {ww : w ∈ {a, b}*}. Choose the string a^k b^k a^k b^k, with regions 1 through 4: a^k | b^k | a^k | b^k (the first half is w and so is the second half). Now we list the possibilities: (1, 1), (2, 2), (3, 3), (4, 4), (1, 2), (2, 3), (3, 4), (1, 3), (1, 4), (2, 4), (1/2, 2), (1/2, 3), (1/2, 4), (1/2, 2/3), … Whenever v or y spans regions, we’ll no longer have a string of the same form, but that’s okay given the definition of L.

61 Using Intersection with a Regular Language WW = {ww : w ∈ {a, b}*}. Recall our last choice of w: a^k b^k a^k b^k, with regions 1 through 4. But let’s consider L' = L ∩

62 Using Intersection with a Regular Language WW = {ww : w ∈ {a, b}*}. But let’s consider L' = L ∩ a*b*a*b*. L' is not context-free. Let w = a^k b^k a^k b^k, with regions 1 through 4: a^k | b^k | a^k | b^k.

63 Another Example L = {w : w can be written as x#y=z, where x, y, z ∈ {0, 1}* and, if x, y, and z are viewed as binary numbers without leading zeros, x × y = z^R}. For example, 100#1111=001111 is in L.
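A quick check of that example; the membership test below is an illustration written from the definition, not code from the slides:

```python
def in_L(s):
    try:
        xy, z = s.split("=")
        x, y = xy.split("#")
        return int(x, 2) * int(y, 2) == int(z[::-1], 2)   # x times y = reverse(z)
    except ValueError:
        return False

print(in_L("100#1111=001111"))   # True: 4 * 15 = 60 = 0b111100, the reverse of z
```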

64 Another Example L = {w : w can be written as x#y=z, where x, y, z ∈ {0, 1}* and, if x, y, and z are viewed as binary numbers without leading zeros, x × y = z^R}. Choose w = 10^k#1^k=0^k1^k, with regions 1 through 7: 1 | 0^k | # | 1^k | = | 0^k | 1^k. Note that w is in L. If L is CF, so is L' = L ∩ 10*#1*=0*1*:

65 Another Example Choose w = 10^k#1^k=0^k1^k, with regions 1 through 7: 1 | 0^k | # | 1^k | = | 0^k | 1^k. L' = L ∩ 10*#1*=0*1* is not CF: ● v or y overlaps region 1, 3, or 5: ● v or y contains the boundary between 6 and 7: ● (2, 2), (4, 4), or (2, 4): ● (6, 6), (7, 7), or (6, 7): ● (4, 6): ● (2, 6), (2, 7), or (4, 7):

66 L = {w ∈ {a, b, c}* : #a(w) = #b(w) = #c(w)} If L were context-free, then L' = L ∩ a*b*c* would also be context-free. But L' = {a^n b^n c^n : n ≥ 0}, which is not context-free. So neither is L.

67 Why Are the Context-Free Languages Not Closed under Complement, Intersection, and Subtraction, but the Regular Languages Are? Given an NDFSM M1, build an FSM M2 such that L(M2) = ¬L(M1): 1. From M1, construct an equivalent deterministic FSM M, using ndfsmtodfsm. 2. If M is described with an implied dead state, add the dead state and all required transitions to it. 3. Begin building M2 by setting it equal to M. Then swap the accepting and the nonaccepting states. So: M2 = (K_M, Σ, δ_M, s_M, K_M - A_M). We could do the same thing for CF languages if we could do step 1, but we can’t. The need for nondeterminism is the key.
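A sketch of steps 2 and 3 for a deterministic FSM, assuming delta is already total (the dead state has been added) and is encoded as a dict from (state, symbol) to state:

```python
def complement_dfsm(K, Sigma, delta, s, A):
    return (K, Sigma, delta, s, K - A)        # swap accepting and rejecting states

def accepts(m, w):
    K, Sigma, delta, s, A = m
    q = s
    for c in w:
        q = delta[(q, c)]
    return q in A

# A DFSM for a* over {a, b}; its complement accepts every string containing a b.
M = ({"q0", "dead"}, {"a", "b"},
     {("q0", "a"): "q0", ("q0", "b"): "dead",
      ("dead", "a"): "dead", ("dead", "b"): "dead"},
     "q0", {"q0"})
print(accepts(M, "aa"), accepts(complement_dfsm(*M), "ab"))   # True True
```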

