Section 12.4 Context-Free Language Topics


1 Section 12.4 Context-Free Language Topics
Algorithm. Remove Λ-productions from grammars for languages without Λ.
1. Find the nonterminals that derive Λ.
2. For each production A → w, construct all productions A → w′, where w′ is obtained from w by removing one or more occurrences of the nonterminals found in Step 1.
3. Combine the original productions with those of Step 2 and eliminate any Λ-productions.
Example. Remove Λ-productions from the grammar
S → ABc
A → aA | Λ
B → bB | Λ.
Solution.
Step 1: The nonterminals A and B derive Λ.
Step 2: From the production S → ABc we construct S → Bc | Ac | c. From the production A → aA we construct A → a. From the production B → bB we construct B → b.
Step 3:
S → ABc | Bc | Ac | c
A → aA | a
B → bB | b.
Quiz. Remove Λ-productions from
S → ABc | Ab | c
A → ABa | Λ
B → Bbc | Λ.
Solution.
S → ABc | Ab | c | Bc | Ac | b
A → ABa | Ba | Aa | a
B → Bbc | bc.
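The three steps of the algorithm can be sketched in Python. This is a minimal sketch rather than anything from the original slides: it assumes every grammar symbol is a single character, that uppercase letters are nonterminals, and that the empty string "" stands for Λ. The function names are hypothetical.

```python
from itertools import combinations


def nullable_nonterminals(grammar):
    """Step 1: collect every nonterminal that derives the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in nullable:
                continue
            # A body derives "" when all of its symbols are nullable;
            # the empty body "" qualifies trivially.
            if any(all(sym in nullable for sym in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable


def remove_lambda_productions(grammar):
    """Steps 2 and 3 for a grammar given as {head: set of body strings}."""
    nullable = nullable_nonterminals(grammar)
    result = {}
    for head, bodies in grammar.items():
        new_bodies = set()
        for body in bodies:
            positions = [i for i, sym in enumerate(body) if sym in nullable]
            # Step 2: drop every subset of the nullable occurrences.
            # The empty subset keeps the original production.
            for r in range(len(positions) + 1):
                for drop in combinations(positions, r):
                    new_bodies.add(
                        "".join(s for i, s in enumerate(body) if i not in drop)
                    )
        # Step 3: eliminate the Λ-production itself.
        new_bodies.discard("")
        result[head] = new_bodies
    return result


example = {"S": {"ABc"}, "A": {"aA", ""}, "B": {"bB", ""}}
print(remove_lambda_productions(example))
```

Applied to the quiz grammar, the same function yields the productions listed in the quiz solution.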

2 Chomsky Normal Form. Productions have one of the following forms
A → a (a is a terminal)
A → BC
S → Λ (if Λ is in the language).
Advantages: Parse trees are binary, which makes them easy to represent. Any string of length n > 0 can be derived in 2n - 1 steps.
Algorithm. Transform a grammar to Chomsky normal form (with the additional property that the start symbol does not occur on the right side of any production).
1. If the start symbol S occurs on some right side, create a new start symbol S′ and a new production S′ → S.
2. Remove each production A → Λ (if A ≠ S) by the previous algorithm. (If S → Λ is removed, add it back.)
3. Remove unit productions (i.e., A → B): If A → B or A ⇒⁺ B, then construct the productions A → w where B → w is not a unit production. Now remove all unit productions.
4. For each production whose right side has two or more symbols, replace all occurrences of each terminal a with a new nonterminal A and add the new production A → a.
5. Replace each production B → C1…Cn with n > 2 by B → C1D, where D → C2…Cn. Repeat this step until no right side is longer than two symbols.
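Steps 4 and 5 are mechanical enough to sketch in Python. The sketch below is an illustration under stated assumptions, not the slides' own code: productions are (head, body) pairs with the body a tuple of symbols, terminals are single lowercase letters, and the fresh nonterminal names T_a and X1, X2, … are hypothetical choices. The input is the quiz grammar from the slides after step 3, with the empty tuple standing for Λ.

```python
def is_terminal(sym):
    # Assumption: terminals are single lowercase letters.
    return len(sym) == 1 and sym.islower()


def chomsky_steps_4_and_5(productions):
    """Apply steps 4 and 5 to a list of (head, body) pairs."""
    # Step 4: in bodies with two or more symbols, replace each terminal
    # a by a fresh nonterminal T_a and add the production T_a -> a.
    term_nt = {}
    step4 = []
    for head, body in productions:
        if len(body) >= 2:
            body = tuple(
                term_nt.setdefault(sym, "T_" + sym) if is_terminal(sym) else sym
                for sym in body
            )
        step4.append((head, body))
    step4 += [(nt, (t,)) for t, nt in term_nt.items()]

    # Step 5: split B -> C1 C2 ... Cn (n > 2) into B -> C1 D, D -> C2 ... Cn.
    # Identical tails share one fresh nonterminal.
    tail_nt = {}
    result = []
    work = list(step4)
    while work:
        head, body = work.pop(0)
        if len(body) > 2:
            tail = body[1:]
            if tail not in tail_nt:
                tail_nt[tail] = "X%d" % (len(tail_nt) + 1)
                work.append((tail_nt[tail], tail))
            body = (body[0], tail_nt[tail])
        result.append((head, body))
    return result


after_step_3 = [
    ("S", ("a", "T", "b", "b")), ("S", ("U", "d")), ("S", ("d",)), ("S", ()),
    ("T", ("c", "T")), ("T", ("c",)),
    ("U", ("U", "d")), ("U", ("d",)),
]
for head, body in chomsky_steps_4_and_5(after_step_3):
    print(head, "->", " ".join(body) or "Λ")
```

Sharing one nonterminal per distinct tail is a design choice of this sketch; creating a new nonterminal for every split would also be correct but produces a larger grammar.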

3 Example. Construct a Chomsky normal form for the grammar
S → aSb | D
D → Dc | Λ.
Solution.
Step 1: Add the production S′ → S.
Step 2: S′ → S | Λ, S → aSb | ab | D, D → Dc | c.
Step 3: S′ → aSb | ab | Dc | c | Λ, S → aSb | ab | Dc | c, D → Dc | c.
Step 4: S′ → ASB | AB | DC | c | Λ, S → ASB | AB | DC | c, D → DC | c, A → a, B → b, C → c.
Step 5: Replace S′ → ASB and S → ASB by S′ → AE, S → AE, and E → SB.
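Step 3 of this example (removing unit productions) can be reproduced with a short Python sketch. It assumes single-character symbols, so the new start symbol S′ is written as Z here purely for convenience; that renaming and the function name are assumptions of the sketch, and "" again stands for Λ.

```python
def remove_unit_productions(grammar):
    """Replace unit productions A -> B by the non-unit productions of B."""
    nonterminals = set(grammar)

    def is_unit(body):
        # A unit production has a body that is exactly one nonterminal.
        return body in nonterminals

    result = {}
    for head in grammar:
        # Collect every nonterminal reachable from head by unit productions.
        reachable = {head}
        stack = [head]
        while stack:
            current = stack.pop()
            for body in grammar[current]:
                if is_unit(body) and body not in reachable:
                    reachable.add(body)
                    stack.append(body)
        # Keep only the non-unit bodies of everything reachable.
        result[head] = {
            body for nt in reachable for body in grammar[nt] if not is_unit(body)
        }
    return result


# Grammar after step 2 of the example, with Z standing in for S′.
after_step_2 = {"Z": {"S", ""}, "S": {"aSb", "ab", "D"}, "D": {"Dc", "c"}}
print(remove_unit_productions(after_step_2))
```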

4 Quiz. Construct a Chomsky normal form for the grammar
S → aTbb | U | Λ
T → cT | c
U → Ud | d.
Solution.
Step 1: No change.
Step 2: No change.
Step 3: Remove the unit production S → U to obtain S → aTbb | Ud | d | Λ.
Step 4: Transform right sides of length at least two into strings of nonterminals.
S → ATBB | UD | d | Λ
T → CT | c
U → UD | d
A → a, B → b, C → c, D → d.
Step 5: Replace S → ATBB with the productions S → AE, E → TF, F → BB.
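A result like this can be checked against the definition of Chomsky normal form. The following Python sketch is one possible checker, assuming single-character symbols, uppercase nonterminals, lowercase terminals, and "" for Λ; the function name is hypothetical. It also enforces the additional property that the start symbol never occurs on a right side.

```python
def is_chomsky_normal_form(grammar, start="S"):
    """Check a grammar given as {head: set of body strings}."""
    for head, bodies in grammar.items():
        for body in bodies:
            if body == "":
                # Λ is allowed only for the start symbol.
                if head != start:
                    return False
            elif len(body) == 1:
                # A single symbol must be a terminal.
                if not body.islower():
                    return False
            elif len(body) == 2:
                # Two symbols must both be nonterminals, neither the start symbol.
                if not body.isupper() or start in body:
                    return False
            else:
                return False
    return True


quiz_solution = {
    "S": {"AE", "UD", "d", ""}, "E": {"TF"}, "F": {"BB"},
    "T": {"CT", "c"}, "U": {"UD", "d"},
    "A": {"a"}, "B": {"b"}, "C": {"c"}, "D": {"d"},
}
print(is_chomsky_normal_form(quiz_solution))
```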

5 Greibach Normal Form. Productions have one of the following forms
A → b (b is a terminal)
A → bD1…Dk
S → Λ (if Λ is in the language).
Advantage: Any string of length n > 0 can be derived in n steps.
Algorithm (idea). Transform a context-free grammar to Greibach normal form.
1. Perform steps 1, 2, and 3 of the Chomsky algorithm.
2. Remove all left recursion, including indirect left recursion, without adding Λ-productions.
3. Make substitutions to transform the grammar into the proper form.
Example. Put the following grammar into Greibach normal form.
S → AB | Ac | d
A → aA | a
B → Ab | c.
Solution: Step 1 (Chomsky steps 1, 2, and 3) and Step 2 are not needed.
Step 3: Replace A in S → AB | Ac | d with aA | a to obtain S → aAB | aB | aAc | ac | d. Replace A in B → Ab | c with the right side of A → aA | a to obtain B → aAb | ab | c. Now add the new productions C → c and D → b and make the appropriate replacements to obtain the proper form:
S → aAB | aB | aAC | aC | d
A → aA | a
B → aAD | aD | c
C → c
D → b.
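The substitution in Step 3 of the example, and a check of the final grammar, can be sketched in Python. This is an illustrative sketch only: it assumes single-character symbols, uppercase nonterminals, lowercase terminals, and "" for Λ, and both function names are hypothetical. It does not attempt Step 2 (left-recursion removal).

```python
def substitute_leading(grammar, head, target):
    """Replace a leading `target` in each body of `head` by target's bodies."""
    new_bodies = set()
    for body in grammar[head]:
        if body.startswith(target):
            new_bodies.update(rep + body[1:] for rep in grammar[target])
        else:
            new_bodies.add(body)
    return new_bodies


def is_greibach_normal_form(grammar, start="S"):
    """Every body is a terminal followed by zero or more nonterminals."""
    for head, bodies in grammar.items():
        for body in bodies:
            if body == "":
                if head != start:
                    return False
            elif not body[0].islower():
                return False
            elif len(body) > 1 and not body[1:].isupper():
                return False
    return True


g = {"S": {"AB", "Ac", "d"}, "A": {"aA", "a"}, "B": {"Ab", "c"}}
print(substitute_leading(g, "S", "A"))
print(substitute_leading(g, "B", "A"))

final = {
    "S": {"aAB", "aB", "aAC", "aC", "d"}, "A": {"aA", "a"},
    "B": {"aAD", "aD", "c"}, "C": {"c"}, "D": {"b"},
}
print(is_greibach_normal_form(final))
```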


7 Properties of Context-Free Languages
When we know some properties of context-free languages, they can help us argue, by way of contradiction (BWOC), that certain languages are not context-free.
The Pumping Lemma. If L is an infinite context-free language, then any grammar for L must be recursive, so there must be derivations of the following form, where u, v, w, x, and y are terminal strings:
S ⇒⁺ uNy
N ⇒⁺ vNx (where v and x are not both Λ)
N ⇒⁺ w.
These derivations lead to derivations like
S ⇒⁺ uNy ⇒⁺ uvNxy ⇒⁺ u v^2 N x^2 y ⇒⁺ u v^k N x^k y ⇒⁺ u v^k w x^k y ∈ L for all k ∈ N.
This is the basis for the Pumping Lemma: There is an integer m > 0 such that if z ∈ L and |z| ≥ m, then z has the form z = uvwxy, where 1 ≤ |vx| ≤ |vwx| ≤ m and u v^k w x^k y ∈ L for all k ∈ N.
Note: The number m depends on the grammar, as we'll see in the following example.

8 Example. Suppose we have the following
grammar for {Λ, bbc} ∪ {abc^n d | n ∈ N}:
S → aNd | bbc | Λ
N → Nc | b.
Here are a few derivations:
S ⇒ aNd ⇒ abd
S ⇒ aNd ⇒ aNcd ⇒ abcd
S ⇒ aNd ⇒ aNcd ⇒ aNccd ⇒ abccd
S ⇒⁺ abc^k d for any k in N.
For this grammar m = 4 can be used in the pumping lemma because any derivation of a string z with |z| ≥ 4 must use the recursive production N → Nc. For example, if |z| = 8 and z = abcccccd, then the pumping lemma factors z = abcccccd = uvwxy, where 1 ≤ |vx| ≤ |vwx| ≤ 4 and u v^k w x^k y ∈ L for all k ∈ N. In this case let u = a, v = Λ, w = b, x = c, and y = ccccd.
Example. The language L = {a^n b^n c^(n+k) | k, n ∈ N} is not context-free.
Proof: Assume, BWOC, that L is context-free. L is infinite, so the pumping lemma applies. Choose z = a^m b^m c^m, where m is the positive integer from the lemma. Then z = a^m b^m c^m = uvwxy, where 1 ≤ |vx| ≤ |vwx| ≤ m and u v^k w x^k y ∈ L for all k ∈ N.
Observe that neither v nor x can contain distinct letters. For example, if v = …a…b…, then v^2 = …a…b……a…b…, which can't appear as a substring of any string in L. So v and x must be strings of repeated occurrences of a single letter. Now since |vwx| ≤ m, there are two possible places in a^m b^m c^m where v and x can occur:
(1) v and x occur in a^m b^m.
(2) v and x occur in b^m c^m.
But we obtain the following contradictions because v and x are not both Λ.
(1) Let k = 2 to obtain u v^2 w x^2 y = a^(m+i) b^(m+j) c^m, where i > 0 or j > 0. So u v^2 w x^2 y ∉ L.
(2) Let k = 0 to obtain uwy = a^m b^(m-i) c^(m-j), where i > 0 or j > 0. So uwy ∉ L.
These contradictions imply that L is not context-free. QED.
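The factorization z = abcccccd = uvwxy given in the first example on this slide can be checked mechanically. The Python sketch below is an illustration, not part of the slides: the membership test encodes the language {Λ, bbc} ∪ {abc^n d | n ∈ N} as a regular expression, and the function names are hypothetical. It only samples k up to 9, so it illustrates the lemma's conclusion rather than proving it.

```python
import re


def pump(u, v, w, x, y, k):
    """Build the pumped string u v^k w x^k y."""
    return u + v * k + w + x * k + y


def in_language(s):
    """Membership in {Λ, bbc} ∪ {a b c^n d | n >= 0}."""
    return s in ("", "bbc") or re.fullmatch(r"abc*d", s) is not None


u, v, w, x, y = "a", "", "b", "c", "ccccd"
print(pump(u, v, w, x, y, 1))
print(all(in_language(pump(u, v, w, x, y, k)) for k in range(10)))
```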

9 Example/Quiz. Prove that the language L = {ss | s ∈ {a, b}*} is not context-free.
Proof: Assume, BWOC, that L is context-free. L is infinite, so the pumping lemma applies. Choose z = a^m b^m a^m b^m, where m is the positive integer from the lemma. Then z = a^m b^m a^m b^m = uvwxy, where 1 ≤ |vx| ≤ |vwx| ≤ m and u v^k w x^k y ∈ L for all k ∈ N.
Now since |vwx| ≤ m, there are three possible places in a^m b^m a^m b^m where v and x can occur:
(1) v and x occur in a^m b^m (on the left of z).
(2) v and x occur in b^m a^m (in the center of z).
(3) v and x occur in a^m b^m (on the right of z).
Notice that v and x can consist only of repetitions of a single letter. For example, in case (1) suppose v = a^i b^j for some i > 0 and j > 0 and x = b^n for some n ≥ 0. Then, letting k = 0, we would obtain uwy = a^(m-i) b^(m-j-n) a^m b^m, which cannot be in L. The argument is similar for the other cases. So v and x must consist only of repetitions of a single letter.
We need to find a contradiction in each of the three cases. We'll do it by using k = 0. The lemma tells us that uwy ∈ L. But we obtain the following contradictions because v and x are not both Λ.
(1) uwy = a^(m-i) b^(m-j) a^m b^m, where either i > 0 or j > 0. So uwy ∉ L.
(2) uwy = a^m b^(m-i) a^(m-j) b^m, where either i > 0 or j > 0. So uwy ∉ L.
(3) uwy = a^m b^m a^(m-i) b^(m-j), where either i > 0 or j > 0. So uwy ∉ L.
Therefore L is not context-free. QED.
Remark: Be careful that the choice of z is not in a context-free sublanguage of L. For example, if we chose z = (ab)^m (ab)^m in the preceding example, we would not get any contradictions.
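The case analysis in this proof can be spot-checked by brute force for small m. The Python sketch below is an illustration under stated assumptions, not a replacement for the proof: it enumerates every factorization z = uvwxy with 1 ≤ |vx| ≤ |vwx| ≤ m and reports those that stay in the language for the sampled values k = 0 and k = 2. The function names are hypothetical.

```python
def in_ss(s):
    """Membership in {ss | s in {a, b}*}."""
    half, odd = divmod(len(s), 2)
    return odd == 0 and s[:half] == s[half:]


def surviving_factorizations(z, m, in_language, ks=(0, 2)):
    """Return all (u, v, w, x, y) with z = uvwxy, 1 <= |vx| <= |vwx| <= m,
    whose pumped strings u v^k w x^k y stay in the language for every k in ks."""
    survivors = []
    n = len(z)
    for start in range(n + 1):                           # u = z[:start]
        for end in range(start, min(start + m, n) + 1):  # vwx = z[start:end]
            u, vwx, y = z[:start], z[start:end], z[end:]
            for i in range(len(vwx) + 1):                # v = vwx[:i]
                for j in range(i, len(vwx) + 1):         # w = vwx[i:j], x = vwx[j:]
                    v, w, x = vwx[:i], vwx[i:j], vwx[j:]
                    if not (v or x):
                        continue                         # v and x are not both empty
                    if all(in_language(u + v * k + w + x * k + y) for k in ks):
                        survivors.append((u, v, w, x, y))
    return survivors


m = 3
z = "a" * m + "b" * m + "a" * m + "b" * m
print(surviving_factorizations(z, m, in_ss))
```

An empty result means every admissible factorization of that z is ruled out by k = 0 or k = 2, which matches the contradictions found in cases (1) to (3). Running the same function on z = (ab)^m (ab)^m with m = 4 returns a non-empty list, in line with the remark about choosing z carefully.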

