# 4.5 Inherently Ambiguous Context-free Languages


A context-free language, such as the language of arithmetic expressions, may be generated by many different CFGs. Some of these CFGs are ambiguous, and some are not. In this section, we show that the language $L = \{a^n b^n c^m d^m \mid n \ge 1, m \ge 1\} \cup \{a^n b^m c^m d^n \mid n \ge 1, m \ge 1\}$ is inherently ambiguous, i.e., there is no unambiguous CFG that generates $L$. To do so, we need to show that every CFG for $L$ gives infinitely many strings of the form $a^n b^n c^n d^n$, $n \ge 1$, two distinct leftmost derivations.
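To make the ambiguity concrete, the following minimal sketch counts parse trees for one particular grammar of $L$. The grammar and the names `GRAMMAR` and `count_parse_trees` are assumptions chosen for illustration; they do not come from the slides. The sketch only shows that this one grammar is ambiguous on the strings $a^n b^n c^n d^n$. It does not prove inherent ambiguity, which is a statement about every grammar for $L$.

```python
from functools import lru_cache

# Hypothetical grammar for L (an assumption for illustration only):
#   S -> A B | C      A -> a A b | a b      B -> c B d | c d
#   C -> a C d | a D d                      D -> b D c | b c
GRAMMAR = {
    "S": [("A", "B"), ("C",)],
    "A": [("a", "A", "b"), ("a", "b")],
    "B": [("c", "B", "d"), ("c", "d")],
    "C": [("a", "C", "d"), ("a", "D", "d")],
    "D": [("b", "D", "c"), ("b", "c")],
}


def count_parse_trees(word, start="S"):
    """Count the parse trees of `word` in GRAMMAR (no epsilon-productions)."""

    @lru_cache(maxsize=None)
    def derive(symbol, i, j):
        # Number of parse trees rooted at `symbol` whose yield is word[i:j].
        if symbol not in GRAMMAR:  # terminal symbol
            return 1 if j - i == 1 and word[i] == symbol else 0
        return sum(match(rhs, i, j) for rhs in GRAMMAR[symbol])

    @lru_cache(maxsize=None)
    def match(rhs, i, j):
        # Number of ways the symbol sequence `rhs` can yield word[i:j].
        if not rhs:
            return 1 if i == j else 0
        head, rest = rhs[0], rhs[1:]
        total = 0
        # Every symbol yields at least one character, so leave room for `rest`.
        for k in range(i + 1, j - len(rest) + 1):
            left = derive(head, i, k)
            if left:
                total += left * match(rest, k, j)
        return total

    return derive(start, 0, len(word))
```

With this grammar, `count_parse_trees("aabbccdd")` returns 2 (one tree through `A B`, one through `C`), while `count_parse_trees("aabbcd")` returns 1. Since each parse tree corresponds to exactly one leftmost derivation, two trees mean two distinct leftmost derivations.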

Lemma 2 Let $(N_i, M_i)$, $1 \le i \le r$, be pairs of sets of integers. (The sets may be finite or infinite.) Let $S_i = \{(n, m) \mid n \in N_i,\ m \in M_i\}$ and let $S = S_1 \cup S_2 \cup \dots \cup S_r$. If every pair of integers $(n, m)$ with $n \ne m$ is in $S$, then $(n, n)$ is in $S$ for all but some finite set of $n$.

Proof: By contradiction. Suppose that there are infinitely many $n$ such that $(n, n)$ is not in $S$. Let $T = \{n \mid (n, n) \notin S\}$, so $T$ is infinite.

$S_r$ is a subset of $S$ ⇒ for every $n$ in $T$, $(n, n)$ is not in $S_r$ ⇒ every $n$ in $T$ is not in $N_r$ or not in $M_r$ ⇒ $T \setminus N_r$ is infinite or $T \setminus M_r$ is infinite. Say $T \setminus N_r$ is infinite (the other case is symmetric). Let $T_r = T \setminus N_r$. Then $T_r$ is infinite, and for all $n$ and $m$ in $T_r$, $(n, m)$ is not in $S_r$.

For all $n$ in $T_r$, $(n, n)$ is not in $S_{r-1}$ ⇒ for all $n$ in $T_r$, $n$ is not in $N_{r-1}$ or not in $M_{r-1}$ ⇒ $T_r \setminus N_{r-1}$ is infinite or $T_r \setminus M_{r-1}$ is infinite. Say $T_r \setminus N_{r-1}$ is infinite. Let $T_{r-1} = T_r \setminus N_{r-1}$. Then $T_{r-1}$ is infinite, and for all $n$ and $m$ in $T_{r-1}$, $(n, m)$ is not in $S_{r-1} \cup S_r$.

Repeating the same argument for $r-2, \dots, 1$, we obtain an infinite set $T_1$ such that for all $n$ and $m$ in $T_1$, $(n, m)$ is not in $S_1 \cup S_2 \cup \dots \cup S_r = S$. Since $T_1$ is infinite, there exist $n \ne m$ in $T_1$, and $(n, m)$ is not in $S$. This contradicts the hypothesis.
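The lemma allows finitely many diagonal pairs to be missing, and a finite analogue shows that this slack can actually occur. The sketch below restricts attention to the universe $\{0, \dots, 2^k - 1\}$ (an assumption, since the lemma itself is about all integers) and builds $2k$ pairs $(N, M)$ from the binary digits of the numbers. The helper names are hypothetical. Two distinct numbers differ in some bit, so every off-diagonal pair is covered, while no diagonal pair is.

```python
def bit_rectangles(k):
    """Build 2k pairs (N, M) of subsets of {0, ..., 2**k - 1}.

    For bit position b and bit value v, N holds the numbers whose bit b
    equals v and M holds the numbers whose bit b differs from v.
    """
    universe = range(2 ** k)
    pairs = []
    for b in range(k):
        for v in (0, 1):
            n_set = {n for n in universe if (n >> b) & 1 == v}
            m_set = {m for m in universe if (m >> b) & 1 != v}
            pairs.append((n_set, m_set))
    return pairs


def covered(pairs, n, m):
    """True if (n, m) lies in some product N x M."""
    return any(n in n_set and m in m_set for n_set, m_set in pairs)


def missing_off_diagonal(pairs, universe):
    """All (n, m) with n != m that no product N x M contains."""
    return [(n, m) for n in universe for m in universe
            if n != m and not covered(pairs, n, m)]


def missing_diagonal(pairs, universe):
    """All n such that (n, n) is in no product N x M."""
    return [n for n in universe if not covered(pairs, n, n)]
```

For `k = 3`, the six pairs cover every off-diagonal pair of `range(8)` and miss all eight diagonal pairs. This is consistent with the lemma: a fixed number $r$ of pairs can miss only finitely many diagonal entries once all off-diagonal entries are covered.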

Lemma 3 Let $G$ be an unambiguous CFG. Then we can construct an unambiguous CFG $G'$ equivalent to $G$ such that $G'$ has no useless symbols or productions, and for every variable $A$ of $G'$ other than the start symbol $S$, we have $A \Rightarrow^* x_1 A x_2$, where $x_1$ and $x_2$ are in $T^*$ and not both $\varepsilon$.

Theorem 8 The CFL $L = \{a^n b^n c^m d^m \mid n \ge 1, m \ge 1\} \cup \{a^n b^m c^m d^n \mid n \ge 1, m \ge 1\}$ is inherently ambiguous.

Proof: By contradiction. Assume that $L$ can be generated by an unambiguous CFG. By Lemma 3, there is an unambiguous CFG $G = (V, T, P, S)$ generating $L$ without useless symbols or productions, such that for each $A$ in $V \setminus \{S\}$, $A \Rightarrow^* x_1 A x_2$ for some $x_1$ and $x_2$ in $T^*$, not both $\varepsilon$.

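For each derivation $A \Rightarrow^* x_1 A x_2$ with $x_1$ and $x_2$ in $T^*$, not both $\varepsilon$, we must have the following, since otherwise repeating the derivation would produce a string that is not in $L$:

- $x_1$ and $x_2$ each consist of only one type of symbol among $a$, $b$, $c$ and $d$.
- $|x_1| = |x_2|$.
- If $A \Rightarrow^* x_1 A x_2$ and $A \Rightarrow^* x_3 A x_4$, then $x_1$ and $x_3$ consist of the same type of symbol, and so do $x_2$ and $x_4$.

Hence one of the following cases holds:

1. $x_1$ consists solely of $a$'s, and $x_2$ solely of $b$'s or solely of $d$'s,
2. $x_1$ consists solely of $b$'s, and $x_2$ solely of $c$'s, or
3. $x_1$ consists solely of $c$'s, and $x_2$ solely of $d$'s.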
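The case analysis above can be written as a small check. This is a minimal sketch; the function name `classify` and its string labels are assumptions introduced for illustration. Given the two strings $x_1$ and $x_2$ of a derivation $A \Rightarrow^* x_1 A x_2$, it returns which of the four symbol combinations applies, or `None` if the pair violates the constraints.

```python
def classify(x1, x2):
    """Return "ab", "ad", "bc" or "cd" for an admissible pair, else None.

    A pair is admissible if both strings are nonempty, have equal length,
    each uses a single symbol, and the two symbols form an allowed case.
    """
    if not x1 or len(x1) != len(x2):
        return None  # lengths must match; with "not both empty" both are nonempty
    if len(set(x1)) != 1 or len(set(x2)) != 1:
        return None  # each string must use only one type of symbol
    label = x1[0] + x2[0]
    return label if label in {"ab", "ad", "bc", "cd"} else None
```

For example, `classify("aa", "dd")` returns `"ad"`, while `classify("a", "c")` and `classify("aa", "b")` both return `None`.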

The variables in $V$ other than $S$ can therefore be divided into four classes, $C_{ab}$, $C_{ad}$, $C_{bc}$, and $C_{cd}$. A variable $A$ is in $C_{ab}$ if $A \Rightarrow^* x_1 A x_2$, where $x_1$ is in $a^*$ and $x_2$ is in $b^*$. The classes $C_{ad}$, $C_{bc}$, and $C_{cd}$ are defined similarly. A derivation containing a variable in $C_{ab}$ or $C_{cd}$ cannot contain a variable in $C_{ad}$ or $C_{bc}$, and vice versa.

Divide $G$ into two grammars, $G_1 = (\{S\} \cup C_{ab} \cup C_{cd}, T, P_1, S)$ and $G_2 = (\{S\} \cup C_{ad} \cup C_{bc}, T, P_2, S)$. $P_1$ contains every production of $P$ that has a variable in $C_{ab}$ or $C_{cd}$, and $P_2$ contains every production of $P$ that has a variable in $C_{ad}$ or $C_{bc}$. In addition, $P_1$ contains all productions from $P$ of the form $S \to a^n b^n c^m d^m$, $n \ne m$, and $P_2$ contains all productions from $P$ of the form $S \to a^n b^m c^m d^n$, $n \ne m$. Productions of the form $S \to a^n b^n c^n d^n$ in $P$ are in neither $P_1$ nor $P_2$.

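Consider $G_1$, and number the productions in $P_1$ of the form $S \to \alpha$ from 1 to $r$. If $S \to \alpha$ is the $i$th production, $1 \le i \le r$, let $N_i$ be the set of all $n$ such that $S \Rightarrow_{G_1} \alpha \Rightarrow^*_{G_1} a^n b^n c^m d^m$ for some $m$, and let $M_i$ be the set of all $m$ such that $S \Rightarrow_{G_1} \alpha \Rightarrow^*_{G_1} a^n b^n c^m d^m$ for some $n$. For all $n$ in $N_i$ and $m$ in $M_i$, we have $S \Rightarrow_{G_1} \alpha \Rightarrow^*_{G_1} a^n b^n c^m d^m$. Since $G_1$ generates every string $a^n b^n c^m d^m$ with $n \ne m$, every pair $(n, m)$ with $n \ne m$ lies in $N_i \times M_i$ for some $i$. By Lemma 2, $G_1$ must generate all but a finite number of the sentences in $\{a^n b^n c^n d^n \mid n \ge 1\}$.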

By the same argument, $G_2$ must generate all but a finite number of the sentences in $\{a^n b^n c^n d^n \mid n \ge 1\}$. Therefore, $L(G_1) \cap L(G_2)$ contains an infinite number of sentences in $\{a^n b^n c^n d^n \mid n \ge 1\}$. Each of these sentences has two distinct derivations in $G$, one in $G_1$ and one in $G_2$. This contradicts the assumption that $G$ is unambiguous.
