7. Properties of Context-Free Languages

1 7. Properties of Context-Free Languages
CIS Automata and Formal Languages – Pei Wang

2 Chomsky normal form
A CFL can be generated by many CFGs.
For every CFL L, the language L − {ɛ} can be generated by a CFG in Chomsky normal form (CNF), where each rule has the form A → BC or A → a, i.e., the body of every rule is either two variables or one terminal.
Every CFG can be converted into CNF in several steps.

3 Removing ɛ-productions
A symbol A is nullable if A ⇒* ɛ, i.e., there is a production A → ɛ, or a production A → B1B2…Bk where B1, B2, …, Bk are all nullable.
If A is nullable, then a production B → CAD gets a variant B → CD, and A is no longer allowed to derive ɛ in B → CAD.
All ɛ-productions can be eliminated by treating every nullable variable this way.

4 Removing ɛ-productions: example
Original grammar:
S → AB
A → aAA | ɛ
B → bBB | ɛ
S, A, and B are all nullable.
New grammar:
S → AB | A | B
A → aAA | aA | a
B → bBB | bB | b
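The example above can be reproduced with a short Python sketch. The representation is an assumption made for illustration, not part of the slides: a grammar is a dict from each variable to a list of bodies, each body is a tuple of symbols, and the empty tuple stands for ɛ. The function names are hypothetical.

```python
from itertools import product

def nullable_symbols(grammar):
    """Return the set of variables that can derive the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in nullable:
                continue
            # An empty body is an epsilon-production; all() is True for it.
            if any(all(sym in nullable for sym in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable

def remove_epsilon(grammar):
    """Return a grammar without epsilon-productions for the same language minus epsilon."""
    nullable = nullable_symbols(grammar)
    result = {}
    for head, bodies in grammar.items():
        new_bodies = set()
        for body in bodies:
            # A nullable symbol may be kept or dropped; any other symbol is always kept.
            choices = [((sym,), ()) if sym in nullable else ((sym,),) for sym in body]
            for pick in product(*choices):
                variant = tuple(s for part in pick for s in part)
                if variant:  # never emit an empty body
                    new_bodies.add(variant)
        result[head] = new_bodies
    return result

# The grammar from the slide: S -> AB, A -> aAA | epsilon, B -> bBB | epsilon
grammar = {
    "S": [("A", "B")],
    "A": [("a", "A", "A"), ()],
    "B": [("b", "B", "B"), ()],
}
new_grammar = remove_epsilon(grammar)
for head, bodies in new_grammar.items():
    print(head, "->", " | ".join("".join(b) for b in sorted(bodies)))
```

Bodies are collected in sets because dropping different occurrences of the same nullable variable (for example either A in aAA) yields the same variant.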

5 Removing unit productions
A unit production has the form A → B, and (A, B) is a unit pair if A ⇒* B using only unit productions.
Unit productions can be removed by expanding the variables involved until the result is no longer a unit production: for each unit pair (A, B), add A → α for every non-unit production B → α.
If there is a cycle of expansion like A → B → C → … → A, then all the variables involved can be merged.

6 Removing unit productions: example
The grammar
I → a | b | Ia | Ib | I0 | I1
F → I | (E)
T → F | T * F
E → T | E + T
changes to (the productions for I are unchanged)
F → a | b | Ia | Ib | I0 | I1 | (E)
T → a | b | Ia | Ib | I0 | I1 | (E) | T * F
E → a | b | Ia | Ib | I0 | I1 | (E) | T * F | E + T
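A minimal sketch of this transformation, assuming the same dict-of-tuples grammar representation (an illustrative choice, not from the slides): first compute all unit pairs, then give each variable A every non-unit body of each B with (A, B) a unit pair. The names `unit_pairs` and `remove_unit` are hypothetical.

```python
def is_unit(body, grammar):
    """A body is a unit body if it consists of exactly one variable."""
    return len(body) == 1 and body[0] in grammar

def unit_pairs(grammar):
    """Return all pairs (A, B) such that A derives B using unit productions only."""
    pairs = {(v, v) for v in grammar}
    changed = True
    while changed:
        changed = False
        for a, b in list(pairs):
            for body in grammar[b]:
                if is_unit(body, grammar) and (a, body[0]) not in pairs:
                    pairs.add((a, body[0]))
                    changed = True
    return pairs

def remove_unit(grammar):
    """Return an equivalent grammar without unit productions."""
    result = {v: set() for v in grammar}
    for a, b in unit_pairs(grammar):
        for body in grammar[b]:
            if not is_unit(body, grammar):
                result[a].add(body)
    return result

# The expression grammar from the slide, one symbol per tuple entry.
i_bodies = [("a",), ("b",), ("I", "a"), ("I", "b"), ("I", "0"), ("I", "1")]
grammar = {
    "I": i_bodies,
    "F": [("I",), ("(", "E", ")")],
    "T": [("F",), ("T", "*", "F")],
    "E": [("T",), ("E", "+", "T")],
}
no_units = remove_unit(grammar)
```

This sketch does not merge variables on a cycle of unit productions; cycles are still handled, because the unit-pair closure terminates once no new pair can be added.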

7 Removing useless symbols
A symbol X is useful if it is both reachable and generating, i.e., S ⇒* αXβ ⇒* w for some terminal string w.
Removing a useless symbol from a grammar does not change the language it generates.
First, eliminate nongenerating symbols and all productions involving such symbols.
Then eliminate unreachable symbols.
The order of these two steps matters.

8 Useless symbols: example
Given CFG:
S → AB | a
A → b
B is not generating, so the grammar becomes
S → a
A → b
Now A is not reachable, so the grammar becomes
S → a
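The two passes can be sketched in Python as follows. This assumes a dict-of-tuples grammar in which every variable appears as a key, so the variable B, which has no productions, is listed with an empty body list. That encoding and the function names are illustrative assumptions.

```python
def generating_symbols(grammar):
    """Variables that derive some terminal string (terminals are not keys)."""
    gen = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head not in gen and any(
                all(s in gen or s not in grammar for s in body) for body in bodies
            ):
                gen.add(head)
                changed = True
    return gen

def reachable_symbols(grammar, start):
    """Variables reachable from the start symbol."""
    seen = {start}
    stack = [start]
    while stack:
        head = stack.pop()
        for body in grammar[head]:
            for s in body:
                if s in grammar and s not in seen:
                    seen.add(s)
                    stack.append(s)
    return seen

def remove_useless(grammar, start):
    gen = generating_symbols(grammar)
    if start not in gen:
        return {}  # the language is empty
    # Step 1: drop nongenerating variables and every production that mentions one.
    step1 = {
        head: [b for b in bodies if all(s in gen or s not in grammar for s in b)]
        for head, bodies in grammar.items()
        if head in gen
    }
    # Step 2: drop variables that are no longer reachable from the start symbol.
    reach = reachable_symbols(step1, start)
    return {head: bodies for head, bodies in step1.items() if head in reach}

# The grammar from the slide: S -> AB | a, A -> b, and B has no productions.
grammar = {"S": [("A", "B"), ("a",)], "A": [("b",)], "B": []}
cleaned = remove_useless(grammar, "S")
```

Running the steps in the opposite order would leave A in place, since A is reachable through S → AB until that production is removed.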

9 CFG to Chomsky normal form
Convert a CFG into CNF (the result is not unique):
Eliminate ɛ-productions
Eliminate unit productions
Eliminate useless symbols
Change non-CNF productions into CNF productions, e.g.,
A → BCD becomes A → BE, E → CD
A → Fg becomes A → FG, G → g
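The last step can be sketched as below, assuming the earlier eliminations have already been applied (no ɛ-productions, no unit productions) and a dict-of-tuples grammar representation. The fresh variable names `X1`, `X2`, … and `T_g` are hypothetical stand-ins for the E and G used in the slide, and the sketch assumes they do not clash with existing variable names.

```python
def to_cnf(grammar):
    """Rewrite long bodies and mixed bodies so that every rule is A -> BC or A -> a."""
    result = {head: [] for head in grammar}
    terminal_var = {}
    counter = 0

    def var_for(terminal):
        # Introduce one fresh variable per terminal, e.g. T_g -> g.
        if terminal not in terminal_var:
            name = "T_" + terminal
            terminal_var[terminal] = name
            result[name] = [(terminal,)]
        return terminal_var[terminal]

    for head, bodies in grammar.items():
        for body in bodies:
            if len(body) == 1:
                # Assumed to be a single terminal, since unit productions are gone.
                result[head].append(body)
                continue
            # Replace terminals inside longer bodies by their fresh variables.
            symbols = [s if s in grammar else var_for(s) for s in body]
            current = head
            # Split A -> B C D ... into A -> B X1, X1 -> C D ...
            while len(symbols) > 2:
                counter += 1
                fresh = "X" + str(counter)
                result[current].append((symbols[0], fresh))
                result[fresh] = []
                current = fresh
                symbols = symbols[1:]
            result[current].append(tuple(symbols))
    return result

# A -> BCD | Fg, with single-terminal rules for the other variables.
grammar = {
    "A": [("B", "C", "D"), ("F", "g")],
    "B": [("b",)],
    "C": [("c",)],
    "D": [("d",)],
    "F": [("f",)],
}
cnf = to_cnf(grammar)
```

Here `cnf["A"]` holds the two rewritten rules, and the fresh variables `X1` and `T_g` carry the rules that play the role of E → CD and G → g.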

10 Greibach normal form
Every nonempty CFL without ɛ can be generated by a grammar in which every production rule has the form A → aα, where a is a terminal and α is a string of zero or more variables.
This form can be obtained from a PDA with a single state that accepts by empty stack.

11 Pumping lemma for CFL
A sufficiently long string must be derived by using the same variable more than once on some path of the parse tree.

12 Pumping lemma for CFL (cont)
A part of the parse tree can be repeated:
S ⇒* uAy
A ⇒* vAx
A ⇒* w
So the string is z = uvwxy, and u v^i w x^i y is also in the language for every i ≥ 0.

13 Languages that are not CFL
The pumping lemma can be used to show that some languages are not CFL:
L = {0^m 1^m 2^m | m ≥ 1}: for the n in the pumping lemma, pick the word z = 0^n 1^n 2^n = uvwxy. Since |vwx| ≤ n and there are n 1's in the middle, vwx cannot contain both 0 and 2, so pumping v and x changes the count of at most two of the three symbols and produces a word not in the language.
To prove that L = {ww} is not a CFL, pump the word 0^n 1^n 0^n 1^n, then discuss the cases.
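For a small n, the argument for {0^m 1^m 2^m} can be checked by brute force. The sketch below enumerates every split z = uvwxy with |vwx| ≤ n and |vx| ≥ 1 and confirms that none of them survives pumping with i = 0 and i = 2. It is a sanity check for one fixed n, not a proof, and the function names are hypothetical.

```python
def in_language(s):
    """Membership in {0^m 1^m 2^m | m >= 1}."""
    m = len(s) // 3
    return m >= 1 and s == "0" * m + "1" * m + "2" * m

def pumpable_splits(z, n):
    """Return every split (u, v, w, x, y) of z with |vwx| <= n and |vx| >= 1
    for which u v^i w x^i y stays in the language for i = 0 and i = 2."""
    good = []
    for a in range(len(z) + 1):                          # u = z[:a]
        for d in range(a, min(a + n, len(z)) + 1):       # vwx = z[a:d]
            for b in range(a, d + 1):                    # v = z[a:b]
                for c in range(b, d + 1):                # w = z[b:c], x = z[c:d]
                    u, v, w, x, y = z[:a], z[a:b], z[b:c], z[c:d], z[d:]
                    if not v and not x:
                        continue                         # |vx| >= 1 is required
                    if all(in_language(u + v * i + w + x * i + y) for i in (0, 2)):
                        good.append((u, v, w, x, y))
    return good

n = 4
z = "0" * n + "1" * n + "2" * n
print(pumpable_splits(z, n))  # an empty list means no split can be pumped
```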

14 Closure properties of CFL
CFLs are closed under the operations of
Substitution (replace a terminal by a CFL)
Union
Concatenation
Closure (* and +)
Reversal
Homomorphism
Inverse homomorphism

15 Closure properties of CFL (cont.)
CFL’s are not closed under complement, intersection, and difference.
Example: {0^n 1^n 2^i | n ≥ 1, i ≥ 1} and {0^i 1^n 2^n | n ≥ 1, i ≥ 1} are both CFL’s, but their intersection {0^n 1^n 2^n | n ≥ 1} is not.
Example: {0,1}* − {ww} is a CFL, but {ww} is not.
The intersection or difference of a CFL and a regular language is a CFL.
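The first example relies on the intersection of the two languages being exactly {0^n 1^n 2^n | n ≥ 1}. A small Python sketch can confirm this identity for all strings up to a bounded length. It only checks which strings lie in the intersection; it says nothing about whether that language is context-free. The helper names are hypothetical.

```python
import re
from itertools import product

def blocks(s):
    """Return the lengths of the 0-, 1-, and 2-blocks, or None if s is not 0+1+2+."""
    m = re.fullmatch(r"(0+)(1+)(2+)", s)
    return tuple(len(g) for g in m.groups()) if m else None

def in_l1(s):
    """Membership in {0^n 1^n 2^i | n >= 1, i >= 1}."""
    b = blocks(s)
    return b is not None and b[0] == b[1]

def in_l2(s):
    """Membership in {0^i 1^n 2^n | n >= 1, i >= 1}."""
    b = blocks(s)
    return b is not None and b[1] == b[2]

def in_target(s):
    """Membership in {0^n 1^n 2^n | n >= 1}."""
    b = blocks(s)
    return b is not None and b[0] == b[1] == b[2]

def check_up_to(max_len):
    """Compare the intersection with the target on every string up to max_len."""
    for length in range(max_len + 1):
        for letters in product("012", repeat=length):
            s = "".join(letters)
            if (in_l1(s) and in_l2(s)) != in_target(s):
                return False
    return True

print(check_up_to(7))
```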

16 Decision properties of CFL’s
[Complexity-related topics will not be covered]
Whether a CFL is empty can be decided by checking whether the start symbol of its grammar is generating.
Whether a string belongs to a CFL can be decided by dynamic programming, building up the string incrementally from its substrings.
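The emptiness test amounts to a fixed-point computation of the generating variables. The sketch below assumes a dict-of-tuples grammar where any symbol that is not a key is a terminal; the function name and the two sample grammars are made up for illustration.

```python
def is_empty(grammar, start):
    """Return True if the grammar generates no terminal string from start."""
    generating = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in generating:
                continue
            # A body is generating if each symbol is a terminal or a generating variable.
            if any(
                all(s in generating or s not in grammar for s in body)
                for body in bodies
            ):
                generating.add(head)
                changed = True
    return start not in generating

# B -> Bb never terminates, so S -> AB cannot produce a terminal string.
empty_grammar = {"S": [("A", "B")], "A": [("a",)], "B": [("B", "b")]}
# Adding S -> a makes the start symbol generating.
nonempty_grammar = {"S": [("A", "B"), ("a",)], "A": [("a",)], "B": [("B", "b")]}

print(is_empty(empty_grammar, "S"), is_empty(nonempty_grammar, "S"))
```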

17 Testing membership in a CFL
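The CYK algorithm: use a CFG in CNF to incrementally find, for every substring, all variables that generate it.
The triangular table is filled bottom-up, where Xij is the set of variables generating the substring from position i to position j. A variable A is put in Xij if there is a production A → BC with B in Xik and C in X(k+1)j for some k with i ≤ k < j.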
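A compact Python sketch of the table-filling procedure follows. It assumes the grammar is already in CNF and stored as a dict from variables to lists of tuple bodies, and it uses 0-based indices, so `table[i][j]` plays the role of X(i+1)(j+1). The sample grammar is a standard textbook-style CNF grammar chosen for illustration, not one taken from the slides.

```python
def cyk(grammar, start, word):
    """Decide whether word is generated by a CNF grammar from start."""
    n = len(word)
    if n == 0:
        return False  # a CNF grammar cannot derive the empty string
    # table[i][j] = set of variables that generate word[i..j] (inclusive)
    table = [[set() for _ in range(n)] for _ in range(n)]
    # Bottom row: substrings of length 1 come from rules A -> a.
    for i, ch in enumerate(word):
        for head, bodies in grammar.items():
            if (ch,) in bodies:
                table[i][i].add(head)
    # Longer substrings: try every split point k and every rule A -> BC.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            for k in range(i, j):
                for head, bodies in grammar.items():
                    for body in bodies:
                        if (
                            len(body) == 2
                            and body[0] in table[i][k]
                            and body[1] in table[k + 1][j]
                        ):
                            table[i][j].add(head)
    return start in table[0][n - 1]

# S -> AB | BC, A -> BA | a, B -> CC | b, C -> AB | a
grammar = {
    "S": [("A", "B"), ("B", "C")],
    "A": [("B", "A"), ("a",)],
    "B": [("C", "C"), ("b",)],
    "C": [("A", "B"), ("a",)],
}
print(cyk(grammar, "S", "baaba"))
```

The three nested loops over length, start position, and split point give the usual cubic running time in the length of the word, with an extra factor for scanning the rules.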

18 Membership decision for CFL

