Lecture 09: Theory of Automata:08 Context Free Grammars.


1 Lecture 09: Theory of Automata:08 Context Free Grammars

2 The Total Language Tree
It is possible to depict the generation of all the words in the language of a CFG simultaneously in one big (possibly infinite) tree.
Definition: For a given CFG, we define a tree with the start symbol S as its root and whose nodes are working strings of terminals and nonterminals. The descendants of each node are all the possible results of applying every applicable production to the working string, one at a time. A string of all terminals is a terminal node in the tree. The resultant tree is called the total language tree of the CFG.

3 Example
Consider the CFG:
S → aa | bX | aXX
X → ab | b
[Figure: the total language tree of this CFG]

4 The total language tree above has only 7 different words. Four of them (abb, aabb, abab, aabab) have two different derivations, because each appears as a terminal node in two different places. However, these words are NOT generated by two different derivation trees: the two derivations differ only in the order in which the two X's are expanded. Hence, the CFG is unambiguous.
[Figure: example derivations]
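The count of words and repeated terminal nodes can be checked with a short Python sketch. It assumes a hypothetical encoding that is not part of the lecture: the grammar is a dictionary, uppercase letters are nonterminals, and the function name `total_language_tree_leaves` is made up for illustration.

```python
from collections import Counter, deque

# Grammar from the example: S -> aa | bX | aXX, X -> ab | b.
# Assumed encoding: dictionary keys are nonterminals, everything else is a terminal.
GRAMMAR = {"S": ["aa", "bX", "aXX"], "X": ["ab", "b"]}

def total_language_tree_leaves(grammar, start="S", max_nodes=10_000):
    """Walk the total language tree breadth-first and count, for every word,
    how many terminal nodes carry it. max_nodes guards against infinite trees."""
    leaves = Counter()
    queue = deque([start])
    visited = 0
    while queue and visited < max_nodes:
        working = queue.popleft()
        visited += 1
        positions = [i for i, ch in enumerate(working) if ch in grammar]
        if not positions:
            leaves[working] += 1  # a string of all terminals is a terminal node
            continue
        # Descendants: apply every applicable production, one at a time.
        for i in positions:
            for rhs in grammar[working[i]]:
                queue.append(working[:i] + rhs + working[i + 1:])
    return leaves

leaves = total_language_tree_leaves(GRAMMAR)
for word in sorted(leaves):
    print(word, leaves[word])
```

For a grammar with an infinite language the walk is cut off after `max_nodes` working strings, so the result is then only a finite prefix of the tree.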

5 Example
Consider the CFG:
S → aSb | bS | a
The language of this CFG is infinite, and so is the total language tree. The tree may get arbitrarily wide as well as infinitely long. Can you draw the beginning part of this total language tree?

6 Semiword
For a given CFG, a semiword is a string of terminals (possibly none) concatenated with exactly one nonterminal (on the right). In general, a semiword has the shape
(terminal)(terminal)…(terminal)(Nonterminal)
e.g. aaaX, abcY, bbY.
A word is a string of terminals only (zero or more terminals).

7 Regular Grammar
Given an FA, there is a CFG that generates exactly the language accepted by the FA. In other words, all regular languages are CFLs.
[Figure: the regular languages drawn as a subset of the CFLs]

8 Creating a CFG from an FA
Step 1: The nonterminals of the CFG are the names of the states of the FA, with the start state renamed S.
Step 2: For every a-edge from state X to state Y, create the production X → aY; for an a-loop at state X, create X → aX. Do the same for b-edges.
Step 3: For every final state X, create the production X → Λ.
[Figure: an a-edge from X to Y, and an a-loop at X]

9 Example
[Figure: the FA for this example, with start state S, a middle state M, and a final state F]
S → aM
S → bS
M → aF
M → bS
F → aF
F → bF
F → Λ
Note: Not every CFG has a corresponding FA, but every FA has an equivalent CFG.
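The three steps can be sketched in Python. This is a minimal sketch under assumptions that are not part of the lecture: the FA is given as a dictionary mapping (state, letter) to the next state, the empty string stands for Λ, and the name `fa_to_cfg` is hypothetical. The transition table below is read off the productions of the example.

```python
def fa_to_cfg(transitions, start, finals):
    """Build productions {nonterminal: [right-hand side, ...]} from an FA.
    transitions maps (state, letter) -> state; "" stands for the empty word."""
    def name(state):
        # Step 1: states become nonterminals, the start state is renamed S.
        return "S" if state == start else state

    productions = {}
    # Step 2: an edge X --letter--> Y gives X -> letter Y (a loop gives X -> letter X).
    for (src, letter), dst in transitions.items():
        productions.setdefault(name(src), []).append(letter + name(dst))
    # Step 3: every final state X gets X -> Λ.
    for state in finals:
        productions.setdefault(name(state), []).append("")
    return productions

# FA of the example: start state S, final state F.
fa = {
    ("S", "a"): "M", ("S", "b"): "S",
    ("M", "a"): "F", ("M", "b"): "S",
    ("F", "a"): "F", ("F", "b"): "F",
}
cfg = fa_to_cfg(fa, start="S", finals=["F"])
for left, rhss in cfg.items():
    print(left, "->", " | ".join(r or "Λ" for r in rhss))
```

The sketch assumes no non-start state is already called S; otherwise the renaming in Step 1 would merge two states.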

10 Regular Grammar
Theorem 22: If all the productions in a given CFG fit one of the two forms
Nonterminal → semiword
or
Nonterminal → word
(where the word may be Λ or a string of terminals), then the language generated by the CFG is regular.
Proof: We show that the language is regular by constructing a TG from the given CFG.

11 Proof contd.
Let us consider a general CFG in this form:
N1 → w1 N2        N7 → w10
N1 → w2 N3        N18 → w23
N2 → w3 N…        …
where the N's are nonterminals, the w's are strings of terminals, and the parts wy Nz are semiwords. Let N1 = S. Draw a small circle for each N and one extra circle labelled +; the circle for S we label -.
[Figure: circles labelled -S, N2, N3, …, Nx, …, N13, … and +]

12 Proof contd.
For each production of the form Nx → wy Nz, draw a directed edge from state Nx to state Nz with label wy. If Nx = Nz, the edge is a loop. For every production of the form Np → wq, draw a directed edge from Np to + and label it with wq, even if wq = Λ.
Any path in the TG from - to + corresponds to a word in the language of the TG (by concatenating labels) and simultaneously corresponds to a sequence of productions in the CFG generating that word. Conversely, every derivation of a word in the CFG,
S ⇒ wN ⇒ wwN ⇒ wwwN ⇒ … ⇒ wwwww
corresponds to a path in this TG.
[Figure: an edge labelled wy from Nx to Nz, and an edge labelled wq from Np to +]

13 Example
Consider the CFG
S → aaS | bbS | Λ
The regular expression is (aa + bb)*.
[Figure: TG with loops aa and bb at the - state and a Λ-edge from - to +]
Consider the CFG
S → aaS | bbS | abX | baX | Λ
X → aaX | bbX | abS | baS
[Figure: TG with loops aa, bb at - and at X, edges ab, ba between - and X in both directions, and a Λ-edge from - to +]
Language accepted? EVEN-EVEN
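The construction used in the proof of Theorem 22 can be sketched in Python and tried on the EVEN-EVEN grammar. The encoding is an assumption made for illustration: uppercase letters are nonterminals, "" stands for Λ, and the function names `grammar_to_tg` and `tg_accepts` are hypothetical.

```python
def grammar_to_tg(productions):
    """Return TG edges (source, label, target) for a grammar whose right-hand
    sides are words or semiwords. "-" is the state for S, "+" the final state."""
    edges = []
    for left, rhss in productions.items():
        src = "-" if left == "S" else left
        for rhs in rhss:
            if rhs and rhs[-1].isupper():
                # Semiword: a string of terminals followed by one nonterminal.
                label, target = rhs[:-1], rhs[-1]
                dst = "-" if target == "S" else target
            else:
                # Word (possibly Λ): edge into the final state.
                label, dst = rhs, "+"
            edges.append((src, label, dst))
    return edges

def tg_accepts(edges, word):
    """Search for a path from "-" to "+" whose labels concatenate to word."""
    seen, stack = set(), [("-", 0)]
    while stack:
        state, pos = stack.pop()
        if (state, pos) in seen:
            continue
        seen.add((state, pos))
        if state == "+" and pos == len(word):
            return True
        for src, label, dst in edges:
            if src == state and word.startswith(label, pos):
                stack.append((dst, pos + len(label)))
    return False

even_even = {
    "S": ["aaS", "bbS", "abX", "baX", ""],
    "X": ["aaX", "bbX", "abS", "baS"],
}
tg = grammar_to_tg(even_even)
print(tg_accepts(tg, "abab"), tg_accepts(tg, "aab"))
```

A word is accepted exactly when some path from - to + spells it, which mirrors the correspondence between paths and derivations stated in the proof.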

14 Killing Λ-Productions
Λ-productions: In a given CFG, we call a nonterminal N nullable if there is a production N → Λ, or there is a derivation that starts at N and leads to Λ:
N ⇒ … ⇒ Λ
Λ-productions are undesirable. We can replace each Λ-production with appropriate non-Λ productions.

15 Theorem 23
If L is a CFL generated by a CFG having Λ-productions, then there is a different CFG that has no Λ-productions and still generates either the whole language L (if L does not include Λ) or else the language of all the words in L other than Λ.
Replacement Rule:
1. Delete all Λ-productions.
2. Add the following productions: for every production of the form X → old string, add new productions of the form X → …, where the right side accounts for every modification of the old string that can be formed by deleting all possible subsets of nullable nonterminals, except that we do not allow X → Λ to be formed, even if all the characters in the old string are nullable.

16 Example
Consider the CFG
S → a | Xb | aYa
X → Y | Λ
Y → b | X

Old nullable production    New production
X → Y                      nothing
X → Λ                      nothing
Y → X                      nothing
S → Xb                     S → b
S → aYa                    S → aa

So the new CFG is
S → a | Xb | aa | aYa | b
X → Y
Y → b | X

17 Example
Consider the CFG
S → Xa
X → aX | bX | Λ

Old nullable production    New production
S → Xa                     S → a
X → aX                     X → a
X → bX                     X → b

So the new CFG is
S → a | Xa
X → aX | bX | a | b

18 Example
S → XY
X → Zb
Y → bW
Z → AB
W → Z
A → aA | bA | Λ
B → Ba | Bb | Λ
Which nonterminals are nullable? A, B, Z and W.

19 Example contd.
The original CFG:
S → XY
X → Zb
Y → bW
Z → AB
W → Z
A → aA | bA | Λ
B → Ba | Bb | Λ

Old nullable production    New production
X → Zb                     X → b
Y → bW                     Y → b
Z → AB                     Z → A and Z → B
W → Z                      nothing new
A → aA                     A → a
A → bA                     A → b
B → Ba                     B → a
B → Bb                     B → b

So the new CFG is
S → XY
X → Zb | b
Y → bW | b
Z → AB | A | B
W → Z
A → aA | bA | a | b
B → Ba | Bb | a | b
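The replacement rule of Theorem 23 can be sketched in Python and checked against the examples above. This is a minimal sketch, assuming grammars are dictionaries of right-hand-side strings with "" for Λ; the function names are hypothetical.

```python
from itertools import product

def nullable_nonterminals(grammar):
    """Nonterminals N with N -> Λ or with a derivation from N that leads to Λ."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for left, rhss in grammar.items():
            if left in nullable:
                continue
            # all() over the empty string is True, so N -> "" (Λ) is covered.
            if any(all(ch in nullable for ch in rhs) for rhs in rhss):
                nullable.add(left)
                changed = True
    return nullable

def kill_lambda_productions(grammar):
    """Apply the replacement rule: drop Λ-productions and add every variant of
    each right-hand side obtained by deleting a subset of nullable nonterminals."""
    nullable = nullable_nonterminals(grammar)
    new_grammar = {}
    for left, rhss in grammar.items():
        out = []
        for rhs in rhss:
            # Each symbol is kept; a nullable nonterminal may also be deleted.
            choices = [(ch, "") if ch in nullable else (ch,) for ch in rhs]
            for combo in product(*choices):
                candidate = "".join(combo)
                if candidate and candidate not in out:  # never create X -> Λ
                    out.append(candidate)
        new_grammar[left] = out
    return new_grammar

g18 = {
    "S": ["XY"], "X": ["Zb"], "Y": ["bW"], "Z": ["AB"], "W": ["Z"],
    "A": ["aA", "bA", ""], "B": ["Ba", "Bb", ""],
}
print(sorted(nullable_nonterminals(g18)))
print(kill_lambda_productions(g18))
```

The order of alternatives in the output may differ from the slides; only the sets of productions are meant to agree.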

20 Killing unit productions
Definition: A production of the form
Nonterminal → one Nonterminal
is called a unit production.
The following theorem allows us to get rid of unit productions:
Theorem 24: If there is a CFG for the language L that has no Λ-productions, then there is also a CFG for L with no Λ-productions and no unit productions.

21 Proof of Theorem 24
This is another proof by constructive algorithm.
Algorithm: For every pair of nonterminals A and B, if the CFG has a unit production A → B, or if there is a chain
A ⇒ X1 ⇒ X2 ⇒ … ⇒ B
where X1, X2, ... are nonterminals, create new productions as follows: if the non-unit productions from B are
B → s1 | s2 | …
where s1, s2, ... are strings, we create the productions
A → s1 | s2 | …

22 Example
Consider the CFG
S → A | bb
A → B | b
B → S | a
The non-unit productions are
S → bb, A → b, B → a
and the unit productions are
S → A
A → B
B → S

23 Example contd.
Let's list all unit productions and their sequences and create new productions:
S → A        gives S → b
S → A → B    gives S → a
A → B        gives A → a
A → B → S    gives A → bb
B → S        gives B → bb
B → S → A    gives B → b
Eliminating all unit productions, the new CFG is
S → bb | b | a
A → b | a | bb
B → a | bb | b
This CFG generates a finite language, since there are no nonterminals in any strings produced from S.
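The algorithm from the proof of Theorem 24 can be sketched in Python and run on this example. The sketch assumes that nonterminals are single uppercase letters and that the grammar has no Λ-productions; the name `kill_unit_productions` is hypothetical.

```python
def kill_unit_productions(grammar):
    """grammar: {nonterminal: [right-hand side, ...]} with no Λ-productions.
    Returns an equivalent grammar without unit productions."""
    def is_unit(rhs):
        # A unit production has exactly one nonterminal on the right.
        return len(rhs) == 1 and rhs.isupper()

    new_grammar = {}
    for a in grammar:
        # Collect every B reachable from A by a chain of unit productions.
        chain, stack = {a}, [a]
        while stack:
            n = stack.pop()
            for rhs in grammar[n]:
                if is_unit(rhs) and rhs not in chain:
                    chain.add(rhs)
                    stack.append(rhs)
        # A inherits the non-unit productions of every B on such a chain.
        rhss = {rhs for b in chain for rhs in grammar[b] if not is_unit(rhs)}
        new_grammar[a] = sorted(rhss)
    return new_grammar

g22 = {"S": ["A", "bb"], "A": ["B", "b"], "B": ["S", "a"]}
no_units = kill_unit_productions(g22)
for left, rhss in no_units.items():
    print(left, "->", " | ".join(rhss))
```

Right-hand sides are sorted only to make the output deterministic; the order of alternatives carries no meaning.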

24 Useless Symbols
Let G be a CFG. A symbol X ∈ (V ∪ Σ) is useful if there is a derivation
S ⇒* uXv ⇒* w
where u, v ∈ (V ∪ Σ)* and w ∈ Σ*. A symbol that is not useful is useless.
A terminal is useful if it occurs in a string of the language of G. A variable is useful if it occurs in a derivation that begins from S and generates a terminal string.
For a variable to be useful, two conditions must be satisfied:
1. The variable must occur in a sentential form of the grammar.
2. There must be a derivation of a terminal string from the variable.
A variable that occurs in a sentential form is said to be reachable from S. A two-part procedure is presented to eliminate useless symbols.

25 Algorithm to remove useless symbols
PART I: Identify the variables that derive terminal strings, and remove the nonterminals that do not derive terminal strings.
For example, consider the following grammar G:
S → aS | A | C
A → a
B → aa
C → CB

26 Example
S → aS | A | C
A → a
B → aa
C → CB
We can identify the variables that derive terminal strings: A → a, B → aa, and S ⇒ A ⇒ a. So TERM = {S, A, B}, but C is not in TERM. Thus C is useless.

27 Example, PART II
Call the resulting grammar GT:
S → aS | A
A → a
B → aa
Now draw a graph and delete the nodes not reachable from S. As B is unreachable, delete it; the final grammar GV is
S → aS | A
A → a
[Figure: dependency graph of GT, in which B is not reachable from S]

28 Useless Productions
Some derivations never terminate...
[Figure: example grammar with one production marked as a useless production]

29 Another grammar
[Figure: example grammar with a production that is not reachable from S, marked as a useless production]

30 In general: if S ⇒* xAy ⇒* w, where w contains only terminals, then variable A is useful; otherwise, variable A is useless.

31 A production is useless if any of its variables is useless.
[Figure: example grammar with its useless variables and useless productions marked]

32 Removing Useless Productions
Example grammar:
[Figure: the example grammar]

33 First: find all variables that can produce strings with only terminals.
[Figure: the variables found in Round 1 and Round 2]

34 Keep only the variables that produce terminal symbols (the rest of the variables are useless) and remove the useless productions.

35 Second: find all variables reachable from S, using a dependency graph.
[Figure: dependency graph with one variable marked as not reachable]

36 Keep only the variables reachable from S (the rest of the variables are useless) and remove the useless productions. This gives the final grammar.

37 Set of variables that derive terminal strings
Input: CFG (V, Σ, P, S)
TERM = { A | there is a rule A → w ∈ P with w ∈ Σ* }
repeat
    PREV = TERM
    for each variable A ∈ V do
        if there is a rule A → w with w ∈ (PREV ∪ Σ)* then TERM = TERM ∪ {A}
until PREV = TERM

38 Example
Consider the following CFG G:
S → AC | BS | B
A → aA | aF
B → CF | b
C → cC | D
D → aD | BD | C
E → aA | BSA
F → bB | b

39 Example contd.
S → AC | BS | B
A → aA | aF
B → CF | b
C → cC | D
D → aD | BD | C
E → aA | BSA
F → bB | b

Iteration    TERM               PREV
0            {B, F}             {}
1            {B, F, A, S}       {B, F}
2            {B, F, A, S, E}    {B, F, A, S}
3            {B, F, A, S, E}    {B, F, A, S, E}

The new grammar GT built from TERM is
S → BS | B
A → aA | aF
B → b
E → aA | BSA
F → bB | b
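The TERM computation can be sketched in Python and checked against the table. This is a minimal sketch, assuming that variables are single uppercase letters and terminals are lowercase; the names `terminal_deriving_variables` and `restrict_to` are hypothetical.

```python
def terminal_deriving_variables(grammar):
    """Compute TERM: the variables that derive some string of terminals."""
    # Start with the variables that have a rule A -> w with w made of terminals only.
    term = {a for a, rhss in grammar.items()
            if any(not any(ch.isupper() for ch in rhs) for rhs in rhss)}
    while True:
        prev = set(term)
        for a, rhss in grammar.items():
            # A joins TERM if some rule A -> w has w in (PREV ∪ Σ)*.
            if any(all(ch in prev or not ch.isupper() for ch in rhs) for rhs in rhss):
                term.add(a)
        if prev == term:
            return term

def restrict_to(grammar, keep):
    """Drop every variable outside keep and every rule that mentions one."""
    return {a: [rhs for rhs in rhss
                if all(ch in keep or not ch.isupper() for ch in rhs)]
            for a, rhss in grammar.items() if a in keep}

g38 = {
    "S": ["AC", "BS", "B"], "A": ["aA", "aF"], "B": ["CF", "b"],
    "C": ["cC", "D"], "D": ["aD", "BD", "C"], "E": ["aA", "BSA"],
    "F": ["bB", "b"],
}
term = terminal_deriving_variables(g38)
g_t = restrict_to(g38, term)
print(sorted(term))
print(g_t)
```

Each pass tests rules against PREV rather than the growing TERM, as in the pseudocode, which is why E enters only in the second iteration.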

40 Construction of the set of reachable variables
Input: CFG (V, Σ, P, S)
REACH = {S}
PREV = ∅
repeat
    NEW = REACH - PREV
    PREV = REACH
    for each variable A in NEW do
        for each rule A → w do add all variables in w to REACH
until REACH = PREV

41 Example
GT:
S → BS | B
A → aA | aF
B → b
E → aA | BSA
F → bB | b

Iteration    REACH     PREV      NEW
0            {S}       {}
1            {S, B}    {S}       {S}
2            {S, B}    {S, B}    {B}

The loop stops with REACH = PREV = {S, B}.
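The REACH construction can be sketched in Python and applied to GT. The sketch assumes that variables are single uppercase letters; the name `reachable_variables` is hypothetical.

```python
def reachable_variables(grammar, start="S"):
    """Compute REACH: the variables that occur in some sentential form."""
    reach, prev = {start}, set()
    while reach != prev:
        new = reach - prev          # variables discovered in the last round
        prev = set(reach)
        for a in new:
            for rhs in grammar.get(a, []):
                # Add all variables that occur on the right-hand side.
                reach.update(ch for ch in rhs if ch.isupper())
    return reach

g_t = {
    "S": ["BS", "B"], "A": ["aA", "aF"], "B": ["b"],
    "E": ["aA", "BSA"], "F": ["bB", "b"],
}
reach = reachable_variables(g_t)
final_grammar = {a: rhss for a, rhss in g_t.items() if a in reach}
print(sorted(reach))
print(final_grammar)
```

Keeping only the reachable variables leaves S → BS | B and B → b, since A, E and F never occur in a string derived from S.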

42 Removing All
Step 1: Remove nullable variables
Step 2: Remove unit productions
Step 3: Remove useless variables

