Lecture 16 Oct 18 Context-Free Languages (CFL) - basic definitions Examples.


1 Lecture 16 Oct 18 Context-Free Languages (CFL) - basic definitions Examples

2 Context-Free Languages (Ch. 2) Context-free languages allow us to describe non-regular languages like { 0^n 1^n | n ≥ 0 }. General idea: CFLs are languages that can be recognized by automata that have one stack: { 0^n 1^n | n ≥ 0 } is a CFL; { 0^n 1^n 0^n | n ≥ 0 } is not a CFL.

3 Context-Free Grammars Start symbol S with rewrite rules: 1) S → 0S1 2) S → ε. S yields 0^n 1^n: S ⇒ 0S1 ⇒ 00S11 ⇒ … ⇒ 0^n S 1^n ⇒ 0^n 1^n
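The derivation on this slide can be replayed mechanically. The following is a minimal sketch, assuming the sentential form is stored as a plain string in which the character S is the only variable; the function name `derive` is made up for this illustration.

```python
def derive(n):
    """Return the sentential forms of the derivation S => ... => 0^n 1^n."""
    form = "S"
    steps = [form]
    for _ in range(n):
        form = form.replace("S", "0S1")  # rule 1: S -> 0S1
        steps.append(form)
    form = form.replace("S", "")         # rule 2: S -> ε
    steps.append(form)
    return steps

print(" => ".join(derive(3)))
```

Rule 1 is applied n times and rule 2 once, so the derivation of 0^n 1^n always has n + 1 steps.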

4 Context-Free Grammars (Def.) A context-free grammar G = (V, Σ, R, S) is defined by: V, a finite set of variables (non-terminals); Σ, a finite set of terminals (with V ∩ Σ = ∅); R, a finite set of substitution rules V → (V ∪ Σ)*; S, the start symbol, with S ∈ V. The language of grammar G is denoted by L(G): L(G) = { w ∈ Σ* | S ⇒* w }

5 Derivation ⇒* A single-step derivation “⇒” consists of the substitution of a variable by a string according to a substitution rule. Example: with the rule “A → BB”, we can have the derivation “01AB0 ⇒ 01BBB0”. A sequence of several derivations (or none) is indicated by “⇒*”. Same example: “0AA ⇒* 0BBBB”
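The two relations can be sketched in a few lines of Python. This is an illustration only: it assumes single-character symbols, a dictionary mapping each variable to its rule bodies, and a step bound for the multi-step search; `one_step` and `derives` are names invented here.

```python
def one_step(form, rules):
    """All sentential forms reachable from `form` by one rule application."""
    result = set()
    for i, symbol in enumerate(form):
        for body in rules.get(symbol, []):
            result.add(form[:i] + body + form[i + 1:])
    return result

def derives(source, target, rules, max_steps=10):
    """Bounded check for source =>* target (zero or more steps)."""
    frontier = {source}
    for _ in range(max_steps + 1):
        if target in frontier:
            return True
        frontier = {nxt for form in frontier for nxt in one_step(form, rules)}
    return False

rules = {"A": ["BB"]}                   # the single rule A -> BB
print(one_step("01AB0", rules))         # the single-step example from the slide
print(derives("0AA", "0BBBB", rules))   # the multi-step example from the slide
```

The step bound keeps the search finite; for grammars with many rules the frontier can grow quickly, so this is only suitable for small examples.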

6 Some Remarks The language L(G) = { w ∈ Σ* | S ⇒* w } contains only strings of terminals, not variables. Notation: we summarize several rules, like A → B, A → 01, A → AA, by A → B | 01 | AA. Unless stated otherwise, the topmost rule has the start variable on the left side.

7 Context-Free Grammars (Ex.) Consider the CFG G = (V, Σ, R, S) with V = {S, Z}, Σ = {0,1}, R: S → 0S1 | 0Z1, Z → 0Z | ε. Then L(G) = { 0^i 1^j | i ≥ j and j > 0 }. S yields 0^(j+k) 1^j according to: S ⇒ 0S1 ⇒ … ⇒ 0^(j-1) S 1^(j-1) ⇒ 0^j Z 1^j ⇒ 0^j 0Z 1^j ⇒ … ⇒ 0^(j+k) Z 1^j ⇒ 0^(j+k) ε 1^j = 0^(j+k) 1^j
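The claimed description of L(G) can be checked by brute force for short strings. The sketch below enumerates leftmost derivations and prunes any form that already has too many terminals. It assumes single-character symbols and relies on every rule except Z → ε adding at least one terminal, so the search is finite for this grammar; it is not a general-purpose enumerator.

```python
from collections import deque

RULES = {"S": ["0S1", "0Z1"], "Z": ["0Z", ""]}  # "" encodes ε

def language_up_to(rules, start, max_len):
    """Terminal strings of length <= max_len derivable from `start`."""
    seen, words = {start}, set()
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # position of the leftmost variable, or None if the form is all terminals
        idx = next((i for i, c in enumerate(form) if c in rules), None)
        if idx is None:
            words.add(form)
            continue
        for body in rules[form[idx]]:
            nxt = form[:idx] + body + form[idx + 1:]
            terminals = sum(1 for c in nxt if c not in rules)
            if terminals <= max_len and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return words

expected = {"0" * i + "1" * j
            for j in range(1, 7) for i in range(j, 7) if i + j <= 6}
print(language_up_to(RULES, "S", 6) == expected)
```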

8 Importance of CFL Model for natural languages (Chomsky) Specification of programming languages: “parsing of a computer program” parser for HTML (and some special cases of SGML) Describes mathematical structures. Intermediate between regular languages and other language families of Chomsky hierarchy

9 Set of boolean expressions is a CFL Consider the CFG G = (V, Σ, R, S) with V = {S}, Σ = {0, 1, (, ), ∨, ∧, ¬}, R: S → 0 | 1 | ¬(S) | (S)∨(S) | (S)∧(S). Some elements of L(G): 0; ¬((¬(0))∨(1)); (1)∧((0)∨(0)). Note: Parentheses prevent “1∨0∧0” confusion. This language requires full parenthesizing.

10 A very small subset of English Rules: <sentence> → … ; <article> → a | the; <noun> → boy | girl | house; <verb> → sees | ignores. A string that can be generated by this grammar: the boy sees the girl

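11 Parse Trees The parse tree of (0)∨((0)∧(1)) via the rule S → 0 | 1 | ¬(S) | (S)∨(S) | (S)∧(S): [Parse tree: the root S has the children “(”, S, “)”, “∨”, “(”, S, “)”. The first S child yields 0. The second S child has the children “(”, S, “)”, “∧”, “(”, S, “)”, whose S children yield 0 and 1.]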

12 Ambiguity A grammar is ambiguous if some string is derived ambiguously. A string is derived ambiguously if it has more than one leftmost derivation or, equivalently, more than one parse tree. Typical example: rule S → 0 | 1 | S+S | S×S. S ⇒ S+S ⇒ S×S+S ⇒ 0×S+S ⇒ 0×1+S ⇒ 0×1+1 versus S ⇒ S×S ⇒ 0×S ⇒ 0×S+S ⇒ 0×1+S ⇒ 0×1+1

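13 Ambiguity and Parse Trees The ambiguity of 0×1+1 is shown by the two different parse trees: [Tree 1: root S → S + S; the left S → S × S yields 0×1 and the right S yields 1.] [Tree 2: root S → S × S; the left S yields 0 and the right S → S + S yields 1+1.]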
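One way to see the ambiguity concretely is to count parse trees. The sketch below does this for the grammar S → 0 | 1 | S+S | S×S, writing `*` in place of the multiplication sign so the input stays ASCII. It chooses the top-level operator position and multiplies the counts of the two sides; the function name is invented for this illustration.

```python
from functools import lru_cache

def count_parse_trees(s):
    """Number of parse trees of s under S -> 0 | 1 | S+S | S*S."""
    @lru_cache(maxsize=None)
    def count(i, j):
        # number of trees with root S that derive the substring s[i:j]
        if j - i == 1:
            return 1 if s[i] in "01" else 0
        total = 0
        for k in range(i + 1, j - 1):  # candidate position of the root operator
            if s[k] in "+*":
                total += count(i, k) * count(k + 1, j)
        return total
    return count(0, len(s))

print(count_parse_trees("0*1+1"))  # ambiguous string from the slide
print(count_parse_trees("0+1"))    # only one parse tree
```

A count above 1 means the string is derived ambiguously; a count of 0 means the string is not in the language at all.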

14 More on Ambiguity The two different derivations S ⇒ S+S ⇒ 0+S ⇒ 0+1 and S ⇒ S+S ⇒ S+1 ⇒ 0+1 do not make 0+1 an ambiguous string (they have the same parse tree). Languages that can only be generated by ambiguous grammars are “inherently ambiguous”.

15 Context-Free Languages Any language that can be generated by a context-free grammar is a context-free language (CFL). The CFL { 0^n 1^n | n ≥ 0 } shows us that certain CFLs are nonregular languages. Q1: Are all regular languages context-free? Q2: Which languages are outside the class CFL?

16 Example. A context-free grammar for the set of strings over {0,1} with an equal number of 0’s and 1’s. We need a grammar for the language L = { w | w has an equal number of 0’s and 1’s }. Consider the grammar G: S → 0S1 | 1S0 | ε. This does not cover all the strings. Exhibit a string in L that is not generated by G.

17 We need to add the rule S → SS. The complete grammar is: S → 0S1 | 1S0 | ε | SS. How do we get a string like 011010? S ⇒ SS ⇒ SSS ⇒ 0S1SS ⇒ 01SS ⇒ 011S0S ⇒ 0110S ⇒ 01101S0 ⇒ 011010. How do we show that L = L(G)? We need the following Lemma: Let x be a non-empty string in L. Then at least one of the following holds: (a) x = 0y1 for some y in L, (b) x = 1y0 for some y in L, or (c) x = yz for some y and z (both of them non-null) such that both y and z are in L. (The cases can overlap: 0101 satisfies both (a) and (c).)
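Before proving L = L(G), the claim can be sanity-checked for short strings. The sketch below decides membership by following the three cases of the lemma recursively. It assumes that splits into two non-empty parts are enough to account for the rule S → SS, since a part that derives ε can be dropped. The `allow_concat` flag is a hypothetical switch added here to compare the grammar with and without S → SS.

```python
from functools import lru_cache
from itertools import product

@lru_cache(maxsize=None)
def generated(x, allow_concat=True):
    """True if x is derivable from S -> 0S1 | 1S0 | ε (| SS when allow_concat)."""
    if x == "":
        return True                                   # S -> ε
    if len(x) >= 2 and x[0] != x[-1] and generated(x[1:-1], allow_concat):
        return True                                   # S -> 0S1 or S -> 1S0
    if allow_concat:                                  # S -> SS, both parts non-empty
        return any(generated(x[:k], True) and generated(x[k:], True)
                   for k in range(1, len(x)))
    return False

words = ["".join(w) for n in range(9) for w in product("01", repeat=n)]
print(all(generated(w) == (w.count("0") == w.count("1")) for w in words))
print(generated("0110", False), generated("0110", True))
```

The second line of output illustrates the exercise from the previous slide: 0110 has equally many 0’s and 1’s but is only generated once S → SS is added.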

18 Chomsky Normal Form A context-free grammar G = (V, Σ, R, S) is in Chomsky normal form if every rule is of the form A → BC or A → x, with variables A ∈ V and B, C ∈ V \ {S}, and x ∈ Σ. For the start variable S we also allow the rule S → ε. Advantage: Grammars in this form are far easier to analyze.

19 Theorem 2.6 Every context-free language can be described by a grammar in Chomsky normal form. Outline of Proof: We rewrite every CFG into Chomsky normal form (CNF). We do this by replacing, one by one, every rule that is not in CNF. We have to take care of: the start symbol, ε-rules, and all other violating rules.

20 Proof Theorem 2.6 Given a context-free grammar G = (V, Σ, R, S), rewrite it to Chomsky Normal Form by
1) New start symbol S_0 (and add rule S_0 → S)
2) Remove A → ε rules (from the tail): before: B → xAy and A → ε, after: B → xAy | xy
3) Remove unit rules A → B (by the head): “A → B” and “B → xCy” becomes “A → xCy” and “B → xCy”
4) Shorten all rules to two: before: “A → B_1 B_2 … B_k”, after: A → B_1 A_1, A_1 → B_2 A_2, …, A_(k-2) → B_(k-1) B_k
5) Replace ill-placed terminals “a” by T_a with T_a → a
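Step 4 is the most mechanical of the five. A minimal sketch of it follows, assuming a rule is given as a head name plus a list of at least two body symbols. The fresh variable names `A_1`, `A_2`, … are generated from the head and are assumed not to clash with existing variables.

```python
def binarize(head, body):
    """Split head -> B1 B2 ... Bk (k >= 2) into a chain of two-symbol rules."""
    rules = []
    current = head
    for i, symbol in enumerate(body[:-2], start=1):
        fresh = f"{head}_{i}"            # fresh helper variable A_i
        rules.append((current, (symbol, fresh)))
        current = fresh
    rules.append((current, tuple(body[-2:])))  # last rule keeps the final two symbols
    return rules

for rule_head, rule_body in binarize("A", ["B1", "B2", "B3", "B4"]):
    print(rule_head, "->", " ".join(rule_body))
```

For a body of length k this produces k - 1 rules and k - 2 fresh variables, matching the pattern on the slide.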

21 Careful Removing of Rules Do not reintroduce rules that you removed earlier. Example: A → A simply disappears. When removing A → ε rules, insert all new replacements: B → AaA becomes B → AaA | aA | Aa | a
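The "insert all new replacements" step amounts to keeping or dropping each occurrence of the nullable variable independently. A small sketch, assuming single-character symbols and a set of nullable variables supplied by the caller (the helper name is invented here):

```python
from itertools import product

def expand_nullable(body, nullable):
    """All bodies obtained by keeping or dropping each nullable occurrence."""
    choices = [(sym, "") if sym in nullable else (sym,) for sym in body]
    return {"".join(pick) for pick in product(*choices)}

print(sorted(expand_nullable("AaA", {"A"})))
```

If every symbol of a body is nullable, the result also contains the empty body, which is a new ε-rule that has to be removed in turn (unless it was already removed earlier, in which case it must not be added back).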

22 Example of CNF Initial grammar: S → aSb | ε. In Chomsky normal form: S_0 → ε | T_a T_b | T_a X; X → S T_b; S → T_a T_b | T_a X; T_a → a; T_b → b
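One reason CNF grammars are easy to analyze is that membership can be decided with the CYK dynamic-programming algorithm, which the slides do not cover. The sketch below hard-codes the CNF grammar from this slide (start variable S_0 written as `S0`, helper variables as `Ta`, `Tb`, `X`) and is meant as an illustration rather than a general parser.

```python
BINARY = {  # rules of the form A -> B C
    "S0": [("Ta", "Tb"), ("Ta", "X")],
    "S": [("Ta", "Tb"), ("Ta", "X")],
    "X": [("S", "Tb")],
}
TERMINAL = {"Ta": "a", "Tb": "b"}    # rules of the form A -> x
START, START_NULLABLE = "S0", True   # the grammar contains S0 -> ε

def cyk(word):
    """Decide whether the CNF grammar above generates `word`."""
    n = len(word)
    if n == 0:
        return START_NULLABLE
    # table[i][l] holds the variables that derive word[i:i+l]
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][1] = {v for v, t in TERMINAL.items() if t == ch}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split]
                right = table[i + split][length - split]
                for head, bodies in BINARY.items():
                    if any(b in left and c in right for b, c in bodies):
                        table[i][length].add(head)
    return START in table[0][n]

for w in ["", "ab", "aabb", "aab", "abab"]:
    print(repr(w), cyk(w))
```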

23 RL ⊆ CFL Every regular language can be expressed by a context-free grammar. Proof Idea: Given a DFA M = (Q, Σ, δ, q_0, F), we construct a corresponding CF grammar G_M = (V, Σ, R, S) with V = Q and S = q_0. Rules of G_M: q_i → x δ(q_i, x) for all q_i ∈ V and all x ∈ Σ; q_i → ε for all q_i ∈ F

24 Example RL ⊆ CFL [Figure: DFA with states q_1, q_2, q_3 over the alphabet {0,1}; start state q_1, accepting state q_2.] The DFA leads to the context-free grammar G_M = (Q, Σ, R, q_1) with the rules q_1 → 0q_1 | 1q_2; q_2 → 0q_3 | 1q_2 | ε; q_3 → 0q_2 | 1q_2
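The construction can be sketched directly. The transition table below is read off the grammar rules on this slide (the DFA picture itself did not survive), with q_2 taken as the only accepting state because it is the only variable with an ε-rule. The function names and the tuple encoding of rule bodies are choices made for this illustration.

```python
from itertools import product

DELTA = {
    ("q1", "0"): "q1", ("q1", "1"): "q2",
    ("q2", "0"): "q3", ("q2", "1"): "q2",
    ("q3", "0"): "q2", ("q3", "1"): "q2",
}
ACCEPTING = {"q2"}

def dfa_to_cfg(delta, accepting):
    """Rules q -> x δ(q, x) for every transition, plus q -> ε for accepting q."""
    rules = {}
    for (state, symbol), target in delta.items():
        rules.setdefault(state, []).append((symbol, target))
    for state in accepting:
        rules.setdefault(state, []).append(())  # empty tuple encodes ε
    return rules

def dfa_accepts(delta, accepting, start, word):
    state = start
    for ch in word:
        state = delta[(state, ch)]
    return state in accepting

def grammar_generates(rules, start, word):
    """Membership test specialised to rules of the shape q -> x q' and q -> ε."""
    current = {start}  # variables that can follow the terminal prefix read so far
    for ch in word:
        current = {body[1] for q in current for body in rules.get(q, [])
                   if body and body[0] == ch}
    return any(() in rules.get(q, []) for q in current)

RULES = dfa_to_cfg(DELTA, ACCEPTING)
words = ["".join(w) for n in range(7) for w in product("01", repeat=n)]
print(all(dfa_accepts(DELTA, ACCEPTING, "q1", w) == grammar_generates(RULES, "q1", w)
          for w in words))
```

Every sentential form of G_M is a terminal prefix followed by at most one variable, and that variable is the state the DFA is in after reading the prefix, which is why the two tests agree.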

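25 Picture Thus Far [Figure: the regular languages drawn inside the context-free languages, with { 0^n 1^n } in the context-free region outside the regular languages; the area beyond the context-free languages is marked “??”.]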

