CS Master – Introduction to the Theory of Computation
Jan Maluszynski - HT
Lecture 4: Context-free grammars
Jan Maluszynski, IDA, ida.liu.se
Outline (Sipser 2.1 – 2.3)
  1. Motivating example
  2. CFG definition
  3. Parse trees, ambiguity
  4. Pushdown automata
  5. Equivalence CFG – PDA
  6. Pumping lemma for CFG
Motivating example
  Grammatical rules used for rewriting:
    A → (A)
    A → ε
  Example derivations:
    A ⇒ (A) ⇒ ()
    A ⇒ (A) ⇒ ((A)) ⇒ (((A))) ⇒ ((()))
  L(A): all terminal strings derivable from A, i.e. n opening parentheses followed by n closing parentheses, n ≥ 0.
  L(A) is not regular!
Context-Free Grammars
  A CFG is a tuple (V, Σ, R, S) where
    V is a finite set of nonterminals (variables)
    Σ is a finite set of terminals, disjoint from V
    R is a finite set of rules of the form X → w, where X ∈ V and w ∈ (V ∪ Σ)*
    S ∈ V is the start nonterminal
Context-Free Languages
  Let G = (V, Σ, R, S) be a CFG.
  L(G) denotes the language of G: the set of all terminal strings derivable in G from S.
  A language is a context-free language iff it is the language of some CFG.
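To make the definition of L(G) concrete, here is a minimal Python sketch that enumerates the short strings of L(G) by breadth-first leftmost derivation. The encoding is an assumption made for illustration: nonterminals are the keys of a dictionary, rule bodies are strings, and the empty string stands for ε. The function name language_up_to is hypothetical. The search is only guaranteed to stop for grammars in which sentential forms cannot grow without adding terminals, which holds for the parenthesis grammar of the motivating example.

```python
from collections import deque

# Assumed encoding: keys are nonterminals, bodies are strings, "" stands for ε.
RULES = {"A": ["(A)", ""]}
START = "A"


def language_up_to(rules, start, max_len):
    """Return all terminal strings of length <= max_len derivable from start."""
    seen = {start}
    queue = deque([start])
    result = set()
    while queue:
        form = queue.popleft()
        # Position of the leftmost nonterminal in the sentential form, if any.
        pos = next((k for k, sym in enumerate(form) if sym in rules), None)
        if pos is None:
            result.add(form)  # no nonterminal left: a terminal string
            continue
        for body in rules[form[pos]]:
            new_form = form[:pos] + body + form[pos + 1:]
            # Prune forms that already contain too many terminals.
            terminals = sum(1 for sym in new_form if sym not in rules)
            if terminals <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return result


short_strings = language_up_to(RULES, START, 4)
```

For the grammar A → (A) | ε and max_len = 4, the result contains exactly the empty string, "()" and "(())".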
Ambiguity
  E → E+E
  E → E#E
  E → (E)
  E → a
  The grammar is ambiguous: for example, a+a#a has two different parse trees, one with + at the root and one with # at the root.
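Ambiguity can be checked for a given string by counting its parse trees. The following is a small sketch that is hard-wired to the four rules above (it is not a general CFG tool); the function name count_parse_trees is hypothetical. A parse tree is determined by the rule used at the root and, for E → E+E and E → E#E, by the position of the root operator, so the counts for the two sides multiply.

```python
from functools import lru_cache


def count_parse_trees(s):
    """Number of parse trees for s under E -> E+E | E#E | (E) | a."""

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of parse trees deriving the substring s[i:j] from E.
        if j - i <= 0:
            return 0
        total = 0
        if j - i == 1 and s[i] == "a":
            total += 1                      # rule E -> a
        if j - i >= 3 and s[i] == "(" and s[j - 1] == ")":
            total += count(i + 1, j - 1)    # rule E -> (E)
        for k in range(i + 1, j - 1):
            if s[k] in "+#":                # rules E -> E+E and E -> E#E
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(s))


trees = count_parse_trees("a+a#a")
```

A count greater than 1 for some string shows that the grammar is ambiguous; a count of 0 means the string is not in the language.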
Chomsky Normal Form
  G is in CNF if every rule is of the form
    A → BC  or  A → a
  In addition, there may be the rule S → ε, where S is the start nonterminal.
  Every CFL is generated by a CNF grammar.
  An application: the CYK parsing algorithm.
Cocke-Younger-Kasami parsing
  G: a CNF grammar
  w: a string, w = s_1 s_2 … s_n
  CYK checks whether w is in L(G) by constructing a table T[i,j], 1 ≤ i, j ≤ |w|, i + j − 1 ≤ |w|, where
    T[i,j] = { X | X ⇒* s_i s_{i+1} … s_{i+j−1} }
  That is, T[i,j] holds the nonterminals deriving the substring of length j that starts at position i.
  Thus w is in L(G) iff S is in T[1,|w|].
CYK table
  Base: construct T[i,1] for i = 1, …, |w|:
    X is in T[i,1] iff X → s_i is a rule.
  Step: having T[i,j] and T[i+j,k], construct T[i,j+k]:
    Y is in T[i,j+k] iff Y → X Z is a rule for some X in T[i,j] and Z in T[i+j,k], for some split of the length j+k into j ≥ 1 and k ≥ 1.
CYK (transposed) table example
  Grammar:
    S → AB | BC
    A → BA | a
    B → CC | b
    C → AB | a
  Input: w = b a a b a. The entry in column i and row "length j" is T[i,j]; "-" marks an empty set.
    i =        1       2       3       4       5
    w =        b       a       a       b       a
    length 1   B       A,C     A,C     B       A,C
    length 2   S,A     B       S,C     S,A
    length 3   -       B       B
    length 4   -       S,A,C
    length 5   S,A,C
  E.g. B ∈ T[2,3] since B → CC, C ∈ T[2,1] and C ∈ T[2+1,2].
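The table construction can be sketched in Python as follows. The representation is an assumption for illustration: rules A → a are given as pairs, rules A → BC as triples, and the table is a dictionary indexed by (start position, length) with positions starting at 1, as on the slides. The names cyk_table, UNIT_RULES and PAIR_RULES are hypothetical. The sketch handles non-empty words only; the optional rule S → ε would need a separate check.

```python
def cyk_table(word, unit_rules, pair_rules):
    """Build T[i, j]: nonterminals deriving the substring of length j starting at i."""
    n = len(word)
    table = {}
    # Base case: substrings of length 1.
    for i in range(1, n + 1):
        table[i, 1] = {x for x, a in unit_rules if a == word[i - 1]}
    # Longer substrings, built from all splits into two shorter parts.
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            cell = set()
            for j in range(1, length):
                k = length - j
                for y, x, z in pair_rules:
                    if x in table[i, j] and z in table[i + j, k]:
                        cell.add(y)
            table[i, length] = cell
    return table


# The example grammar: S -> AB | BC, A -> BA | a, B -> CC | b, C -> AB | a.
UNIT_RULES = [("A", "a"), ("B", "b"), ("C", "a")]
PAIR_RULES = [
    ("S", "A", "B"), ("S", "B", "C"),
    ("A", "B", "A"),
    ("B", "C", "C"),
    ("C", "A", "B"),
]

T = cyk_table("baaba", UNIT_RULES, PAIR_RULES)
accepted = "S" in T[1, 5]
```

For w = baaba the entry T[1, 5] contains S, so the word is in L(G). The three nested loops over length, start position and split point give the usual cubic running time in |w|.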
Pushdown Automaton
  A finite automaton (FA) extended with a stack.
  The top stack symbol is an argument of the transition: it can be accessed and replaced by a new symbol.
(Nondeterministic) Pushdown Automaton
  (Q, Σ, Γ, δ, q_0, F)
    Q: states
    Σ: input alphabet
    Γ: stack alphabet
    δ: Q × (Σ ∪ {ε}) × (Γ ∪ {ε}) → P(Q × (Γ ∪ {ε}))
    q_0 ∈ Q: initial state
    F ⊆ Q: final states
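A nondeterministic PDA of this form can be simulated by searching over configurations (state, input position, stack). The sketch below is one possible encoding, not a canonical one: δ is a dictionary from (state, input symbol or "", stack top or "") to a set of (state, pushed symbol or "") pairs, with "" standing for ε, and acceptance is by final state after the whole input is read. The names pda_accepts, DELTA and the states q1 to q4 are assumptions. The search terminates only if the automaton has no ε-cycle that grows the stack forever, which holds for the example automaton for the parenthesis language of the motivating example.

```python
from collections import deque


def pda_accepts(delta, start, finals, word):
    """Breadth-first search over configurations (state, position, stack)."""
    start_config = (start, 0, ())          # the stack top is the last tuple element
    seen = {start_config}
    queue = deque([start_config])
    while queue:
        state, pos, stack = queue.popleft()
        if pos == len(word) and state in finals:
            return True
        # A move may read ε or the next input symbol, and pop ε or the stack top.
        reads = [""] + ([word[pos]] if pos < len(word) else [])
        pops = [""] + ([stack[-1]] if stack else [])
        for a in reads:
            for top in pops:
                for new_state, push in delta.get((state, a, top), ()):
                    rest = stack[:-1] if top else stack
                    new_stack = rest + ((push,) if push else ())
                    config = (new_state, pos + len(a), new_stack)
                    if config not in seen:
                        seen.add(config)
                        queue.append(config)
    return False


# Example PDA: n opening parentheses followed by n closing ones; "$" marks the stack bottom.
DELTA = {
    ("q1", "", ""): {("q2", "$")},
    ("q2", "(", ""): {("q2", "(")},
    ("q2", ")", "("): {("q3", "")},
    ("q3", ")", "("): {("q3", "")},
    ("q3", "", "$"): {("q4", "")},
}
FINALS = {"q1", "q4"}

ok = pda_accepts(DELTA, "q1", FINALS, "((()))")
```

The automaton pushes one symbol per opening parenthesis, pops one per closing parenthesis, and can reach the final state q4 only when the bottom marker is on top again.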
PDA vs. CFGs
  Theorem: A language is context-free iff it is recognized by a pushdown automaton. (p.117)
  CFG → PDA:
    Idea: simulate derivations on the stack (see example).
  PDA → CFG:
    Idea: nonterminals X_pq derive all strings that bring the PDA from p to q, starting and ending with an empty stack (details see pp ).
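The CFG → PDA idea can be illustrated without building the automaton explicitly: keep the not yet matched part of a sentential form on a stack, replace a nonterminal on top by one of its rule bodies, and match a terminal on top against the next input symbol. The sketch below follows that idea under assumptions of its own: the grammar encoding (dictionary of string bodies, "" for ε) and the name accepts_by_stack_simulation are hypothetical, and the pruning step is a device to keep the search finite, not part of the textbook construction. It is only guaranteed to terminate when stacks cannot grow without adding terminals.

```python
def accepts_by_stack_simulation(rules, start, word):
    """Nondeterministic top-down simulation of derivations on a stack."""
    start_config = (0, (start,))      # (input position, stack with top at index 0)
    seen = {start_config}
    todo = [start_config]
    while todo:
        pos, stack = todo.pop()
        if not stack:
            if pos == len(word):
                return True           # empty stack and all input consumed
            continue
        top, rest = stack[0], stack[1:]
        if top in rules:
            # Nonterminal on top: replace it by each rule body in turn.
            successors = [(pos, tuple(body) + rest) for body in rules[top]]
        elif pos < len(word) and word[pos] == top:
            # Terminal on top: match it against the next input symbol.
            successors = [(pos + 1, rest)]
        else:
            successors = []
        for config in successors:
            # Prune stacks holding more terminals than there is input left.
            terminals = sum(1 for sym in config[1] if sym not in rules)
            if terminals <= len(word) - config[0] and config not in seen:
                seen.add(config)
                todo.append(config)
    return False


PAREN_RULES = {"A": ["(A)", ""]}
EXPR_RULES = {"E": ["E+E", "E#E", "(E)", "a"]}

in_paren_language = accepts_by_stack_simulation(PAREN_RULES, "A", "(())")
in_expr_language = accepts_by_stack_simulation(EXPR_RULES, "E", "a+a#a")
```

Each configuration of the search corresponds to a configuration of the PDA from the construction: the choice among rule bodies is the nondeterminism of the automaton.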
Regular Languages are Context-free
  Every regular language is context-free: an FA is a special case of a PDA.
  Regular grammar: each rule is of the form
    A → aB,  A → a  or  A → ε
  A CF language L is regular iff there exists a regular CFG G such that L = L(G).
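A regular grammar can be run directly like a nondeterministic finite automaton: the nonterminals act as states, a rule A → aB is a transition from A to B on a, and rules A → a and A → ε mark acceptance. The following sketch assumes single-character terminals and nonterminals and rule bodies written as strings ("" for ε); the name regular_grammar_accepts and the example grammar are made up for illustration.

```python
def regular_grammar_accepts(rules, start, word):
    """Simulate a regular grammar as an NFA whose states are the nonterminals."""
    accept = object()                 # extra state reached by rules of the form A -> a
    current = {start}
    for symbol in word:
        following = set()
        for state in current:
            if state is accept:
                continue              # nothing can follow a completed derivation
            for body in rules.get(state, []):
                if len(body) == 2 and body[0] == symbol:
                    following.add(body[1])        # rule A -> aB
                elif len(body) == 1 and body == symbol:
                    following.add(accept)         # rule A -> a
        current = following
    # Accept if a derivation has finished or can finish with a rule A -> ε.
    return any(state is accept or "" in rules.get(state, []) for state in current)


# Example grammar for a*b+ :  S -> aS | bB | b,  B -> bB | b
AB_RULES = {"S": ["aS", "bB", "b"], "B": ["bB", "b"]}

matches = regular_grammar_accepts(AB_RULES, "S", "aabb")
```

No stack is needed here, which is the sense in which an FA is a special case of a PDA.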
Pumping lemma for CFL
  Example: the string a + a # a can be pumped to a+ … a+ a #a … #a
Pumping Lemma for CFL
  For any CFL L there is a number p such that
  if s ∈ L and |s| ≥ p, then s = uvxyz for some u, v, x, y, z satisfying:
    for each i ≥ 0, u v^i x y^i z ∈ L
    |vy| > 0
    |vxy| ≤ p
  Used for proving that a language is not a CFL.
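The conditions of the lemma can be explored by brute force for a fixed p and a fixed string s. The sketch below enumerates every split s = uvxyz with |vy| > 0 and |vxy| ≤ p and keeps those that stay in the language for i = 0, …, max_i. It is a finite check for one value of p and a bounded range of i, so it illustrates the argument but does not replace the proof, which must cover every p. The function names and the cut-off max_i are assumptions.

```python
def in_anbncn(s):
    """Membership in { a^n b^n c^n | n >= 0 }."""
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n


def in_anbn(s):
    """Membership in { a^n b^n | n >= 0 }, a context-free language for contrast."""
    n = len(s) // 2
    return s == "a" * n + "b" * n


def pumpable_decompositions(s, p, member, max_i=3):
    """All splits s = uvxyz with |vy| > 0, |vxy| <= p that pump for i = 0..max_i."""
    found = []
    n = len(s)
    for start in range(n + 1):                           # u = s[:start]
        for end in range(start, min(start + p, n) + 1):  # vxy = s[start:end]
            for a in range(start, end + 1):              # v = s[start:a]
                for b in range(a, end + 1):              # x = s[a:b], y = s[b:end]
                    u, v, x = s[:start], s[start:a], s[a:b]
                    y, z = s[b:end], s[end:]
                    if len(v) + len(y) == 0:
                        continue
                    if all(member(u + v * i + x + y * i + z)
                           for i in range(max_i + 1)):
                        found.append((u, v, x, y, z))
    return found


p = 4
no_split = pumpable_decompositions("a" * p + "b" * p + "c" * p, p, in_anbncn)
some_split = pumpable_decompositions("a" * p + "b" * p, p, in_anbn)
```

For s = a^4 b^4 c^4 and p = 4 no split survives, because vxy covers at most two of the three letters and already i = 0 unbalances the counts. For s = a^4 b^4 in the context-free language of strings a^n b^n, splits such as v = a, x = ε, y = b around the middle do survive.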