1 CSC 3130: Automata theory and formal languages Tutorial 4 KN Hung Office: SHB 1026 Department of Computer Science & Engineering.


1 CSC 3130: Automata theory and formal languages
Tutorial 4
KN Hung
Office: SHB 1026
Department of Computer Science & Engineering

2 Agenda
Context-Free Grammar (CFG)
– Design
– Parse Tree
Cocke-Younger-Kasami (CYK) algorithm
– Parsing CFG in normal form
Pushdown Automata (PDA)
– Design

3 Context-Free Grammar (Recap)
A context-free grammar consists of:
1) Variables (e.g., S, A, B)
2) Terminals (e.g., a, b)
3) Production rules (e.g., S → AB and A → aA)
4) A start variable (here, S)
Example: S → AB | ba, A → aA | a, B → b

4 Context-Free Grammar (Recap)
A string belongs to the language of the CFG if it can be derived from the start variable.
CFG example: S → AB | ba, A → aA | a, B → b
Derivation: S ⇒ AB ⇒ aAB ⇒ aaB ⇒ aab (each ⇒ applies one production rule)
Therefore, aab belongs to the language.
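The derivation on this slide can be replayed mechanically. The following is a minimal sketch, assuming uppercase letters are variables and lowercase letters are terminals; the names `GRAMMAR` and `derive` are illustrative and not part of the tutorial.

```python
# Grammar from the slide: S -> AB | ba, A -> aA | a, B -> b
GRAMMAR = {
    "S": ["AB", "ba"],
    "A": ["aA", "a"],
    "B": ["b"],
}

def derive(start, steps):
    """Apply each (variable, body) step to the leftmost occurrence of the variable.

    Returns the list of sentential forms, starting with `start`.
    """
    form = start
    history = [form]
    for var, body in steps:
        if body not in GRAMMAR[var]:
            raise ValueError(f"{var} -> {body} is not a production")
        if var not in form:
            raise ValueError(f"{var} does not occur in {form}")
        form = form.replace(var, body, 1)  # rewrite only the leftmost occurrence
        history.append(form)
    return history

# The four steps used on the slide.
steps = [("S", "AB"), ("A", "aA"), ("A", "a"), ("B", "b")]
print(" => ".join(derive("S", steps)))  # S => AB => aAB => aaB => aab
```

A step that is not a production of the grammar raises an error, so a successful run confirms that every ⇒ in the derivation is legal.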

5 Why CFG?
L = {w = 0^n 1^n : n is a positive integer}
L is not a regular language
– Proved by the “Pumping Lemma”
A context-free grammar can describe it: S → 0S1, S → 01
Thus, CFGs are more general than regular expressions
– NFA ⇔ Regular Expression ⇔ DFA
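To see how the two productions produce exactly the strings 0^n 1^n, here is a small sketch. The function names `generate` and `in_language` are assumptions made for illustration: the first applies S → 0S1 repeatedly and finishes with S → 01, and the second undoes the productions from the outside in.

```python
def generate(n):
    """Derive 0^n 1^n by applying S -> 0S1 (n - 1 times) and then S -> 01."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    form = "S"
    for _ in range(n - 1):
        form = form.replace("S", "0S1")
    return form.replace("S", "01")

def in_language(w):
    """Membership test that mirrors the two productions."""
    if w == "01":                      # S -> 01
        return True
    # S -> 0S1: strip the outer 0 and 1, then test the middle part
    return len(w) >= 4 and w[0] == "0" and w[-1] == "1" and in_language(w[1:-1])

print(generate(3))           # 000111
print(in_language("0011"))   # True
print(in_language("0101"))   # False
```

The recognizer needs to remember how many 0's it has seen, which is the kind of unbounded counting a finite automaton cannot do.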

6 CFG Design
Given a context-free language, design the CFG
L = { w : w is an ab-string in which the number of a’s < the number of b’s }
Take a minute to think about it.
S → ? …

7 CFG Design (Cont’d)
Trial: Bottom-up
– Shortest string in L: “b”
– Given a string in L, we can expand it such that it is still in L
– i.e., add terminals without violating the constraint

8 CFG Design (Cont’d)
One wrong trial: S → b, S → bS | Sb, S → abS | baS | bSa | aSb
– After adding one “b”, the number of b’s is still greater than the number of a’s
– After adding one “a” and one “b”, the difference between the numbers of a’s and b’s stays constant
However, this grammar cannot derive strings like “aabbbbbaa”
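The failure of this trial grammar can be checked by undoing productions. The sketch below is an illustration rather than part of the tutorial (the name `derivable` is an assumption). Every production wraps a shorter string derived from S, so a string is derivable exactly when peeling off one production leaves a derivable string.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def derivable(w):
    """True if w is derivable from S in the trial grammar
    S -> b | bS | Sb | abS | baS | bSa | aSb."""
    if w == "b":                                   # S -> b
        return True
    candidates = []
    if w.startswith("b"):
        candidates.append(w[1:])                   # S -> bS
    if w.endswith("b"):
        candidates.append(w[:-1])                  # S -> Sb
    if w.startswith("ab") or w.startswith("ba"):
        candidates.append(w[2:])                   # S -> abS | baS
    if w.startswith("b") and w.endswith("a"):
        candidates.append(w[1:-1])                 # S -> bSa
    if w.startswith("a") and w.endswith("b"):
        candidates.append(w[1:-1])                 # S -> aSb
    # An empty remainder cannot be derived from S, so skip it.
    return any(derivable(c) for c in candidates if c)

print(derivable("abb"))        # True
print(derivable("aabbbbbaa"))  # False, although it has 4 a's and 5 b's
```

The string “aabbbbbaa” starts with “aa” and ends with “a”, so no production of the trial grammar can be the outermost one.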

9 CFG Design (Cont’d)
Approach 1:
S → b (base case)
S → SS (#b is still > #a)
S → SaS | aSS | SSa
– 1st S: #b ≥ #a + 1; 2nd S: #b ≥ #a + 1; the extra a: #a = 1
– ⇒ overall #b ≥ #a + 2 − 1 = #a + 1, so #b > #a
But is this enough to say the grammar is correct?
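One way to gain confidence before attempting a proof is to compare the grammar with the language on all short strings. The sketch below is an illustrative check, not the tutorial's method, and the function names are assumptions. It computes every string of bounded length derivable from S by fixed-point iteration, which works here because no production shrinks a string, and compares the result with a direct enumeration of L.

```python
from itertools import product

def grammar_strings(max_len):
    """All strings of length <= max_len derivable from S in
    S -> b | SS | SaS | aSS | SSa, found by fixed-point iteration."""
    known = {"b"}                                  # S -> b
    while True:
        new = set()
        for x in known:
            for y in known:
                # S -> SS | SaS | aSS | SSa with the two S's replaced by x and y
                for s in (x + y, x + "a" + y, "a" + x + y, x + y + "a"):
                    if len(s) <= max_len:
                        new.add(s)
        if new <= known:                           # nothing new: fixed point reached
            return known
        known |= new

def language_strings(max_len):
    """All ab-strings of length <= max_len with fewer a's than b's."""
    return {
        "".join(t)
        for n in range(1, max_len + 1)
        for t in product("ab", repeat=n)
        if t.count("a") < t.count("b")
    }

print(grammar_strings(7) == language_strings(7))
```

Agreement on short strings is evidence, not a proof. A proof still has to show both inclusions for strings of every length.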

10 CFG Design (Cont’d)
Approach 2:
– Start with the grammar for ab-strings with the same number of a’s and b’s
– Call the start symbol of this grammar E
– Now generate all strings of the form EbE | EbEbE | EbEbEbE | …
Thus, we have the grammar…

11 CFG Design (Cont’d)
Approach 2 (Cont’d): S → EbET, T → bET | ε, E → …
– S and T generate the pattern EbE | EbEbE | …
– E generates ab-strings with the same number of a’s and b’s (cf. “09L7.pdf”, Slide #32)

12 CFG Design (Cont’d)
After designing the grammar G, you may have to prove (if required) that the language of this grammar equals the given language
i.e., prove that L(G) = L
Proof
Part 1) L(G) ⊆ L
Part 2) L ⊆ L(G)
Due to the time limit, I will not do this part

13 Parse Tree
How do we parse “aab” in this grammar? (Previous example)
CFG example: S → AB | ba, A → aA | a, B → b
Derivation: S ⇒ AB ⇒ aAB ⇒ aaB ⇒ aab

14 Parse Tree (Cont’d)
Idea: production rule = node + children
This should be intuitive to understand
Derivation: S ⇒ AB ⇒ aAB ⇒ aaB ⇒ aab
Parse tree: the root S has children A and B; A has children a and A; the inner A has child a; B has child b

15 Parse Tree (Cont’d)
Ambiguity:
CFG: S → S - S, S → 1 | 2 | 3
String: 3 - 1 - 2
This string has two parse trees: one groups it as (3 - 1) - 2, the other as 3 - (1 - 2)
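The ambiguity matters because the two parse trees give different values. The following sketch enumerates every parse tree of a token list under S → S - S | 1 | 2 | 3 by choosing each minus sign in turn as the root. The function name `parses` and the token-list input format are assumptions made for illustration.

```python
def parses(tokens):
    """Yield (bracketed expression, value) for every parse tree of the
    token list under S -> S - S | 1 | 2 | 3. Assumes well-formed input."""
    if len(tokens) == 1:                           # S -> 1 | 2 | 3
        yield tokens[0], int(tokens[0])
        return
    for i, tok in enumerate(tokens):
        if tok == "-":                             # S -> S - S with this "-" at the root
            for left_text, left_val in parses(tokens[:i]):
                for right_text, right_val in parses(tokens[i + 1:]):
                    yield f"({left_text} - {right_text})", left_val - right_val

for text, value in parses(["3", "-", "1", "-", "2"]):
    print(text, "=", value)
# (3 - (1 - 2)) = 4
# ((3 - 1) - 2) = 0
```

Programming languages usually resolve this kind of ambiguity by fixing associativity, so that subtraction groups to the left.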

16 Parse Tree (Cont’d)
Useful in programming languages
– CSC3180
Useful in compilers
– CSC3120

17 Cocke-Younger-Kasami Algorithm
Used to parse strings with a context-free grammar in Chomsky normal form (or simply normal form)
Normal form: every production is of type
1) X → YZ
2) X → a
3) S → ε
Example: S → AB | BC, A → BA | a, B → CC | b, C → AB | a

18 CYK Algorithm - Idea
= Algorithm 2 in the lecture notes (09L8.pdf)
Idea: bottom-up parsing
Algorithm: given a string s of length N
  For k = 1 to N
    For every substring of length k
      Determine which variable(s) can derive it
sub(x,y): the substring that starts at index x and ends at index y
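The loop above can be sketched in Python as follows. This is a minimal illustration rather than the lecture notes' Algorithm 2 verbatim. The rule lists, the name `cyk_table`, and the choice to index cells by (length, 1-based start index) are assumptions. With that indexing, `table[k][i]` corresponds to sub(i, i + k - 1). The grammar is the example S → AB | BC, A → BA | a, B → CC | b, C → AB | a.

```python
# Example grammar in normal form, split into X -> YZ rules and X -> a rules.
BINARY_RULES = [("S", "A", "B"), ("S", "B", "C"), ("A", "B", "A"),
                ("B", "C", "C"), ("C", "A", "B")]
TERMINAL_RULES = [("A", "a"), ("B", "b"), ("C", "a")]

def cyk_table(s, binary_rules=BINARY_RULES, terminal_rules=TERMINAL_RULES):
    """table[k][i] = set of variables deriving the substring of length k
    that starts at 1-based index i."""
    n = len(s)
    table = {k: {i: set() for i in range(1, n - k + 2)} for k in range(1, n + 1)}
    # Base case k = 1: scan the X -> a productions.
    for i in range(1, n + 1):
        table[1][i] = {X for X, a in terminal_rules if a == s[i - 1]}
    # k > 1: try every way to split the substring into two shorter pieces.
    for k in range(2, n + 1):
        for i in range(1, n - k + 2):
            for split in range(1, k):
                left = table[split][i]
                right = table[k - split][i + split]
                for X, Y, Z in binary_rules:
                    if Y in left and Z in right:   # X -> YZ covers this split
                        table[k][i].add(X)
    return table

table = cyk_table("baaba")
print(sorted(table[3][2]))    # variables deriving "aab": ['B']
print("S" in table[5][1])     # True: the whole string is in the language
```

The string is accepted exactly when the start variable appears in the cell for the whole string, here `table[5][1]`.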

19 CYK Algorithm - Init
Base case: k = 1
– The possible variable(s) can be found by scanning through each production
Grammar: S → AB | BC, A → BA | a, B → CC | b, C → AB | a
We want to parse the string baaba
Row k = 1 of the table: b: B | a: A,C | a: A,C | b: B | a: A,C

20 CYK Algorithm – Table
Rows: length of substring; columns: start index of substring
Each cell: the variables deriving the substring
Example: the cell for length = 3 and start index = 2 holds the variables deriving “aab” = sub(2,4)
Row k = 1 for the string baaba: B | A,C | A,C | B | A,C

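21 CYK Algorithm – Loop (k>1)
Grammar: S → AB | BC, A → BA | a, B → CC | b, C → AB | a
String: baaba
When k = 2
Example
– sub(1,2) = “ba”
– “ba” = “b” + “a” = sub(1,1) + sub(2,2)
– sub(1,1) is derived by B and sub(2,2) by A or C, so the possible right-hand sides are BA | BC
– Variables A, S are put into the cell, since A → BA and S → BC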

22 CYK Algorithm – Loop (k>1)
Grammar: S → AB | BC, A → BA | a, B → CC | b, C → AB | a
String: baaba
For each substring
– Decompose it into two substrings in every possible way
Example: sub(2,4) = “aab”
= sub(2,2) + sub(3,4), with cells A,C and S,C
= sub(2,3) + sub(4,4), with cells B and B
Possible right-hand sides: AS, AC, CS, CC, BB
Only CC matches a production (B → CC). Therefore, B is put into the cell

23 CYK Algorithm – Loop (k>1)
How about sub(3,5)? Take a minute to work it out.
Grammar: S → AB | BC, A → BA | a, B → CC | b, C → AB | a
String: baaba
Table so far:
k = 1: B | A,C | A,C | B | A,C
k = 2: S,A | B | S,C | S,A

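24 CYK Algorithm – Parse Tree
The parse tree can be read off the table (see “09L8.pdf”, Slide #21)
Grammar: S → AB | BC, A → BA | a, B → CC | b, C → AB | a
String: baaba
Table (row = length of substring k, column = start index):
k = 1: B | A,C | A,C | B | A,C
k = 2: S,A | B | S,C | S,A
k = 3: ∅ | B | B
k = 4: ∅ | S,A,C
k = 5: S,A,C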
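One common way to read a parse tree off the table is to record, for each variable in a cell, which split and which production put it there. The sketch below makes that assumption; the slide itself defers the details to “09L8.pdf”, and the names `cyk_with_backpointers`, `build_tree` and `leaves` are illustrative. When a variable can be derived in several ways, the sketch keeps only the first one found.

```python
BINARY = [("S", "A", "B"), ("S", "B", "C"), ("A", "B", "A"),
          ("B", "C", "C"), ("C", "A", "B")]
TERMINAL = [("A", "a"), ("B", "b"), ("C", "a")]

def cyk_with_backpointers(s):
    """back[(k, i)][X] records how X derives the length-k substring starting
    at 1-based index i: a terminal for k = 1, else a (split, Y, Z) triple."""
    n = len(s)
    back = {}
    for i in range(1, n + 1):
        back[(1, i)] = {X: s[i - 1] for X, a in TERMINAL if a == s[i - 1]}
    for k in range(2, n + 1):
        for i in range(1, n - k + 2):
            cell = {}
            for split in range(1, k):
                for X, Y, Z in BINARY:
                    if Y in back[(split, i)] and Z in back[(k - split, i + split)]:
                        cell.setdefault(X, (split, Y, Z))  # keep the first derivation found
            back[(k, i)] = cell
    return back

def build_tree(back, X, k, i):
    """Follow the backpointers to build a nested-tuple parse tree rooted at X."""
    entry = back[(k, i)][X]
    if k == 1:
        return (X, entry)                          # leaf: (variable, terminal)
    split, Y, Z = entry
    return (X, build_tree(back, Y, split, i),
            build_tree(back, Z, k - split, i + split))

def leaves(tree):
    """Concatenate the terminals at the leaves, left to right."""
    if isinstance(tree[1], str):
        return tree[1]
    return leaves(tree[1]) + leaves(tree[2])

back = cyk_with_backpointers("baaba")
tree = build_tree(back, "S", 5, 1)
print(tree)
print(leaves(tree))  # baaba
```

Reading the leaves from left to right gives back the input string, which is a quick sanity check on the recovered tree.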

25 CYK Algorithm (Conclusion)
Work from the shortest substrings to the longest
– i.e., from single-character strings to the whole string
For a context-free grammar G:
1) Convert G into normal form
– Remove ε-productions
– Remove unit productions
2) Apply the CYK algorithm
Con: loss of intuition

26 End
Thanks for coming! =] Any questions?

