CSCI 3130: Formal Languages and Automata Theory Tutorial 5


1 CSCI 3130: Formal Languages and Automata Theory Tutorial 5
Hung Chun Ho
Office: SHB 1026
Department of Computer Science & Engineering

2 Agenda
Cocke-Younger-Kasami (CYK) algorithm: parsing CFG in normal form
Pushdown Automata (PDA): design

3 CYK Algorithm
Bottom-up parsing for normal form

4 Cocke-Younger-Kasami Algorithm
Used to parse a context-free grammar in Chomsky normal form (or simply normal form).
Every production is of one of these types:
X → YZ
X → a
S → ε
Normal form example:
S → AB
A → CC | a | c
B → BC | b
C → CB | BA | c
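As a rough illustration of the three production types, the following sketch checks whether a grammar is in normal form. The dictionary encoding, the single-character symbols, the empty string standing for ε, and the function name `is_normal_form` are all assumptions made for this example, not part of the tutorial.

```python
# Example grammar from the slide, stored as variable -> list of right-hand sides.
# Assumption: every symbol is one character and "" stands for ε.
GRAMMAR = {
    "S": ["AB"],
    "A": ["CC", "a", "c"],
    "B": ["BC", "b"],
    "C": ["CB", "BA", "c"],
}

def is_normal_form(grammar, start="S"):
    """Return True if every production is X -> YZ, X -> a, or S -> ε."""
    variables = set(grammar)
    for head, bodies in grammar.items():
        for body in bodies:
            if body == "":
                # ε is only allowed for the start variable.
                if head != start:
                    return False
            elif len(body) == 1:
                # A single symbol must be a terminal (no unit productions).
                if body in variables:
                    return False
            elif len(body) == 2:
                # Two symbols must both be variables.
                if not all(sym in variables for sym in body):
                    return False
            else:
                return False
    return True
```

Calling `is_normal_form(GRAMMAR)` on the slide's grammar should report that it is already in normal form.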

5 CYK Algorithm - Idea
CYK Algorithm = Algorithm 2 in Lecture Note (10L8.pdf)
Idea: bottom-up parsing
Algorithm: given a string s of length N
  For k = 1 to N
    For every substring of length k
      Determine what variable(s) can derive it

6 CYK Algorithm - Example
CFG:
S → AB
A → CC | a | c
B → BC | b
C → CB | BA | c
Parse abbc

7 CYK Algorithm – Idea (1)
Idea: we parse the substrings of abbc in this order:
Length-1 substrings: a, b, b, c

8 CYK Algorithm – Idea (1)
Length-2 substrings: ab, bb, bc

9 CYK Algorithm – Idea (1)
Length-3 substrings: abb, bbc
Length-4 substring: abbc
Done!
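The processing order can be sketched in a few lines of Python. The function name `substrings_by_length` is made up for this illustration; it only lists the substrings in the order CYK visits them and does no parsing.

```python
def substrings_by_length(s):
    """List all substrings of s, shortest first, left to right within each length."""
    n = len(s)
    order = []
    for k in range(1, n + 1):              # substring length
        for start in range(0, n - k + 1):  # 0-based start position
            order.append(s[start:start + k])
    return order

print(substrings_by_length("abbc"))
```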

10 CYK Algorithm – Idea (2)
Idea: parsing of longer substrings depends on parsing of shorter substrings.
Example: abb may be decomposed as
ab + b
a + bb
If we know how to parse ab and b (or a and bb), then we know how to parse abb.

11 CYK Algorithm – Substring
Denote sub(i, j) := the substring with start index i and end index j (indices start at 1, both ends included).
Example: for abbc, sub(2,4) = bbc.
This notation is not meant to complicate things; it is only for convenience in the following discussion.
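A minimal sketch of this notation in Python, assuming 1-based inclusive indices as in the example above. The helper `splits` is an addition for illustration: it lists every way to cut sub(i, j) into sub(i, k) + sub(k+1, j).

```python
def sub(s, i, j):
    """Substring of s from index i to index j, 1-based and inclusive."""
    return s[i - 1:j]

def splits(s, i, j):
    """All decompositions of sub(i, j) into two non-empty pieces."""
    return [(sub(s, i, k), sub(s, k + 1, j)) for k in range(i, j)]

print(sub("abbc", 2, 4))     # the example from the slide
print(splits("abbc", 1, 3))  # the two ways to decompose abb
```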

12 CYK Algorithm – Table
Each cell corresponds to a substring and stores the variables deriving that substring.
Rows: length of substring. Columns: start index of substring (a b b c).
Example cell: the substring of length 3 starting at index 2, i.e., sub(2,4) = bbc.

13 CYK Algorithm – Simulation
Base case: length = 1
The possible choices of variable(s) can be found by scanning through each production.
S → AB
A → CC | a | c
B → BC | b
C → CB | BA | c
Table, length 1:
a: A
b: B
b: B
c: A, C
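The base case can be sketched as follows, assuming the grammar is stored as a dictionary from each variable to its list of right-hand sides (an encoding chosen for this example; `base_row` is a hypothetical name).

```python
# Example grammar from the slides; single-character symbols are assumed.
GRAMMAR = {
    "S": ["AB"],
    "A": ["CC", "a", "c"],
    "B": ["BC", "b"],
    "C": ["CB", "BA", "c"],
}

def base_row(grammar, s):
    """For each symbol of s, collect the variables X with a rule X -> symbol."""
    return [
        {head for head, bodies in grammar.items() if ch in bodies}
        for ch in s
    ]

row = base_row(GRAMMAR, "abbc")
```

Here `row` holds one set of variables per input symbol, matching the length-1 row of the table.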

14 CYK Algorithm – Simulation
Loop: length = 2
For each substring of length 2:
Decompose it into shorter substrings
Check the cells below it
Let's parse the substring ab first.

15 CYK Algorithm – Simulation
For sub(1,2) = ab, it can be decomposed: ab = a + b = sub(1,1) + sub(2,2)
Possible choices: AB
Scan rules: S → AB matches, so the cell for sub(1,2) holds S.

16 CYK Algorithm – Simulation
For sub(2,3) = bb, it can be decomposed: bb = b + b = sub(2,2) + sub(3,3)
Possible choices: BB
Scan rules: no suitable rule is found, so the CFG cannot derive this substring. The cell for sub(2,3) holds ∅.

17 CYK Algorithm – Simulation
For sub(3,4) = bc, it can be decomposed: bc = b + c = sub(3,3) + sub(4,4)
Possible choices: BA, BC
Scan rules: C → BA and B → BC match, so the cell for sub(3,4) holds B, C.
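The "possible choices, then scan rules" step for one decomposition can be sketched like this. The function name `combine` and the dictionary encoding of the grammar are assumptions for illustration; the inputs are the variable sets of the two smaller cells.

```python
# Example grammar from the slides; single-character symbols are assumed.
GRAMMAR = {
    "S": ["AB"],
    "A": ["CC", "a", "c"],
    "B": ["BC", "b"],
    "C": ["CB", "BA", "c"],
}

def combine(grammar, left, right):
    """Variables X with a rule X -> YZ where Y is in left and Z is in right."""
    result = set()
    for head, bodies in grammar.items():
        for body in bodies:
            if len(body) == 2 and body[0] in left and body[1] in right:
                result.add(head)
    return result

ab = combine(GRAMMAR, {"A"}, {"B"})        # sub(1,1) + sub(2,2)
bb = combine(GRAMMAR, {"B"}, {"B"})        # sub(2,2) + sub(3,3)
bc = combine(GRAMMAR, {"B"}, {"A", "C"})   # sub(3,3) + sub(4,4)
```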

18 CYK Algorithm – Simulation
For sub(1,3) = abb: abb = ab + b = sub(1,2) + sub(3,3)
Possible choices: SB
Scan rules: no suitable variable is found yet.
But there is another way to decompose the string.

19 CYK Algorithm – Simulation
For sub(1,3) = abb: abb = a + bb = sub(1,1) + sub(2,3)
Possible choices: ∅
We can't parse the smaller substring bb, so this decomposition can't parse abb. There is no need to scan the rules.

20 CYK Algorithm – Simulation
For sub(1,3) = abb:
abb = sub(1,1) + sub(2,3) gives no valid parsing
abb = sub(1,2) + sub(3,3) gives no valid parsing
So abb cannot be parsed, and the cell for sub(1,3) holds ∅.

21 CYK Algorithm – Simulation
For sub(2,4) = bbc:
bbc = sub(2,2) + sub(3,4). Possible choices: BB, BC
bbc = sub(2,3) + sub(4,4). Possible choices: ∅
Variable: B (from B → BC)

22 CYK Algorithm – Simulation
Finally, for sub(1,4) = abbc:
Possible choices: AB, SB, SC
Variables: S
This cell represents the original string, and it contains S, so abbc is in the language.
Completed table (one row per length, one entry per start index):
Length 4: S
Length 3: ∅ | B
Length 2: S | ∅ | B, C
Length 1: A | B | B | A, C
Input: a b b c

23 CYK Algorithm – Parse Tree
abbc is in the language! How do we obtain the parse tree?
Trace back the derivations:
sub(1,4) is derived using S → AB from sub(1,1) and sub(2,4)
sub(1,1) is derived using A → a
sub(2,4) is derived using B → BC from sub(2,2) and sub(3,4)
So, also record the derivations used!
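One possible way to record the derivations is to store, for every variable in a cell, how it was first obtained. The sketch below assumes a dictionary-encoded grammar and uses hypothetical names (`cyk_with_back`, `tree`); when a variable can be derived in several ways it keeps only the first one found.

```python
# Example grammar from the slides; single-character symbols are assumed.
GRAMMAR = {
    "S": ["AB"],
    "A": ["CC", "a", "c"],
    "B": ["BC", "b"],
    "C": ["CB", "BA", "c"],
}

def cyk_with_back(grammar, s):
    """Fill the CYK table, remembering one derivation per variable per cell."""
    n = len(s)
    # back[(i, j)][X] is a terminal, or (Y, Z, k) meaning X -> YZ with
    # Y deriving sub(i, k) and Z deriving sub(k+1, j).
    back = {(i, j): {} for i in range(1, n + 1) for j in range(i, n + 1)}
    for i in range(1, n + 1):
        for head, bodies in grammar.items():
            if s[i - 1] in bodies:
                back[(i, i)][head] = s[i - 1]
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            for k in range(i, j):
                for head, bodies in grammar.items():
                    for body in bodies:
                        if (len(body) == 2 and body[0] in back[(i, k)]
                                and body[1] in back[(k + 1, j)]):
                            back[(i, j)].setdefault(head, (body[0], body[1], k))
    return back

def tree(back, var, i, j):
    """Rebuild a parse tree for var deriving sub(i, j) as nested tuples."""
    entry = back[(i, j)][var]
    if isinstance(entry, str):          # rule var -> terminal
        return (var, entry)
    left, right, k = entry              # rule var -> left right
    return (var, tree(back, left, i, k), tree(back, right, k + 1, j))

back = cyk_with_back(GRAMMAR, "abbc")
parse = tree(back, "S", 1, 4)
```

For abbc, the root of `parse` uses S → AB with the split after position 1, which agrees with the trace on the slide.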

24 CYK Algorithm – Parse Tree
Obtained from the table
[Figure: the completed CYK table for abbc, with the parse tree traced through its cells]

25 CYK Algorithm – Conclusion
A bottom-up parsing algorithm
Dynamic programming: the solution of a subproblem (parsing of a substring) depends on that of smaller subproblems
Before employing the CYK algorithm, convert the grammar into normal form:
Remove ε-productions
Remove unit productions

26 CYK Algorithm – Detailed
D = “On input w = w1w2…wn:
  If w = ε and S → ε is a rule, accept.
  For i = 1 to n:
    For each variable A:
      Test whether A → b is a rule, where b = wi.
      If so, place A in table(i, i).
  For l = 2 to n:
    For i = 1 to n – l + 1:
      Let j = i + l – 1.
      For k = i to j – 1:
        For each rule A → BC:
          If table(i, k) contains B and table(k+1, j) contains C, put A in table(i, j).
  If S is in table(1, n), accept. Otherwise, reject.”
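The pseudocode above translates almost line by line into Python. This is a minimal sketch that keeps the 1-based table(i, j) indexing; the dictionary encoding of the grammar, the empty string for ε, and the name `cyk_accepts` are assumptions for this example.

```python
# Example grammar from the slides; single-character symbols are assumed.
GRAMMAR = {
    "S": ["AB"],
    "A": ["CC", "a", "c"],
    "B": ["BC", "b"],
    "C": ["CB", "BA", "c"],
}

def cyk_accepts(grammar, w, start="S"):
    """Return True if the normal-form grammar derives w."""
    n = len(w)
    if n == 0:
        # Accept ε only if S -> ε is a rule ("" stands for ε here).
        return "" in grammar.get(start, [])
    table = {(i, j): set() for i in range(1, n + 1) for j in range(i, n + 1)}
    # Substrings of length 1: rules A -> b with b = w_i.
    for i in range(1, n + 1):
        for head, bodies in grammar.items():
            if w[i - 1] in bodies:
                table[(i, i)].add(head)
    # Longer substrings, shortest first.
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            for k in range(i, j):               # split into sub(i,k) + sub(k+1,j)
                for head, bodies in grammar.items():
                    for body in bodies:
                        if (len(body) == 2 and body[0] in table[(i, k)]
                                and body[1] in table[(k + 1, j)]):
                            table[(i, j)].add(head)
    return start in table[(1, n)]
```

With three nested loops over positions, the running time grows cubically in the length of w (times the number of rules).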

27 Pushdown Automata
NFA with infinite memory/states

28 Pushdown Automata
PDA ~= NFA, with a stack of memory
Transition:
NFA – depends on input
PDA – depends on input and top of stack
  Push a symbol onto the stack
  Pop a symbol from the stack
  Read a terminal of the string
Transitions are non-deterministic (possibly ε)

29 Pushdown Automata and NFA
Accept:
NFA – go to an accept state
PDA – go to an accept state
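To make a single PDA transition concrete, here is a small sketch. The encoding is an assumption for illustration: a transition is a tuple (source, read, pop, push, target), a configuration is (state, remaining input, stack), the stack is a string with its top at the right end, and the empty string plays the role of ε. The name `apply_transition` is hypothetical.

```python
def apply_transition(transition, config):
    """Return the next configuration, or None if the transition does not apply."""
    src, read, pop, push, dst = transition
    state, rest, stack = config
    if state != src:
        return None
    if read and not rest.startswith(read):   # must read the next input symbol
        return None
    if pop and not stack.endswith(pop):      # must see pop on top of the stack
        return None
    new_stack = stack[:len(stack) - len(pop)] + push
    return (dst, rest[len(read):], new_stack)

# Reading 0 and pushing X (label "0,ε/X"):
pushed = apply_transition(("q1", "0", "", "X", "q1"), ("q1", "01", "$"))
# Reading 1 and popping X (label "1,X/ε"):
popped = apply_transition(("q2", "1", "X", "", "q2"), ("q2", "1", "$X"))
```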

30 PDA – Example 1
Given the following language, design a PDA for it:
L = {0^i 1^j : i ≤ j ≤ 2i, i = 0, 1, …}, Σ = {0, 1}

31 PDA – Example 1 – Idea
Idea: the input has two sections
First half: all '0's
Second half: all '1's
#'1' depends on #'0': #'0' ≤ #'1' ≤ #'0' × 2

32 PDA – Example 1 – Solution
[Figure: PDA state diagram with states q0, q1, q2, q3 and transition labels ε,ε/$; 0,ε/X; ε,ε/ε; 1,X/ε; 1,X/X; ε,$/ε]
L = {0^i 1^j : i ≤ j ≤ 2i, i = 0, 1, …}, Σ = {0, 1}

33 PDA – Example 1 – Explain Solution
Let's try some string… w = 00111
See white board for simulation…

34 PDA – Example 1 – Explain Solution
Indicates the start of parsing

35 PDA – Example 1 – Explain Solution
This part saves information about #'0': #'X' in stack = #'0'

36 PDA – Example 1 – Explain Solution
This part accounts for #'1': #'0' ≤ #'1' ≤ #'0' × 2

37 PDA – Example 1 – Explain Solution
Consumes one 'X' and eats one '1'

38 PDA – Example 1 – Explain Solution
Consumes one 'X' and eats two '1's

39 PDA – Example 1 – Explain Solution
Consumes one 'X', and then eats one '1' or eats two '1's

40 PDA – Example 1 – Explain Solution
Indicates the end of parsing
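The behaviour described above can be tried out with a small nondeterministic simulation. The transcript does not preserve which states each arrow of the diagram connects, so the wiring below is an assumption: in particular the helper state `q2b`, used to read the first of two '1's while keeping the X, is hypothetical and not a state name from the slides. The stack is a string with its top at the right, and "" stands for ε.

```python
from collections import deque

# (source, read, pop, push, target); "" means ε. Wiring is assumed, see above.
TRANSITIONS = [
    ("q0", "", "", "$", "q1"),     # mark the bottom of the stack
    ("q1", "0", "", "X", "q1"),    # one X per 0
    ("q1", "", "", "", "q2"),      # guess that the 0s are over
    ("q2", "1", "X", "", "q2"),    # one X pays for one 1
    ("q2", "1", "X", "X", "q2b"),  # or: read a first 1 and keep the X ...
    ("q2b", "1", "X", "", "q2"),   # ... then read a second 1 and pop it
    ("q2", "", "$", "", "q3"),     # all X's used up: go to the accept state
]

def accepts(w, start="q0", accept="q3"):
    """Breadth-first search over configurations (state, remaining input, stack)."""
    seen = set()
    queue = deque([(start, w, "")])
    while queue:
        config = queue.popleft()
        if config in seen:
            continue
        seen.add(config)
        state, rest, stack = config
        if state == accept and rest == "":
            return True
        for src, read, pop, push, dst in TRANSITIONS:
            if src == state and rest.startswith(read) and stack.endswith(pop):
                new_stack = stack[:len(stack) - len(pop)] + push
                queue.append((dst, rest[len(read):], new_stack))
    return False
```

Trying `accepts("00111")` explores both choices for each X; the search ends because no sequence of ε-moves can loop in this machine.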

41 PDA – Example 2
Given the following language, design a PDA for it:
L = { a^i b^j c^k d^l : i, j, k, l = 0, 1, …; i + k = j + l }, where the alphabet Σ = {a, b, c, d}

42 PDA – Example 2 – Idea
Idea:
Sequentially read (multiple) 'a', 'b', 'c' and 'd'
Maintain:
#'a' + #'c'
#'b' + #'d'
If these numbers are equal, accept

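43 PDA – Example 2 – Solution
[Figure: PDA state diagram with states q1 to q5 and the transition labels below]
Start: ε,ε/$
Reading a: a,ε/X
Reading b: b,$/$Y; b,X/ε; b,Y/YY
Reading c: c,$/$X; c,X/XX; c,Y/ε
Reading d: d,$/$Y; d,X/ε; d,Y/YY
Between sections: ε,ε/ε
End: ε,$/ε
L = { a^i b^j c^k d^l : i, j, k, l = 0, 1, …; i + k = j + l }, where the alphabet Σ = {a, b, c, d}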

44 PDA – Example 2 – Explain Solution
The states correspond, in order, to: start, a, b, c, d, end

45 PDA – Example 2 – Explain Solution
Each X in stack = an extra a or c

46 PDA – Example 2 – Explain Solution
Each Y in stack = an extra b or d

47 PDA – Example 2 – Explain Solution
X and Y 'cancel' each other
The stack contains only X's or only Y's

48 PDA – Example 2 – Explain Solution
No X's and no Y's means #a + #c = #b + #d → Accept
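The cancelling idea can be checked with a direct stack simulation. This sketch is a simplification rather than a transcription of the diagram: the order a, b, c, d, which the PDA enforces through its states, is checked here with a regular expression, and the function name `accepts` is chosen for this example. The stack updates follow the transition labels (for instance c,Y/ε pops a Y, while c,X/XX and c,$/$X push an X).

```python
import re

def accepts(w):
    """Decide membership in { a^i b^j c^k d^l : i + k = j + l } with one stack."""
    # The PDA's states enforce the shape a*b*c*d*; here a regex stands in for them.
    if re.fullmatch(r"a*b*c*d*", w) is None:
        return False
    stack = ["$"]                        # bottom-of-stack marker
    for ch in w:
        # a and c count as X, b and d count as Y.
        mine, other = ("X", "Y") if ch in "ac" else ("Y", "X")
        if stack[-1] == other:
            stack.pop()                  # cancel against the opposite symbol
        else:
            stack.append(mine)           # top is $ or the same symbol: push
    return stack == ["$"]                # no X's and no Y's left: accept
```

At any moment the stack holds only X's or only Y's above the marker, which is the invariant stated on slide 47.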

