CS 44 – Jan. 29 Expression grammars –Associativity √ –Precedence CFG for entire language (handout) CYK algorithm –General technique for testing for acceptance.


1 CS 44 – Jan. 29
Expression grammars
–Associativity √
–Precedence
CFG for entire language (handout)
CYK algorithm
–General technique for testing for acceptance (Note: we don’t want a PDA, since it could be nondeterministic!)

2 Precedence
(* /) bind stronger than (+ -)
(+ -) separate better than (* /)
Need to break up expression into terms
–Ex. 9 - 8 * 2 + 4 / 5
–We want to say that an expression consists of “terms” separated by + and -
–And each term consists of numbers separated by * and /
–But which should we define first, expr or term?

3 Precedence (2)
Which grammar is right?
expr → expr + term | expr - term | term
term → term * num | term / num | num
Or this one:
expr → expr * term | expr / term | term
term → term + num | term - num | num
Let’s try examples 1 + 2 * 3 and 1 * 2 + 3
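One way to try the two candidate grammars from slide 3 is to evaluate both example strings under each. The sketch below is not from the handout: the names `evaluate`, `grammar_1` and `grammar_2` are made up for illustration, and the left-recursive rules are written as loops, a common rewrite that keeps left associativity. It assumes tokens are non-negative integers and the ASCII operators + - * /.

```python
import operator
import re

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def evaluate(text, expr_ops, term_ops):
    """Evaluate text under  expr -> expr op term | term,  term -> term op num | num.

    expr_ops are the operators used at the expr level, term_ops at the term level.
    """
    tokens = re.findall(r"\d+|[-+*/]", text)
    pos = 0

    def num():
        nonlocal pos
        value = int(tokens[pos])
        pos += 1
        return value

    def term():
        nonlocal pos
        value = num()
        # Loop instead of left recursion: groups as ((num op num) op num) ...
        while pos < len(tokens) and tokens[pos] in term_ops:
            op = tokens[pos]
            pos += 1
            value = OPS[op](value, num())
        return value

    def expr():
        nonlocal pos
        value = term()
        while pos < len(tokens) and tokens[pos] in expr_ops:
            op = tokens[pos]
            pos += 1
            value = OPS[op](value, term())
        return value

    return expr()


def grammar_1(text):
    # First grammar: + and - at the expr level, * and / at the term level.
    return evaluate(text, "+-", "*/")


def grammar_2(text):
    # Second grammar: * and / at the expr level, + and - at the term level.
    return evaluate(text, "*/", "+-")


print(grammar_1("1 + 2 * 3"), grammar_1("1 * 2 + 3"))  # 7 5
print(grammar_2("1 + 2 * 3"), grammar_2("1 * 2 + 3"))  # 9 5
```

Under the second grammar 1 + 2 * 3 comes out as (1 + 2) * 3 = 9, so only the first grammar gives * and / the higher precedence. Both agree on 1 * 2 + 3, which is why it helps to try more than one example.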

4 Moral
If a grammar is defining something hierarchical, like an expression, define large groupings first.
Lower precedence operators appear first in grammar. (They separate better)
–Ex. * appears lower in parse tree than + because it gets evaluated first.
In a real programming language, there can be more than 10 levels of precedence. C has ~15!

5 C language
Handout
–How does the grammar begin?
–Where are the mathematical expressions?
–Do you agree with the precedence?
–Do you see associativity?
–What else is defined in grammar?
–Where are the terminals?

6 Accepting input
How can we tell if a given source file (input stream of tokens) is a valid program?
Language defined by CFG, so …
–Can see if there is some derivation from grammar?
–Can convert CFG to PDA?
Exponential performance not acceptable. (e.g. doubling every time we add a token)
Two improvements:
–CYK algorithm, runs in O(n³)
–Bottom-up parsing, generally linear, but restrictions on grammar.

7 CYK algorithm
In 1965-67, discovered independently by Cocke, Younger, Kasami.
Given any CFG and any string, can tell if grammar generates string.
The grammar needs to be in CNF (Chomsky normal form) first.
–This ensures that the rules are simple. Rules are of the form X → a or X → YZ
Consider all substrings of len 1 first. See if these are in language.
Next try all len 2, len 3, … up to length n.

8 continued
Maintain results in an NxN table. Top right portion not used.
–Example below is for testing a word of length 3.
Start at bottom; work your way up.
For length 1, just look for “unit rules” in the grammar, i.e. rules with a single terminal on the right, e.g. X → a.

1..3        X           X
1..2        2..3        X
1..1        2..2        3..3

9 continued
For general case i..j
–Think of all possible ways this string can be broken into 2 pieces.
–Ex. 1..3 = 1..2 + 3..3 or 1..1 + 2..3
–We want to know if both pieces can be generated by some variable. This handles rules of form A → BC.
Let’s try an example: the language 3⁺7⁺ (grammar in CNF).

1..3        X           X
1..2        2..3        X
1..1        2..2        3..3

10 337 ∈ 3⁺7⁺ ?
S → AB
A → 3 | AC
B → 7 | BD
C → 3
D → 7
For each len 1 string, which variables generate it?
1..1 is 3. Rules A and C.
2..2 is 3. Rules A and C.
3..3 is 7. Rules B and D.

1..3        X           X
1..2        2..3        X
1..1 A, C   2..2 A, C   3..3 B, D

11 337 ∈ 3⁺7⁺ ?
S → AB
A → 3 | AC
B → 7 | BD
C → 3
D → 7
Length 2:
1..2 = 1..1 + 2..2 = (A or C)(A or C) = A (by rule A → AC)
2..3 = 2..2 + 3..3 = (A or C)(B or D) = S (by rule S → AB)

1..3        X           X
1..2 A      2..3 S      X
1..1 A, C   2..2 A, C   3..3 B, D

12 337 ∈ 3⁺7⁺ ?
S → AB
A → 3 | AC
B → 7 | BD
C → 3
D → 7
Length 3: 2 cases for 1..3:
1..2 + 3..3: (A)(B or D) = S
1..1 + 2..3: (A or C)(S) no!
We only need one case to work.

1..3 S      X           X
1..2 A      2..3 S      X
1..1 A, C   2..2 A, C   3..3 B, D
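The table-filling steps from slides 7 to 12 can be sketched in a few lines of Python. This is a minimal illustration, not code from the course: the function name `cyk_accepts` and the two dictionaries that encode the CNF grammar (one for rules X → a, one for rules X → YZ) are assumptions made for this example. Cells are indexed i..j starting from 1, as on the slides.

```python
from itertools import product


def cyk_accepts(word, terminal_rules, pair_rules, start="S"):
    """Return True if the CNF grammar generates word.

    terminal_rules: terminal -> set of variables X with a rule X -> terminal
    pair_rules: (Y, Z) -> set of variables X with a rule X -> YZ
    """
    n = len(word)
    if n == 0:
        return False  # rules of the form X -> a or X -> YZ never derive the empty string
    table = {}
    # Length 1: look up which variables generate each single symbol.
    for i in range(1, n + 1):
        table[i, i] = set(terminal_rules.get(word[i - 1], ()))
    # Lengths 2..n: try every way to split i..j into i..k and k+1..j.
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            cell = set()
            for k in range(i, j):
                for left, right in product(table[i, k], table[k + 1, j]):
                    cell |= pair_rules.get((left, right), set())
            table[i, j] = cell
    return start in table[1, n]


# Grammar for 3+7+ from the slides: S -> AB, A -> 3 | AC, B -> 7 | BD, C -> 3, D -> 7
TERMINAL_RULES = {"3": {"A", "C"}, "7": {"B", "D"}}
PAIR_RULES = {("A", "B"): {"S"}, ("A", "C"): {"A"}, ("B", "D"): {"B"}}

print(cyk_accepts("337", TERMINAL_RULES, PAIR_RULES))  # True
print(cyk_accepts("373", TERMINAL_RULES, PAIR_RULES))  # False
```

The three nested loops over length, start position and split point are where the O(n³) running time comes from.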

13 Example #2
Let’s test the word baab
S → AB | BC
A → BA | a
B → CC | b
C → AB | a
Length 1:
‘a’ generated by A, C
‘b’ generated by B

1..4        X           X           X
1..3        2..4        X           X
1..2        2..3        3..4        X
1..1 B      2..2 A, C   3..3 A, C   4..4 B

14 baab
S → AB | BC
A → BA | a
B → CC | b
C → AB | a
Length 2:
1..2 = 1..1 + 2..2 = (B)(A,C) = S,A
2..3 = 2..2 + 3..3 = (A,C)(A,C) = B
3..4 = 3..3 + 4..4 = (A,C)(B) = S,C

1..4        X           X           X
1..3        2..4        X           X
1..2 S, A   2..3 B      3..4 S, C   X
1..1 B      2..2 A, C   3..3 A, C   4..4 B

15 baab
S → AB | BC
A → BA | a
B → CC | b
C → AB | a
Length 3: [ each has 2 chances! ]
1..3 = 1..2 + 3..3 = (S,A)(A,C) = Ø
1..3 = 1..1 + 2..3 = (B)(B) = Ø
2..4 = 2..3 + 4..4 = (B)(B) = Ø
2..4 = 2..2 + 3..4 = (A,C)(S,C) = B

1..4        X           X           X
1..3 Ø      2..4 B      X           X
1..2 S, A   2..3 B      3..4 S, C   X
1..1 B      2..2 A, C   3..3 A, C   4..4 B

16 Finally…
S → AB | BC
A → BA | a
B → CC | b
C → AB | a
Length 4 [has 3 chances!]
1..4 = 1..3 + 4..4 = (Ø)(B) = Ø
1..4 = 1..2 + 3..4 = (S,A)(S,C) = Ø
1..4 = 1..1 + 2..4 = (B)(B) = Ø
Ø means we lose! baab ∉ L.
However, in general don’t give up if you encounter Ø in the middle of the process.

1..4 Ø      X           X           X
1..3 Ø      2..4 B      X           X
1..2 S, A   2..3 B      3..4 S, C   X
1..1 B      2..2 A, C   3..3 A, C   4..4 B
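The closing remark about Ø can be checked mechanically. The sketch below rebuilds the whole table for Example #2 and then tries a second word, baaba, which is not on the slides and is chosen here only for illustration. The function name `cyk_table` and the grammar encoding (each variable mapped to a list of right-hand sides, one character per symbol) are likewise assumptions for this example.

```python
# Grammar from Example #2; every variable and terminal is a single character.
RULES = {"S": ["AB", "BC"], "A": ["BA", "a"], "B": ["CC", "b"], "C": ["AB", "a"]}


def cyk_table(word, rules):
    """Return a dict mapping (i, j) to the set of variables generating word[i..j] (1-based)."""
    n = len(word)
    table = {}
    for length in range(1, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            cell = set()
            for var, bodies in rules.items():
                for body in bodies:
                    if length == 1:
                        # Rule var -> a matches a single symbol.
                        if body == word[i - 1]:
                            cell.add(var)
                    elif len(body) == 2 and any(
                        body[0] in table[i, k] and body[1] in table[k + 1, j]
                        for k in range(i, j)
                    ):
                        # Rule var -> YZ matches if some split i..k + k+1..j works.
                        cell.add(var)
            table[i, j] = cell
    return table


baab = cyk_table("baab", RULES)
print(baab[1, 3], baab[2, 4], baab[1, 4])  # set() {'B'} set()

baaba = cyk_table("baaba", RULES)
print(baaba[1, 3], "S" in baaba[1, 5])  # set() True
```

For baaba the cell 1..3 is still Ø, yet S appears in the top cell 1..5 through the split 1..1 + 2..5, so an empty cell partway up does not decide the outcome.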

