
Parsing Discrete Mathematics and Its Applications Baojian Hua


1 Parsing Discrete Mathematics and Its Applications Baojian Hua bjhua@ustc.edu.cn

2 Syntax Tree
A systematic way to put a program into memory: a data type definition plus a bunch of functions.
The programmer calls them explicitly, which is tedious and error-prone.
But we write programs in ASCII form, so how can we construct the tree automatically?
With a technique called (automatic) parsing: a program clever enough to do this automatically.

3 Roadmap
Lexer: eats an ASCII character sequence, emits a token sequence.
Parser: eats a token sequence, emits abstract syntax trees.
Other parts: later in this course.
Pipeline: stream of characters -> Lexer -> stream of tokens -> Parser -> abstract syntax -> other parts

4 Parsing
Take as input a sequence of terminals (tokens) and construct syntax trees automatically.
Problem: how do we know whether a sequence of input tokens is valid?

5 Example
    S ::= x A | y B
    A ::= u C | v C
    B ::= t C | m C
    C ::= w | z
Nonterminals: S A B C
Terminals: ID_X ID_Y ID_U ID_V ID_T ID_M ID_W ID_Z
Input: y m z (am I valid?)
S -> x A: oops, mismatch!

6 Example (grammar of slide 5, input: y m z)
S -> y B: another try!

7 Example
S -> y B: aha, great! The leading y matches.

8 Example
S -> y B; now derive "m z" from B (recursion).

9 Example
Derive "m z": first try B -> t C.

10 Example
Derive "m z": B -> t C is a mismatch.

11 Example
Derive "m z": second try B -> m C.

12 Example
S -> y B, B -> m C; now derive "z" from C (recursion).

13 Example
Derive "z": C -> w is a mismatch.

14 Example
Derive "z": second try C -> z.

15 Example
Derive "z": C -> z matched. Success!
Resulting tree: S(y, B(m, C(z)))

16 Recursive Descent Algorithm
This process can be described by a recursive descent algorithm:
For each nonterminal, write a (recursive) parsing function.
Every RHS becomes a case in a big switch.
A function may take some semantic actions besides parsing (later in this course).

17 Recursive Descent Algorithm
    // The function interface looks like:
    void parseS ();
    void parseA ();
    void parseB ();
    void parseC ();
Grammar:
    S ::= x A | y B
    A ::= u C | v C
    B ::= t C | m C
    C ::= w | z

18 Recursive Descent Algorithm
    struct token t;     // recall the token module
    t = getToken ();    // and the lexer module

    void parseS ()
    {
      // assumes struct token stores its kind (ID_X, ID_Y, ...) in a field named kind
      switch (t.kind) {
      case ID_X:
        t = getToken ();
        parseA ();
        return;
      case ID_Y:
        t = getToken ();
        parseB ();
        return;
      // continued on the next slide
Grammar:
    S ::= x A | y B
    A ::= u C | v C
    B ::= t C | m C
    C ::= w | z

19 Recursive Descent Algorithm
      default:          // not ID_X or ID_Y
        error ("want 'x' or 'y'");
        return;
      }
    }
    // Leave the algorithm for parseA (), parseB () and parseC () to you.
Grammar:
    S ::= x A | y B
    A ::= u C | v C
    B ::= t C | m C
    C ::= w | z
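
The exercise left on this slide can be sketched as a complete recognizer. This is only one possible shape, not the course's reference solution: the single-letter lexer, the plain enum used as the token, the `ok` flag and the `parse` entry point are all assumptions made for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical token kinds mirroring the ID_X ... ID_Z terminals of the slides. */
enum kind { ID_X, ID_Y, ID_U, ID_V, ID_T, ID_M, ID_W, ID_Z, ID_EOF, ID_BAD };

static const char *input;   /* remaining input characters */
static enum kind t;         /* current lookahead token */
static int ok;              /* cleared on the first syntax error */

/* Toy lexer (an assumption): every token is a single letter. */
static enum kind getToken(void) {
    static const char letters[] = "xyuvtmwz";
    if (*input == '\0') return ID_EOF;
    const char *p = strchr(letters, *input++);
    return p ? (enum kind)(p - letters) : ID_BAD;
}

static void error(const char *msg) {
    if (ok) fprintf(stderr, "syntax error: %s\n", msg);
    ok = 0;
}

/* C ::= w | z */
static void parseC(void) {
    switch (t) {
    case ID_W: case ID_Z: t = getToken(); return;
    default: error("want 'w' or 'z'"); return;
    }
}

/* A ::= u C | v C */
static void parseA(void) {
    switch (t) {
    case ID_U: case ID_V: t = getToken(); parseC(); return;
    default: error("want 'u' or 'v'"); return;
    }
}

/* B ::= t C | m C */
static void parseB(void) {
    switch (t) {
    case ID_T: case ID_M: t = getToken(); parseC(); return;
    default: error("want 't' or 'm'"); return;
    }
}

/* S ::= x A | y B */
static void parseS(void) {
    switch (t) {
    case ID_X: t = getToken(); parseA(); return;
    case ID_Y: t = getToken(); parseB(); return;
    default: error("want 'x' or 'y'"); return;
    }
}

/* Returns 1 if the whole string derives from S, 0 otherwise. */
int parse(const char *s) {
    input = s;
    ok = 1;
    t = getToken();
    parseS();
    if (t != ID_EOF) error("trailing input");
    return ok;
}
```

Calling `parse("ymz")` follows the trace of slides 5 to 15 without any backtracking, because each alternative starts with a different terminal.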

20 Summary so Far
Recursive descent parsing is also called predictive parsing, or top-down parsing.
It is simple and efficient, and can be coded by hand quickly (see problem 2 in lab #4).
But the constraint is that not every grammar can be parsed by a recursive descent parser.
Example below.

21 Example
    S ::= x A | x B
    A ::= m C | m C
    B ::= m z | m z
    C ::= w | y
Input: x m z (derive me)
S -> x A or S -> x B?

22 Recursive Descent Algorithm?
    struct token t;     // recall the token module
    t = getToken ();    // and the lexer module

    void parseS ()
    {
      switch (t) {
      case ID_X:
        t = getToken ();
        ??????          // what code here?
        return;
      default:
        error ("….");
        return;
      }
    }
Grammar:
    S ::= x A | x B
    A ::= m C | m C
    B ::= m z | m z
    C ::= w | y

23 Another Example
    stm -> id = exp;
         | id = exp; stm
    exp -> exp + exp
         | exp - exp
         | num
         | id
         | (exp)
Input (derive me):
    x = 3+4;
    y = x-(1+2);
stm -> id = exp; or stm -> id = exp; stm?

24 Moral
We need a notion of what a production's RHS s could start with, where s \in (T\/N)*.
We call it the first (terminal) set, written F[s], for a string s \in (T\/N)*.
Next, we compute the first set F for given terminals or nonterminals.

25 First Set F
    F [S] = {x, y}
    F [A] = {u, v}
    F [B] = {t, m}
    F [C] = {w, z}
    // And generalize to strings
    F [x A] = {x}
    F [y B] = {y}
    F [u C] = {u}
    F [v C] = {v}
    …
Grammar:
    S ::= x A | y B
    A ::= u C | v C
    B ::= t C | m C
    C ::= w | z

26 First Set F
    F [S] = {x, y}
    F [A] = {u, v}
    F [B] = {t, m}
    F [C] = {w, z}
    // And generalize to strings
    F [x A] = {x}
    F [y B] = {y}
    F [u C] = {u}
    F [v C] = {v}
    …
    // Then why can this grammar be parsed?
    // For each nonterminal, the first sets of its alternatives are disjoint.
Grammar:
    S ::= x A | y B
    A ::= u C | v C
    B ::= t C | m C
    C ::= w | z

27 Parsing Table
         x        y        u        v        t        m        …
    S    S->x A   S->y B
    A                      A->u C   A->v C
    B                                        B->t C   B->m C
    C    …
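
A table like this can drive a parser directly, with an explicit stack instead of recursion. The following is a minimal sketch under assumptions: symbols are single characters (uppercase for nonterminals, lowercase for terminals), and the names `lookup` and `parseWithTable` are made up for this example.

```c
#include <ctype.h>
#include <string.h>

/* The parsing table: which right-hand side to use for a nonterminal
 * given the lookahead character; NULL marks an empty (error) cell. */
static const char *lookup(char nonterminal, char lookahead) {
    switch (nonterminal) {
    case 'S': return lookahead == 'x' ? "xA" : lookahead == 'y' ? "yB" : NULL;
    case 'A': return lookahead == 'u' ? "uC" : lookahead == 'v' ? "vC" : NULL;
    case 'B': return lookahead == 't' ? "tC" : lookahead == 'm' ? "mC" : NULL;
    case 'C': return lookahead == 'w' ? "w"  : lookahead == 'z' ? "z"  : NULL;
    default:  return NULL;
    }
}

/* Returns 1 if input derives from S, 0 otherwise. */
int parseWithTable(const char *input) {
    char stack[16];             /* ample here: at most two symbols are pending */
    int top = 0;
    stack[top++] = 'S';
    while (top > 0) {
        char sym = stack[--top];
        if (isupper((unsigned char)sym)) {
            /* Nonterminal: replace it by the RHS chosen by the table. */
            const char *rhs = lookup(sym, *input);
            if (rhs == NULL) return 0;
            for (size_t i = strlen(rhs); i > 0; i--)
                stack[top++] = rhs[i - 1];   /* push reversed, so rhs[0] is on top */
        } else {
            /* Terminal: must match the current input character. */
            if (*input != sym) return 0;
            input++;
        }
    }
    return *input == '\0';
}
```

For example, `parseWithTable("ymz")` expands S to `yB`, matches `y`, expands B to `mC`, matches `m`, expands C to `z`, and accepts.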

28 Example
    S ::= x A | x B
    A ::= m C | m C
    B ::= m z | m z
    C ::= w | y
    F [S] = ?
    F [A] = ?
    F [B] = ?
    F [C] = ?
Predictive parsing?

29 Another Example
    stm -> id = exp;
         | id = exp; stm
    exp -> exp + exp
         | exp - exp
         | num
         | id
         | (exp)
    F [stm] = ?
    F [exp] = ?
Predictive parsing?

30 Empty Production Rules
    Z ::= d | X Y Z
    Y ::= c | \eps
    X ::= Y | a
    F [Z] = ?
    F [Y] = ?
    F [X] = ?
Predictive parsing?

31 Algorithm #1: nullable
    // nullable[X]: whether X derives \eps or not
    // all initialized to false
    repeat
      for each production rule X -> Y1 Y2 … Yn
        if (nullable[Y1], …, nullable[Yn] are all true) or (n = 0)
          nullable[X] = true
    until nullable[] did not change

32 Example
    Z ::= d | X Y Z
    Y ::= c | \eps
    X ::= Y | a
    // initialization
    nullable[Z, Y, X] = false
    // round 1
    nullable[Z] = false
    nullable[Y] = true
    nullable[X] = true
    // round 2
    nullable[Z] = false
    nullable[Y] = true
    nullable[X] = true
    // finished!
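
The fixpoint loop of Algorithm #1 can be sketched in C for this grammar. The representation is an assumption made for illustration: every symbol is one character, an empty string stands for \eps, and `computeNullable` is a made-up name.

```c
#include <stdbool.h>
#include <string.h>

struct production { char lhs; const char *rhs; };   /* "" encodes \eps */

/* The grammar of the example, in slide order. */
static const struct production grammar[] = {
    {'Z', "d"}, {'Z', "XYZ"},
    {'Y', "c"}, {'Y', ""},
    {'X', "Y"}, {'X', "a"},
};
enum { NPROD = sizeof grammar / sizeof grammar[0] };

/* Fills nullable[], indexed by symbol character, and returns the number of
 * rounds the repeat-until loop ran (including the final round that changed nothing). */
int computeNullable(bool nullable[128]) {
    memset(nullable, 0, 128 * sizeof(bool));   /* all initialized to false */
    int rounds = 0;
    bool changed = true;
    while (changed) {
        changed = false;
        rounds++;
        for (int p = 0; p < NPROD; p++) {
            /* X is nullable if every Yi is nullable; vacuously true when n = 0. */
            bool all = true;
            for (const char *s = grammar[p].rhs; *s; s++)
                if (!nullable[(unsigned char)*s]) { all = false; break; }
            unsigned char x = (unsigned char)grammar[p].lhs;
            if (all && !nullable[x]) {
                nullable[x] = true;
                changed = true;
            }
        }
    }
    return rounds;
}
```

Terminals are never assigned, so they stay non-nullable. On this grammar the loop runs two rounds, matching the trace on the slide: round 1 marks Y and X, round 2 confirms that nothing else changes.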

33 Algorithm #2: First Set
    // F[X]: first terminals X could derive
    // all initialized to empty (for a terminal a, F[a] = {a})
    repeat
      for each production rule X -> Y1 Y2 … Yn
        for (i = 1 to n)
          if (nullable[Y1], …, nullable[Y_{i-1}] are all true) or (i = 1)
            F[X] = F[X] \/ F[Yi]
    until F[] did not change

34 Example
    Z ::= d | X Y Z
    Y ::= c | \eps
    X ::= Y | a
    // initialization
    F[Z, Y, X] = {}
    // round 1
    F[Z] = {d}
    F[Y] = {c}
    F[X] = {c, a}
    // round 2
    F[Z] = {d, c, a}
    F[Y] = {c}
    F[X] = {c, a}
    // round 3…
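
Algorithm #2 can be sketched the same way. The assumptions here are illustrative choices: symbols are single characters, a terminal set is a bitmask with one bit per lowercase letter, and nullable[] is computed in the same loop instead of in a separate pass.

```c
#include <ctype.h>
#include <stdbool.h>

struct production { char lhs; const char *rhs; };   /* "" encodes \eps */

static const struct production grammar[] = {
    {'Z', "d"}, {'Z', "XYZ"},
    {'Y', "c"}, {'Y', ""},
    {'X', "Y"}, {'X', "a"},
};
enum { NPROD = sizeof grammar / sizeof grammar[0] };

/* One bit per terminal 'a' .. 'z'. */
unsigned bit(char terminal) { return 1u << (terminal - 'a'); }

/* Fills nullable[] and first[], both indexed by symbol character. */
void computeFirst(bool nullable[128], unsigned first[128]) {
    for (int i = 0; i < 128; i++) {
        nullable[i] = false;
        first[i] = islower(i) ? bit((char)i) : 0;   /* F[a] = {a} for a terminal a */
    }
    bool changed = true;
    while (changed) {
        changed = false;
        for (int p = 0; p < NPROD; p++) {
            unsigned char x = (unsigned char)grammar[p].lhs;
            const char *s = grammar[p].rhs;
            unsigned f = first[x];
            /* Add F[Yi] for as long as Y1 ... Y_{i-1} are all nullable. */
            for (; *s; s++) {
                f |= first[(unsigned char)*s];
                if (!nullable[(unsigned char)*s]) break;
            }
            /* Reaching the end of the RHS means every Yi was nullable (or n = 0). */
            bool n = nullable[x] || *s == '\0';
            if (f != first[x] || n != nullable[x]) {
                first[x] = f;
                nullable[x] = n;
                changed = true;
            }
        }
    }
}
```

After the call, `first['Z']` holds the bits for d, c and a, `first['Y']` the bit for c, and `first['X']` the bits for c and a, which agrees with round 2 of the example.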

35 Pitfalls
    S ::= x A B
    A ::= y | \eps
    B ::= y | \eps
Try to calculate nullable[] and F[].
Try to derive "x y".
What's the problem?

36 Algorithm #3: Follow Set
    // W[X]: terminals that may follow X
    // all initialized to empty
    repeat
      for each production rule X -> Y1 Y2 … Yn
        for (i = 1 to n)
          if (nullable[Y_{i+1}], …, nullable[Yn] are all true) or (i = n)
            W[Yi] = W[Yi] \/ W[X]
          for (j = i+1 to n)
            if (nullable[Y_{i+1}], …, nullable[Y_{j-1}] are all true) or (i+1 = j)
              W[Yi] = W[Yi] \/ F[Yj]
    until W[] did not change

37 Example
    Z ::= d | X Y Z
    Y ::= c | \eps
    X ::= Y | a
    // initialization
    W[Z, Y, X] = {}
    // round 1
    W[Z] = {EOF}
    W[Y] = {d, c, a}
    W[X] = {c, d, a}
    // round 2
    W[Z] = {EOF}
    W[Y] = {d, c, a}
    W[X] = {d, c, a}
    // finished!
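
Algorithm #3 can be sketched along the same lines. This version makes several assumptions for illustration: single-character symbols, bitmask sets, a dedicated `END_OF_INPUT` bit standing for EOF that is seeded into the follow set of the start symbol Z, and nullable[], F[] and W[] all computed in one combined fixpoint loop.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

struct production { char lhs; const char *rhs; };   /* "" encodes \eps */

static const struct production grammar[] = {
    {'Z', "d"}, {'Z', "XYZ"},
    {'Y', "c"}, {'Y', ""},
    {'X', "Y"}, {'X', "a"},
};
enum { NPROD = sizeof grammar / sizeof grammar[0], END_OF_INPUT = 1 << 26 };

unsigned bit(char terminal) { return 1u << (terminal - 'a'); }

/* True if rhs[from] ... rhs[to-1] are all nullable (true for an empty range). */
static bool allNullable(const bool nullable[128], const char *rhs, int from, int to) {
    for (int k = from; k < to; k++)
        if (!nullable[(unsigned char)rhs[k]]) return false;
    return true;
}

void computeSets(bool nullable[128], unsigned first[128], unsigned follow[128]) {
    for (int i = 0; i < 128; i++) {
        nullable[i] = false;
        first[i] = islower(i) ? bit((char)i) : 0;
        follow[i] = 0;
    }
    follow['Z'] = END_OF_INPUT;   /* assumption: the start symbol is followed by EOF */
    bool changed = true;
    while (changed) {
        changed = false;
        for (int p = 0; p < NPROD; p++) {
            unsigned char x = (unsigned char)grammar[p].lhs;
            const char *rhs = grammar[p].rhs;
            int n = (int)strlen(rhs);
            bool nul = nullable[x] || allNullable(nullable, rhs, 0, n);
            unsigned f = first[x];
            for (int i = 0; i < n; i++) {
                unsigned char yi = (unsigned char)rhs[i];
                unsigned w = follow[yi];
                if (allNullable(nullable, rhs, 0, i)) f |= first[yi];
                /* Everything after Yi can vanish: what follows X follows Yi. */
                if (allNullable(nullable, rhs, i + 1, n)) w |= follow[x];
                /* Everything between Yi and Yj can vanish: F[Yj] follows Yi. */
                for (int j = i + 1; j < n; j++)
                    if (allNullable(nullable, rhs, i + 1, j))
                        w |= first[(unsigned char)rhs[j]];
                if (w != follow[yi]) { follow[yi] = w; changed = true; }
            }
            if (nul != nullable[x] || f != first[x]) {
                nullable[x] = nul;
                first[x] = f;
                changed = true;
            }
        }
    }
}
```

The loop also fills follow entries for terminals, which are simply ignored. For the nonterminals it yields W[Z] = {EOF} and W[Y] = W[X] = {d, c, a}, as in the example.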

38 First and Follow Together
    Z ::= d | X Y Z
    Y ::= c | \eps
    X ::= Y | a
    // first
    F[Z] = {d, c, a}
    F[Y] = {c}
    F[X] = {c, a}
    // follow
    W[Z] = {EOF}
    W[Y] = {d, c, a}
    W[X] = {d, c, a}
    // first and follow together
    All[X] = F[X],           if nullable[X] = false
             F[X] \/ W[X],   otherwise

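39 Parsing Table
         a            c            d
    Z    Z->X Y Z     Z->X Y Z     Z->d
                                   Z->X Y Z
    Y    Y->\eps      Y->c         Y->\eps
                      Y->\eps
    X    X->Y         X->Y         X->Y
         X->a
Several cells hold two productions, so this grammar cannot be parsed predictively.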

40 LL(1) Grammar
Grammars whose predictive parsing tables contain no duplicate entries are called LL(1):
left-to-right parse, leftmost derivation, 1-symbol lookahead.
One pass, no backtracking, very efficient, precise error reports.
Some non-LL(1) grammars can be transformed into LL(1) form by standard methods.
Example below.

41 Eliminating Left Recursion
    X -> X a | c
    // General transforming rules:
    X -> X α1 | … | X αm | β1 | … | βn
    // to
    X  -> β1 X' | … | βn X'
    X' -> α1 X' | … | αm X' | \eps
    // Applied to the example:
    X  -> c X'
    X' -> a X' | \eps
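
Once the left recursion is gone, the grammar maps directly onto parsing functions. A minimal sketch, assuming single-character tokens and the made-up names `parseXPrime`, `parseX` and `matchesX`:

```c
#include <stdbool.h>

static const char *cursor;   /* next unread input character */

/* X' -> a X' | \eps */
static void parseXPrime(void) {
    if (*cursor == 'a') {
        cursor++;
        parseXPrime();
    }
    /* otherwise take the \eps alternative and consume nothing */
}

/* X -> c X' */
static bool parseX(void) {
    if (*cursor != 'c') return false;
    cursor++;
    parseXPrime();
    return true;
}

/* True if the whole string derives from X, i.e. it is a 'c' followed by any number of 'a's. */
bool matchesX(const char *s) {
    cursor = s;
    return parseX() && *cursor == '\0';
}
```

A function written for the original rule X -> X a would call itself before consuming any input and never terminate; the transformed rules always consume a token before recursing.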

42 Left Factoring
    S -> if (E) then S else S;
       | if (E) then S;
    // General rules:
    X -> α X1 | … | α Xn
    // to
    X  -> α X'
    X' -> X1 | … | Xn
    // Applied to the example:
    S  -> if (E) then S S'
    S' -> else S; | ;

43 Eliminating Ambiguity
In programming language syntax, ambiguity often arises from missing operator precedence or associativity:
Does * have higher precedence than +?
Are * and + left associative?
Ambiguous grammars are hard to use as language syntax.

44 Ambiguous grammars
A grammar is ambiguous if there is a sentence with more than one parse tree.
[Figure: two parse trees for 15 - 3 - 4 under E -> E - E; one groups it as 15 - (3 - 4), the other as (15 - 3) - 4.]

45 Association
    exp -> exp - exp
         | num
    // Derivation #1:
    exp -> exp - exp -> 15 - exp -> 15 - exp - exp -> 15 - 3 - exp -> 15 - 3 - 4
    // Rewritten grammar:
    exp -> num X
    X   -> - num X | \eps
    // Derivation #2:
    exp -> 15 X -> 15 - 3 X -> 15 - 3 - 4 X -> 15 - 3 - 4

46 Precedence
    exp -> exp - exp
         | exp * exp
         | num
    // Rewritten grammar:
    exp -> num X
    X   -> - num X | * num X | \eps
    // But the derivation of 3-4*5:
    exp -> 3 X -> 3 - 4 X -> 3 - 4 * 5 X -> 3 - 4 * 5 \eps -> 3 - 4 * 5
    // What's the problem?

47 Precedence
    exp -> exp - exp
         | exp * exp
         | num
    // Rewritten grammar:
    exp  -> term X
    X    -> - term X | \eps
    term -> num Y
    Y    -> * num Y | \eps
    // The derivation of 3-4*5:
    exp -> term X
        -> 3 Y X
        -> 3 \eps X
        -> 3 X
        -> 3 - term X
        -> 3 - 4 Y X
        -> 3 - 4 * 5 Y X
        -> 3 - 4 * 5 \eps X
        -> 3 - 4 * 5 X
        -> 3 - 4 * 5 \eps
        -> 3 - 4 * 5
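
The effect of the layered grammar is easiest to see by evaluating while parsing. The sketch below is an illustration under assumptions, not the course's code: it reads characters directly instead of tokens, does no error handling, unrolls the tail rules X and Y into loops, and uses the made-up name `evaluate`.

```c
#include <ctype.h>

static const char *p;   /* next unread input character */

static void skipSpaces(void) { while (*p == ' ') p++; }

/* num: a run of decimal digits, with optional surrounding spaces */
static long parseNum(void) {
    long v = 0;
    skipSpaces();
    while (isdigit((unsigned char)*p)) v = v * 10 + (*p++ - '0');
    skipSpaces();
    return v;
}

/* term -> num Y ; Y -> * num Y | \eps */
static long parseTerm(void) {
    long v = parseNum();
    while (*p == '*') { p++; v *= parseNum(); }
    return v;
}

/* exp -> term X ; X -> - term X | \eps */
static long parseExp(void) {
    long v = parseTerm();
    while (*p == '-') { p++; v -= parseTerm(); }   /* accumulate left to right */
    return v;
}

long evaluate(const char *s) {
    p = s;
    return parseExp();
}
```

Because `*` is handled one level below `-`, `evaluate("3-4*5")` computes 3 - (4 * 5) = -17, and because each loop folds its operands from the left, `evaluate("15-3-4")` computes (15 - 3) - 4 = 8.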

48 Parsing Function
    // Given production: X -> s
    // Parsing function has the form:
    void parseX ()
    {
      trans (s);
    }
    // case analysis on possible shape of s:
    // 1. s == a; for some terminal a
    trans (a) =
      if (t == a)
        t = nextToken ();
      else error ("syntax error: expecting: a");

49 Parsing Function
    // Given production: X -> s, with parseX () as on slide 48
    // 2. s == Y; for some nonterminal Y
    trans (Y) = parseY ()

50 Parsing Function
    // 3. s == s1 s2 … sn
    trans (s1 s2 … sn) =
      trans (s1)
      trans (s2)
      …
      trans (sn)

51 Parsing Function
    // 4. s == s1 | s2 | … | sn
    if (t \in All[s1])
      trans (s1)
    else if (t \in All[s2])
      trans (s2)
    …
    else if (t \in All[sn])
      trans (sn)
    else error ("syntax error: …");

52 Parsing Function
    // 5. s == [s']   (optional)
    if (t \in All[s'])
      trans (s')

53 Parsing Function
    // 6. s == {s'}   (zero or more repetitions)
    while (t \in All[s'])
      trans (s')

54 Parsing Function Generating Trees
    // We only analyze case 3, others are similar:
    // 3. s == s1 s2 … sn
    S parseS ()
    {
      S1 v1 = parseS1 ();
      S2 v2 = parseS2 ();
      …;
      Sn vn = parseSn ();
      return newS (v1, …, vn);
    }
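
As a concrete instance of a parsing function that returns a tree, here is a sketch for the subtraction grammar exp -> num X, X -> - num X | \eps. The `struct Exp` layout and the names `newExp`, `parseExp` and `freeExp` are assumptions made for this example; the input is read character by character and errors are not handled.

```c
#include <ctype.h>
#include <stdlib.h>

struct Exp {
    char op;                    /* '-' for a subtraction node, 0 for a number leaf */
    long value;                 /* used by leaves only */
    struct Exp *left, *right;   /* used by subtraction nodes only */
};

static struct Exp *newExp(char op, long value, struct Exp *left, struct Exp *right) {
    struct Exp *e = malloc(sizeof *e);
    if (e == NULL) abort();
    e->op = op;
    e->value = value;
    e->left = left;
    e->right = right;
    return e;
}

static const char *p;   /* next unread input character */

static struct Exp *parseNum(void) {
    long v = 0;
    while (isdigit((unsigned char)*p)) v = v * 10 + (*p++ - '0');
    return newExp(0, v, NULL, NULL);
}

/* exp -> num X ; X -> - num X | \eps, with X unrolled into a loop.
 * Each new node takes the tree built so far as its left child,
 * which makes subtraction left associative. */
struct Exp *parseExp(const char *s) {
    p = s;
    struct Exp *tree = parseNum();
    while (*p == '-') {
        p++;
        tree = newExp('-', 0, tree, parseNum());
    }
    return tree;
}

void freeExp(struct Exp *e) {
    if (e == NULL) return;
    freeExp(e->left);
    freeExp(e->right);
    free(e);
}
```

For the input `15-3-4` the root is a subtraction whose right child is the leaf 4 and whose left child is the subtree for 15 - 3, that is, the (15 - 3) - 4 grouping.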

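55 David Cutler
"Programming is absolutely the strangest thing you will ever encounter. Once the code is written, you are convinced you are right, yet the result of running it proves you wrong. You are wrong because you have always been wrong; programming simply makes you realize it for the first time."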

56 Automatic Tools
[Diagram labels: specification, Yacc, parser, semantic analyzer]
Yacc was originally developed for C, and now almost every mainstream language has its own Yacc-like tool: bison (C), ml-yacc (SML), Cup (Java), GPPG (C#), …

