Syntactic Analysis Natawut Nupairoj, Ph.D. Department of Computer Engineering Chulalongkorn University.


1 Syntactic Analysis Natawut Nupairoj, Ph.D. Department of Computer Engineering Chulalongkorn University

2 Outline Overview. Bottom-Up Parsing. LR Parsing. Examples.

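3 Front-End Components
[Figure: front-end pipeline. Source program (text stream) → Scanner → Parser → Semantic Analyzer → Intermediate Representation (file or in memory), with a Symbol Table.]
Scanner: groups characters into tokens. The parser asks for the next token and the scanner returns it. For the input main(){ the first tokens are the identifier main and the symbol (.
Parser: constructs the parse tree and passes it to the semantic analyzer.
Semantic Analyzer: checks semantic/contextual constraints.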

4 Parsing Techniques
Top-down parsing
- LL(1) grammars
- Left-to-right scanning, Leftmost derivation, 1 symbol of lookahead
Bottom-up parsing
- LR(k) grammars
- Left-to-right scanning, Rightmost derivation, k symbols of lookahead

5 Top-Down Parsing
Recursive Descent Parser
- simple top-down parser with backtracking
Predictive Parser
- non-backtracking
- uses FIRST and FOLLOW sets
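The FIRST and FOLLOW sets mentioned above are usually computed by iterating to a fixed point. The following is a minimal sketch of that computation, not code from the lecture. It uses the expression grammar from the later examples as sample input, and the names `GRAMMAR`, `compute_first`, `compute_follow`, and the `ε` marker are illustrative assumptions. The grammar is left-recursive, so it is not LL(1), but the sets are still well defined.

```python
EPS = "ε"  # marker for the empty string (assumed notation)
END = "$"  # end-of-input marker

# Sample grammar: nonterminal -> list of production bodies.
GRAMMAR = {
    "E": [["E", "+", "T"], ["T"]],
    "T": [["T", "*", "F"], ["F"]],
    "F": [["(", "E", ")"], ["id"]],
}

def first_of_sequence(symbols, first):
    """FIRST of a sequence of grammar symbols."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in first else {sym}  # a terminal starts with itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)  # every symbol can vanish, or the sequence is empty
    return result

def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:  # repeat until no set grows
        changed = False
        for nt, bodies in grammar.items():
            for body in bodies:
                new = first_of_sequence(body, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

def compute_follow(grammar, start, first):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)
    changed = True
    while changed:
        changed = False
        for nt, bodies in grammar.items():
            for body in bodies:
                for i, sym in enumerate(body):
                    if sym not in grammar:
                        continue  # FOLLOW is defined for nonterminals only
                    rest = first_of_sequence(body[i + 1:], first)
                    new = rest - {EPS}
                    if EPS in rest:  # sym can end the body: inherit FOLLOW(nt)
                        new |= follow[nt]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

FIRST = compute_first(GRAMMAR)
FOLLOW = compute_follow(GRAMMAR, "E", FIRST)
print(FIRST)
print(FOLLOW)
```

An ε-production would be written as an empty body `[]`. The sample grammar has none, so the `ε` handling is present only for generality.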

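6 Basic Terminologies
Sentential form: any string of terminals and nonterminals that can be derived from the start symbol.
Left (right) sentential form: a sentential form that occurs in a leftmost (rightmost) derivation.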

7 Example: Left Sentential Form
(1) E → E + T
(2) E → T
(3) T → T * F
(4) T → F
(5) F → (E)
(6) F → id
Input: m + n * k, which the scanner delivers as id + id * id
Leftmost derivation:
E => E + T => T + T => F + T => id + T => id + T * F => id + F * F => id + id * F => id + id * id

8 Example: Right Sentential Form
(1) E → E + T
(2) E → T
(3) T → T * F
(4) T → F
(5) F → (E)
(6) F → id
Rightmost derivation:
E => E + T => E + T * F => E + T * id => E + F * id => E + id * id => T + id * id => F + id * id => id + id * id
A rightmost derivation is also called a “canonical derivation”.
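The two derivations differ only in which nonterminal is expanded at each step. As a rough illustration, the sketch below replays a list of production numbers and always expands either the leftmost or the rightmost nonterminal. The function name `derive` and the data layout are assumptions made for this example.

```python
# Productions numbered as on the slides: number -> (head, body).
PRODUCTIONS = {
    1: ("E", ["E", "+", "T"]),
    2: ("E", ["T"]),
    3: ("T", ["T", "*", "F"]),
    4: ("T", ["F"]),
    5: ("F", ["(", "E", ")"]),
    6: ("F", ["id"]),
}
NONTERMINALS = {"E", "T", "F"}

def derive(start, rule_numbers, rightmost=False):
    """Apply the rules in order, each to the leftmost (or rightmost)
    nonterminal, and return every sentential form along the way."""
    form = [start]
    steps = [" ".join(form)]
    for number in rule_numbers:
        head, body = PRODUCTIONS[number]
        positions = [i for i, sym in enumerate(form) if sym in NONTERMINALS]
        if not positions:
            raise ValueError("no nonterminal left to expand")
        pos = positions[-1] if rightmost else positions[0]
        if form[pos] != head:
            raise ValueError(f"rule {number} does not apply to {form[pos]}")
        form = form[:pos] + body + form[pos + 1:]
        steps.append(" ".join(form))
    return steps

# Both derivations produce id + id * id, one production per step.
print(" => ".join(derive("E", [1, 2, 4, 6, 3, 4, 6, 6])))
print(" => ".join(derive("E", [1, 3, 6, 4, 6, 2, 4, 6], rightmost=True)))
```

Reading the rightmost rule sequence backwards gives the order in which a bottom-up parser performs its reductions.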

9 Bottom-Up Parsing
Start from the bottom of the parse tree (the input terminals) and reduce repeatedly until only the start symbol remains.
Characteristics
- Rightmost derivation in reverse.
- Find the “handle” and reduce.

10 Rightmost Derivation in Reverse
(1) E → E + T
(2) E → T
(3) T → T * F
(4) T → F
(5) F → (E)
(6) F → id
E => E + T => E + T * F => E + T * id => E + F * id => E + id * id => T + id * id => F + id * id => id + id * id
During parsing, these forms appear in the opposite order: id + id * id, then F + id * id, then T + id * id, and so on.

11 Basic Terminologies
Handle
- A substring that matches the right side of a production,
- and whose reduction (with that production) will eventually lead to the start symbol.

12 Example: Handle
(1) E → E + T
(2) E → T
(3) T → T * F
(4) T → F
(5) F → (E)
(6) F → id
E => E + T => E + T * F => E + T * id => E + id * id => T + id * id => F + id * id => id + id * id
[Figure labels on the derivation: “Not a handle”, “Handle”]
Note: in a right-sentential form, the string to the right of a handle contains only terminals.

13 Shift-Reduce Parsing
- Shift: push input symbols onto the stack.
- Reduce: replace the handle on top of the stack with a nonterminal.
- Goal: reduce the whole input to the start symbol.

14 Model of Shift-Reduce Parsing
Stack + input = current right-sentential form.
Locate the handle during parsing:
- shift zero or more input symbols onto the stack until a handle β is on top of the stack.
Replace the handle with the proper nonterminal (handle pruning):
- reduce β to A, where A → β

15 Example
Stack + input          Action(s)
$ id + id * id $       Shift, Reduce (F → id)
$ F + id * id $        Reduce (T → F)
$ T + id * id $        Reduce (E → T)
$ E + id * id $        Shift, Reduce (F → id)
$ E + F * id $         Reduce (T → F)
$ E + T * id $         Shift, Reduce (F → id)
$ E + T * F $          Reduce (T → T * F)
$ E + T $              Reduce (E → E + T)
$ E $                  Accept
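The trace above can be replayed mechanically. This sketch is an illustration only: the action sequence is written out by hand, because choosing between shift and reduce is exactly what the LR tables decide. The name `replay` and the action encoding are assumptions of this example.

```python
def replay(tokens, actions):
    """Replay shift/reduce actions and return (stack, remaining input)
    after each step. A reduce is given as (head, body)."""
    stack = []
    remaining = list(tokens)
    trace = []
    for action in actions:
        if action == "shift":
            if not remaining:
                raise ValueError("nothing left to shift")
            stack.append(remaining.pop(0))
        else:
            head, body = action
            start = len(stack) - len(body)
            # The handle must sit on top of the stack before it is pruned.
            if start < 0 or stack[start:] != body:
                raise ValueError(f"handle {body} is not on top of the stack")
            del stack[start:]
            stack.append(head)
        trace.append((" ".join(stack), " ".join(remaining)))
    return trace

ACTIONS = [
    "shift", ("F", ["id"]), ("T", ["F"]), ("E", ["T"]),
    "shift", "shift", ("F", ["id"]), ("T", ["F"]),
    "shift", "shift", ("F", ["id"]),
    ("T", ["T", "*", "F"]), ("E", ["E", "+", "T"]),
]

for stack, remaining in replay("id + id * id".split(), ACTIONS):
    print(f"$ {stack:<12} | {remaining} $")
```

Each printed row is one configuration. Joining the stack and the remaining input gives the current right-sentential form, and the last row leaves only E on the stack with no input left, which corresponds to Accept.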

16 LR Parsing Algorithms
Use the grammar to construct a parsing table.
Three techniques:
- Simple LR (SLR)
- Canonical LR (LR)
- Look-Ahead LR (LALR)
All three use the same parsing algorithm; they differ only in how the parsing table is constructed.

17 Model of an LR Parser

18 LR Parsing Tables
Two tables: action and goto.
action[s_m, a_i] is one of:
- shift s
- reduce A → β
- accept
- error
goto[s_m, X_i] = target state
(If action[s_m, a_i] = shift s, then goto[s_m, a_i] = s.)

19 Example: Action Table
State   id    +     …     $
0       s
1             s           a
2             r           r
3             r           r
...
(s = shift, r = reduce, a = accept, blank = error)
Example:
- Input = id, state = 0: next action = Shift
- Input = +, state = 3: next action = Reduce
- Input = $, state = 1: next action = Accept
- Input = id, state = 1: next action = Error

20 Configuration
Stack contents and unread input:
(s_0 X_1 s_1 X_2 s_2 … X_m s_m, a_i a_{i+1} … a_n $)
This represents the right-sentential form:
X_1 X_2 … X_m a_i a_{i+1} … a_n

21 LR Parser Movements
If action[s_m, a_i] = shift s, shift move:
(s_0 X_1 s_1 X_2 s_2 … X_m s_m, a_i a_{i+1} … a_n $)
becomes
(s_0 X_1 s_1 X_2 s_2 … X_m s_m a_i s, a_{i+1} … a_n $)

22 LR Parser Movements
If action[s_m, a_i] = reduce A → β, reduce move:
(s_0 X_1 s_1 X_2 s_2 … X_m s_m, a_i a_{i+1} … a_n $)
becomes
(s_0 X_1 s_1 X_2 s_2 … X_{m-r} s_{m-r} A s, a_i a_{i+1} … a_n $)
where r = |β| and s = goto[s_{m-r}, A]

23 LR Parser Movements
If action[s_m, a_i] = accept, parsing is complete.
If action[s_m, a_i] = error, the parser reports a syntax error.

24 Example
(1) E → E + T
(2) E → T
(3) T → T * F
(4) T → F
(5) F → (E)
(6) F → id

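25 Parsing table for the grammar above (sN = shift and go to state N, rN = reduce by production N, acc = accept, blank = error)
State   id    +     *     (     )     $     E     T     F
0       s5                s4                1     2     3
1             s6                      acc
2             r2    s7          r2    r2
3             r4    r4          r4    r4
4       s5                s4                8     2     3
5             r6    r6          r6    r6
6       s5                s4                      9     3
7       s5                s4                            10
8             s6                s11
9             r1    s7          r1    r1
10            r3    r3          r3    r3
11            r5    r5          r5    r5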
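The shift, reduce, accept, and error moves can be combined into a small table-driven loop. The sketch below is one possible rendering, not the lecture's code. It encodes the parsing table for the expression grammar as Python dictionaries, using the entries of the standard SLR table for this grammar, and it keeps only states on the stack, since the grammar symbols are implied by the states. The names `ACTION`, `GOTO`, and `parse` are assumptions.

```python
S, R = "shift", "reduce"

# Production number -> (head, length of body), numbered as on the slides.
PRODUCTIONS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3),
               4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}

ACTION = {
    0: {"id": (S, 5), "(": (S, 4)},
    1: {"+": (S, 6), "$": ("accept", None)},
    2: {"+": (R, 2), "*": (S, 7), ")": (R, 2), "$": (R, 2)},
    3: {"+": (R, 4), "*": (R, 4), ")": (R, 4), "$": (R, 4)},
    4: {"id": (S, 5), "(": (S, 4)},
    5: {"+": (R, 6), "*": (R, 6), ")": (R, 6), "$": (R, 6)},
    6: {"id": (S, 5), "(": (S, 4)},
    7: {"id": (S, 5), "(": (S, 4)},
    8: {"+": (S, 6), ")": (S, 11)},
    9: {"+": (R, 1), "*": (S, 7), ")": (R, 1), "$": (R, 1)},
    10: {"+": (R, 3), "*": (R, 3), ")": (R, 3), "$": (R, 3)},
    11: {"+": (R, 5), "*": (R, 5), ")": (R, 5), "$": (R, 5)},
}
GOTO = {
    0: {"E": 1, "T": 2, "F": 3},
    4: {"E": 8, "T": 2, "F": 3},
    6: {"T": 9, "F": 3},
    7: {"F": 10},
}

def parse(tokens):
    """Return the production numbers in the order they are reduced."""
    stack = [0]                      # state stack, starting in state 0
    tokens = list(tokens) + ["$"]    # append the end marker
    pos = 0
    reductions = []
    while True:
        state, lookahead = stack[-1], tokens[pos]
        kind, arg = ACTION[state].get(lookahead, ("error", None))
        if kind == S:
            stack.append(arg)        # push the target state, consume input
            pos += 1
        elif kind == R:
            head, length = PRODUCTIONS[arg]
            del stack[-length:]      # pop one state per body symbol
            stack.append(GOTO[stack[-1]][head])
            reductions.append(arg)
        elif kind == "accept":
            return reductions
        else:                        # blank table entry
            raise SyntaxError(f"unexpected {lookahead!r} in state {state}")

print(parse("id + id * id".split()))
```

The returned list is the rightmost derivation read backwards, which is the sense in which bottom-up parsing traces a rightmost derivation in reverse.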

26 Conflicts
The parser cannot decide:
- shift/reduce conflict: it can either shift or reduce.
- reduce/reduce conflict: more than one production is eligible for reduction.
Conflicts usually indicate an ambiguous or non-LR grammar.

27 Example: Shift/Reduce Conflict
stmt → if expr then stmt
     | if expr then stmt else stmt
     | ...
STACK: $ … if expr then stmt
INPUT: else … $

28 Example: Reduce/Reduce Conflict
E → T | F
T → id
F → id
STACK: $ id
INPUT: … $
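One way to see this conflict without running a parser is to build the LR(0) item sets and look for a state that contains two completed items. The sketch below does that for the grammar above. The augmented production S' → E and all function names are additions made for this illustration, and the check is deliberately limited to reduce/reduce conflicts at the LR(0) level.

```python
# Grammar as (head, body) pairs; production 0 is the added start production.
GRAMMAR = [
    ("S'", ("E",)),
    ("E", ("T",)),
    ("E", ("F",)),
    ("T", ("id",)),
    ("F", ("id",)),
]
NONTERMINALS = {head for head, _ in GRAMMAR}

def closure(items):
    """An item is (production index, dot position)."""
    items = set(items)
    work = list(items)
    while work:
        prod, dot = work.pop()
        body = GRAMMAR[prod][1]
        if dot < len(body) and body[dot] in NONTERMINALS:
            for i, (head, _) in enumerate(GRAMMAR):
                if head == body[dot] and (i, 0) not in items:
                    items.add((i, 0))
                    work.append((i, 0))
    return frozenset(items)

def goto(items, symbol):
    """Move the dot over `symbol` in every item where that is possible."""
    moved = {(p, d + 1) for p, d in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)

def item_sets():
    start = closure({(0, 0)})
    states, work = {start}, [start]
    while work:
        state = work.pop()
        symbols = {GRAMMAR[p][1][d] for p, d in state if d < len(GRAMMAR[p][1])}
        for sym in symbols:
            nxt = goto(state, sym)
            if nxt not in states:
                states.add(nxt)
                work.append(nxt)
    return states

def reduce_reduce_conflicts(states):
    """List the states in which more than one production is complete."""
    conflicts = []
    for state in states:
        complete = sorted(p for p, d in state
                          if d == len(GRAMMAR[p][1]) and p != 0)
        if len(complete) > 1:
            conflicts.append([GRAMMAR[p] for p in complete])
    return conflicts

print(reduce_reduce_conflicts(item_sets()))
```

The only conflicting state is the one reached after shifting id, where both T → id and F → id are complete. Because this grammar is ambiguous (id has two parse trees), extra lookahead does not remove the conflict.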

