Lab 3: Using ML-Yacc Zhong Zhuang

1 Lab 3: Using ML-Yacc Zhong Zhuang dyzz@mail.ustc.edu.cn

2 How to write a parser?
- Write a parser by hand
- Use a parser generator
  - May not be as efficient as a hand-written parser
  - General and robust
- How does it work?
  - Parser specification -> parser generator -> parser
  - Stream of tokens -> parser -> abstract syntax

3 ML-Yacc specification
- Three parts again, separated by %%

    User declarations: declare values available in the rule actions
    %%
    ML-Yacc definitions: declare terminals and nonterminals; special declarations to resolve conflicts
    %%
    Rules: parser specified by CFG rules and associated semantic actions that generate abstract syntax

4 ML-Yacc Definitions
- Specify the type of positions

    %pos int * int

- Specify terminal and nonterminal symbols

    %term IF | THEN | ELSE | PLUS | MINUS ...
    %nonterm prog | exp | op

- Specify the end-of-parse token

    %eop EOF

- Specify the start symbol (by default, the nonterminal on the LHS of the first rule)

    %start prog

5 A Simple ML-Yacc File

    %%
    %term NUM | PLUS | MUL | LPAR | RPAR | EOF
    %nonterm exp | fact | base
    %pos int
    %start exp
    %eop EOF
    %%
    exp  : fact              ()
         | fact PLUS exp     ()
    fact : base              ()
         | base MUL fact     ()
    base : NUM               ()
         | LPAR exp RPAR     ()

- The %term and %nonterm lines declare the grammar symbols
- The lines after the second %% are the grammar rules
- The parenthesized code after each rule is its semantic action (currently does nothing)

6
- Each nonterminal may have a semantic value associated with it
- When the parser reduces with (X ::= s)
  - a semantic action is executed
  - the action uses the semantic values of the symbols in s
- When parsing completes successfully
  - the parser returns the semantic value associated with the start symbol
  - usually a syntax tree

7
- To use semantic values during parsing, we must declare symbol types:

    %term NUM of int | PLUS | MUL | ...
    %nonterm exp of int | fact of int | base of int

- The type of a semantic action must match the type declared for the nonterminal on the left-hand side of its rule

8 A Simple ML-Yacc File with Actions

    %%
    %term NUM of int | PLUS | MUL | LPAR | RPAR | EOF
    %nonterm exp of int | fact of int | base of int
    %pos int
    %start exp
    %eop EOF
    %%
    exp  : fact              (fact)
         | fact PLUS exp     (fact + exp)
    fact : base              (base)
         | base MUL base     (base1 * base2)
    base : NUM               (NUM)
         | LPAR exp RPAR     (exp)

- The grammar symbols now carry type declarations
- The grammar rules now have semantic actions
- The semantic actions compute an integer result
- When a symbol occurs more than once on a right-hand side, a numeric suffix (base1, base2) selects the occurrence
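For comparison with the generated parser, the following is a minimal hand-written sketch (the first option on slide 2) that computes the same kind of integer result. It follows the right-recursive rules of slide 5, with `fact : base | base MUL fact`. The `token` datatype, the `ParseError` exception and the function names are assumptions made for this illustration; they are not what ML-Yacc generates.

```sml
(* Hypothetical token type; a real lexer would produce these. *)
datatype token = NUM of int | PLUS | MUL | LPAR | RPAR | EOF

exception ParseError of string

(* Each function takes the remaining tokens and returns
   (semantic value, tokens left over). *)

(* exp : fact | fact PLUS exp *)
fun exp ts =
  let val (f, rest) = fact ts
  in
    case rest of
        PLUS :: rest' =>
          let val (e, rest'') = exp rest' in (f + e, rest'') end
      | _ => (f, rest)
  end

(* fact : base | base MUL fact *)
and fact ts =
  let val (b, rest) = base ts
  in
    case rest of
        MUL :: rest' =>
          let val (f, rest'') = fact rest' in (b * f, rest'') end
      | _ => (b, rest)
  end

(* base : NUM | LPAR exp RPAR *)
and base (NUM n :: rest) = (n, rest)
  | base (LPAR :: rest) =
      (case exp rest of
           (e, RPAR :: rest') => (e, rest')
         | _ => raise ParseError "expected RPAR")
  | base _ = raise ParseError "expected NUM or LPAR"

(* Accept only input that ends with the end-of-parse token. *)
fun parse ts =
  case exp ts of
      (v, [EOF]) => v
    | _ => raise ParseError "unexpected trailing input"
```

Each function plays the role of one nonterminal, and the arithmetic in its body plays the role of the semantic action. For example, `parse [NUM 1, PLUS, NUM 2, MUL, NUM 3, EOF]` evaluates to 7.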

9 Conflicts in ML-Yacc
- We often write ambiguous grammars
- Example

    exp ::= NUM
          | exp PLUS exp
          | exp MUL exp
          | LPAR exp RPAR

  - Tokens from lexer: NUM PLUS NUM MUL NUM
  - State of parser: E+E
  - To be read: MUL NUM

10 Conflicts in ML-Yacc
- We often write ambiguous grammars
- Example

    exp ::= NUM
          | exp PLUS exp
          | exp MUL exp
          | LPAR exp RPAR

  - Tokens from lexer: NUM PLUS NUM MUL NUM
  - State of parser: E+E
  - To be read: MUL NUM
- If we shift:

    Shift     E+E*
    Shift     E+E*E
    Reduce    E+E
    Reduce    E

- Result is: E+(E*E)

11 Conflicts in ML-Yacc
- We often write ambiguous grammars
- Example

    exp ::= NUM
          | exp PLUS exp
          | exp MUL exp
          | LPAR exp RPAR

  - Tokens from lexer: NUM PLUS NUM MUL NUM
  - State of parser: E+E
  - To be read: MUL NUM
- If we reduce:

    Reduce    E
    Shift     E*
    Shift     E*E
    Reduce    E

- Result is: (E+E)*E

12
- This is a shift-reduce conflict
- We want E+(E*E), because "*" has higher precedence than "+"
- Another shift-reduce conflict
  - Tokens from lexer: NUM PLUS NUM PLUS NUM
  - State of parser: E+E
  - To be read: PLUS NUM
- If we shift, the result is E+(E+E):

    Shift     E+E+
    Shift     E+E+E
    Reduce    E+E
    Reduce    E

- If we reduce, the result is (E+E)+E:

    Reduce    E
    Shift     E+
    Shift     E+E
    Reduce    E
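The two traces can be replayed with a toy shift-reduce loop. This is only a sketch under simplifying assumptions: it handles just NUM, PLUS and MUL, it keeps a stack of strings instead of real parser states, and the names `item`, `Stuck` and `preferShift` are invented for the illustration. It is not how ML-Yacc's table-driven parser is implemented; it only shows that the choice made at the conflict decides the shape of the result.

```sml
datatype token = NUM of int | PLUS | MUL
datatype item = E of string | Op of string

exception Stuck

(* Reduce by E ::= E op E; the top of the stack is the head of the list. *)
fun reduce (E r :: Op oper :: E l :: rest) =
      E ("(" ^ l ^ oper ^ r ^ ")") :: rest
  | reduce _ = raise Stuck

(* Shifting a NUM immediately reduces it to E, as in the slides. *)
fun shift (NUM n, stack) = E (Int.toString n) :: stack
  | shift (PLUS, stack) = Op "+" :: stack
  | shift (MUL, stack) = Op "*" :: stack

(* preferShift decides what happens when the stack holds E op E and
   more input remains, which is exactly the shift-reduce conflict. *)
fun parse preferShift tokens =
  let
    fun loop ([E s], []) = s
      | loop (stack, []) = loop (reduce stack, [])
      | loop (stack as (E _ :: Op _ :: E _ :: _), t :: ts) =
          if preferShift then loop (shift (t, stack), ts)
          else loop (reduce stack, t :: ts)
      | loop (stack, t :: ts) = loop (shift (t, stack), ts)
  in
    loop ([], tokens)
  end

val shifted = parse true  [NUM 1, PLUS, NUM 2, MUL, NUM 3]
val reduced = parse false [NUM 1, PLUS, NUM 2, MUL, NUM 3]
```

Here `shifted` is `"(1+(2*3))"` and `reduced` is `"((1+2)*3)"`, matching the "if we shift" and "if we reduce" results.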

13 Deal with shift-reduce conflicts
- In this case we need to reduce, because "+" is left associative
- Deal with it:
  - Let ML-Yacc complain
    - its default choice is to shift when it encounters a shift-reduce conflict
    - BAD: programmer intentions unclear; harder to debug other parts of your grammar; generally inelegant
  - Rewrite the grammar to eliminate the ambiguity
    - can be complicated and less clear
  - Use Yacc precedence directives
    - %left, %right, %nonassoc

14 Precedence and Associativity
- The precedence of a terminal is based on the order in which the associativity directives are given: later directives have higher precedence
- The precedence of a rule is the precedence of its rightmost terminal
  - e.g. precedence of (E ::= E + E) == prec(+)
- A shift-reduce conflict is resolved as follows
  - prec(terminal) > prec(rule) ==> shift
  - prec(terminal) < prec(rule) ==> reduce
  - prec(terminal) = prec(rule) ==>
    - assoc(terminal) = left ==> reduce
    - assoc(terminal) = right ==> shift
    - assoc(terminal) = nonassoc ==> report as error
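The resolution table can be written down as a small function. This is a sketch of the rule as stated on the slide, not ML-Yacc's source code; the datatypes `assoc` and `action` and the integer precedence levels are assumptions chosen for the illustration.

```sml
datatype assoc = Left | Right | NonAssoc
datatype action = Shift | Reduce | Error

(* termPrec  : precedence of the terminal about to be read
   rulePrec  : precedence of the rule that could be reduced
   termAssoc : associativity declared for that terminal *)
fun resolve (termPrec : int, rulePrec : int, termAssoc : assoc) : action =
  case Int.compare (termPrec, rulePrec) of
      GREATER => Shift
    | LESS => Reduce
    | EQUAL =>
        (case termAssoc of
             Left => Reduce
           | Right => Shift
           | NonAssoc => Error)

(* Assumed levels: PLUS has precedence 1 and MUL has precedence 2. *)
val seeMulAfterPlus  = resolve (2, 1, Left)  (* stack E+E, next token MUL  *)
val seePlusAfterPlus = resolve (1, 1, Left)  (* stack E+E, next token PLUS *)
val seePlusAfterMul  = resolve (1, 2, Left)  (* stack E*E, next token PLUS *)
```

With these levels, `seeMulAfterPlus` is `Shift`, while `seePlusAfterPlus` and `seePlusAfterMul` are both `Reduce`.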

15

    datatype exp = Int of int
                 | Add of exp * exp
                 | Sub of exp * exp
                 | Mul of exp * exp
                 | Div of exp * exp
    %%
    %left PLUS MINUS
    %left MUL DIV
    %%
    exp : NUM              (Int NUM)
        | exp PLUS exp     (Add (exp1, exp2))
        | exp MINUS exp    (Sub (exp1, exp2))
        | exp MUL exp      (Mul (exp1, exp2))
        | exp DIV exp      (Div (exp1, exp2))
        | LPAR exp RPAR    (exp)

- %left MUL DIV is declared later, so MUL and DIV have higher precedence than PLUS and MINUS
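To see why the shape of the tree matters, the trees can be evaluated. The datatype below is the one from the slide; the `eval` function and the sample trees are additions for illustration, written by hand rather than produced by a parser.

```sml
datatype exp = Int of int
             | Add of exp * exp
             | Sub of exp * exp
             | Mul of exp * exp
             | Div of exp * exp

(* Hypothetical evaluator over the abstract syntax. *)
fun eval (Int n) = n
  | eval (Add (a, b)) = eval a + eval b
  | eval (Sub (a, b)) = eval a - eval b
  | eval (Mul (a, b)) = eval a * eval b
  | eval (Div (a, b)) = eval a div eval b

(* 1 + 2 * 3 with MUL binding tighter than PLUS *)
val precTree = Add (Int 1, Mul (Int 2, Int 3))

(* 8 - 3 - 2 with MINUS left associative *)
val leftTree = Sub (Sub (Int 8, Int 3), Int 2)

(* the tree a right-associative MINUS would build for 8 - 3 - 2 *)
val rightTree = Sub (Int 8, Sub (Int 3, Int 2))
```

`eval precTree` is 7 and `eval leftTree` is 3, whereas `eval rightTree` is 7, so a wrong associativity choice changes the meaning of the program, not just the look of the tree.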

16 Reduce-reduce Conflict
- This kind of conflict is more difficult to deal with
- Example

    sequence  ::= (empty)
                | maybeword
                | sequence word
    maybeword ::= (empty)
                | word

- When we get a "word" from the lexer, there are two derivations:
  - word -> maybeword -> sequence (rule 1)
  - empty -> sequence, then sequence word -> sequence (rule 2)
- We have more than one way to get "sequence" from the input "word"

17 Reduce-reduce Conflict
- A reduce-reduce conflict means that two or more rules apply to the same sequence of input. This usually indicates a serious error in the grammar.
- ML-Yacc reduces by the first rule
- Generally, reduce-reduce conflicts are not allowed in your ML-Yacc file
- We need to fix our grammar:

    sequence ::= (empty)
               | sequence word

18 Summary of conflicts
- Shift-reduce conflict
  - resolve with precedence and associativity
  - shift by default
- Reduce-reduce conflict
  - reduce by first rule
  - not allowed!

19 Lab3
- Your job is to finish a parser for the C language
  - Input: a ".c" file
  - Output: "Success!" if the ".c" file is correct
- File description
  - c.lex
  - c.grm
  - main.sml
  - call-main.sml
  - sources.cm
  - lab3.mlb
  - test.c

20 Using ML-Yacc
- Read the ML-Yacc Manual
- Run
  - Once you have finished "c.grm" and "c.lex", run on the command line (using MLton's tools):

    mlyacc c.grm
    mllex c.lex

  - This gives "c.grm.sig", "c.grm.sml", "c.grm.desc" and "c.lex.sml"
- Then compile Lab3
  - Start SML/NJ and run CM.make "sources.cm";
  - or on the command line: mlton lab3.mlb
- To run lab3
  - In SML/NJ: Main.parse "test.c";
  - or on the command line: lab3 test.c

21 "Debug" ML-Yacc File
- When you run mlyacc, you will see error messages if your ML-Yacc file has conflicts. For example:

    mlyacc c.grm
    2 shift/reduce conflicts

- Open the file "c.grm.desc" (this file is generated by mlyacc)
  - The beginning of this file lists the conflicts
  - The rest lists all the states
  - "rule 12" means rule number 12 in your ML-Yacc file, counting from 0

    2 shift/reduce conflicts
    error: state 0: shift/reduce conflict (shift MYSTRUCT, reduce by rule 12)
    error: state 1: shift/reduce conflict (shift MYSTRUCT, reduce by rule 12)

    state 0:
        prog : . structs vdecs preds funcs

        MYSTRUCT    shift 3
        prog        goto 429
        structs     goto 2
        structdec   goto 1
        .           reduce by rule 12

22 Use ML-Lex with ML-Yacc
- Most of the work in "c.lex" this time can be copied from Lab2
  - You can re-use the regular expressions and lexical rules
- Difference from Lab2
  - You have to define the tokens in "c.grm"

    %term INT of int | EOF

  - The "%term" declarations in "c.grm" automatically appear in "c.grm.sig":

    signature C_TOKENS =
    sig
      type ('a,'b) token
      type svalue
      val EOF: 'a * 'a -> (svalue,'a) token
      val INT: (int) * 'a * 'a -> (svalue,'a) token
    end
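The signature shows that every token constructor takes a left and a right position, and that a token declared with "of int" also takes its value first. The sketch below imitates that shape so it can be run on its own. The `svalue` and `token` datatypes, the `Tokens` structure and `lexInt` are stand-ins invented for this illustration; in the lab the real definitions come from the generated "c.grm.sml", and the real lexer action lives in "c.lex".

```sml
(* Hypothetical stand-ins for the types ML-Yacc generates. *)
datatype svalue = VOID | INTVAL of int
datatype ('a, 'b) token = TOKEN of string * 'a * 'b * 'b

structure Tokens =
struct
  (* Same argument shapes as EOF and INT in signature C_TOKENS. *)
  fun EOF (left, right) = TOKEN ("EOF", VOID, left, right)
  fun INT (n, left, right) = TOKEN ("INT", INTVAL n, left, right)
end

(* Roughly what a lexer action for a run of digits does:
   convert the matched text and attach the current position twice. *)
fun lexInt (yytext : string, pos : int) =
  case Int.fromString yytext of
      SOME n => Tokens.INT (n, pos, pos)
    | NONE => raise Fail ("not an integer: " ^ yytext)

val intToken = lexInt ("42", 7)
val eofToken = Tokens.EOF (9, 9)
```

The position type here is `int`, which corresponds to a grammar that declares `%pos int`.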

23 Hints
- Read the ML-Yacc Manual
- Read the language specification
- Test a lot!

