Lab 3: Using ML-Yacc Zhong Zhuang

How to write a parser?
- Write a parser by hand
- Use a parser generator
  - May not be as efficient as a hand-written parser
  - General and robust
- How does it work?
  - Parser specification -> parser generator -> parser
  - Stream of tokens -> parser -> abstract syntax

ML-Yacc specification
- Three parts again, separated by %%:

    User declarations: declare values available in the rule actions
    %%
    ML-Yacc definitions: declare terminals and nonterminals; special declarations to resolve conflicts
    %%
    Rules: the parser, specified by CFG rules and associated semantic actions that generate abstract syntax

ML-Yacc Definitions
- Specify the type of positions
    %pos int * int
- Specify terminal and nonterminal symbols
    %term IF | THEN | ELSE | PLUS | MINUS ...
    %nonterm prog | exp | op
- Specify the end-of-parse token
    %eop EOF
- Specify the start symbol (by default, the nonterminal on the LHS of the first rule)
    %start prog

A Simple ML-Yacc File

    %%
    %term NUM | PLUS | MUL | LPAR | RPAR | EOF
    %nonterm exp | fact | base
    %pos int
    %start exp
    %eop EOF
    %%
    exp  : fact ()
         | fact PLUS exp ()
    fact : base ()
         | base MUL fact ()
    base : NUM ()
         | LPAR exp RPAR ()

- The definitions section (between the %% lines) declares the grammar symbols
- The rules section lists the grammar rules; the semantic actions in parentheses currently do nothing

- Each nonterminal may have a semantic value associated with it
- When the parser reduces with (X ::= s)
  - a semantic action is executed
  - the action uses the semantic values of the symbols in s
- When parsing completes successfully
  - the parser returns the semantic value associated with the start symbol
  - usually a syntax tree

- To use semantic values during parsing, we must declare symbol types:
    %term NUM of int | PLUS | MUL | ...
    %nonterm exp of int | fact of int | base of int
- The type of a semantic action must match the type declared for the nonterminal on the left-hand side of its rule

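A Simple ML-Yacc File with Actions

    %%
    %term NUM of int | PLUS | MUL | LPAR | RPAR | EOF
    %nonterm exp of int | fact of int | base of int
    %pos int
    %start exp
    %eop EOF
    %%
    exp  : fact (fact)
         | fact PLUS exp (fact + exp)
    fact : base (base)
         | base MUL base (base1 * base2)
    base : NUM (NUM)
         | LPAR exp RPAR (exp)

- The definitions section declares the grammar symbols together with their types
- The rules section lists the grammar rules with semantic actions, which compute an integer result
- When a symbol occurs more than once on the right-hand side of a rule, its occurrences are named with numeric suffixes from left to right (base1, base2)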

Conflicts in ML-Yacc
- We often write ambiguous grammars
- Example

    exp ::= NUM | exp PLUS exp | exp MUL exp | LPAR exp RPAR

  - Tokens from lexer: NUM PLUS NUM MUL NUM
  - State of parser: E+E
  - To be read: MUL NUM

Conflicts in ML-Yacc: if we shift
- Grammar: exp ::= NUM | exp PLUS exp | exp MUL exp | LPAR exp RPAR
- Tokens from lexer: NUM PLUS NUM MUL NUM
- State of parser: E+E, with MUL NUM still to be read
- If we shift:
    Shift    E+E*
    Shift    E+E*E
    Reduce   E+E
    Reduce   E
- Result is: E+(E*E)

Conflicts in ML-Yacc: if we reduce
- Grammar: exp ::= NUM | exp PLUS exp | exp MUL exp | LPAR exp RPAR
- Tokens from lexer: NUM PLUS NUM MUL NUM
- State of parser: E+E, with MUL NUM still to be read
- If we reduce:
    Reduce   E
    Shift    E*
    Shift    E*E
    Reduce   E
- Result is: (E+E)*E

- This is a shift-reduce conflict
- We want E+(E*E), because "*" has higher precedence than "+"
- Another shift-reduce conflict
  - Tokens from lexer: NUM PLUS NUM PLUS NUM
  - State of parser: E+E, with PLUS NUM still to be read
  - If we shift:
      Shift    E+E+
      Shift    E+E+E
      Reduce   E+E
      Reduce   E
    Result is: E+(E+E)
  - If we reduce:
      Reduce   E
      Shift    E+
      Shift    E+E
      Reduce   E
    Result is: (E+E)+E
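The two choices build different trees, not two spellings of the same result. A minimal sketch in Standard ML, assuming a small hypothetical exp datatype and eval function that are not part of the lab files; Sub is included only because subtraction makes the associativity difference show up in the computed value:

```sml
(* Hypothetical AST and evaluator, for illustration only. *)
datatype exp = Int of int
             | Add of exp * exp
             | Sub of exp * exp
             | Mul of exp * exp

fun eval (Int n) = n
  | eval (Add (a, b)) = eval a + eval b
  | eval (Sub (a, b)) = eval a - eval b
  | eval (Mul (a, b)) = eval a * eval b

(* 1 + 2 * 3: shifting MUL gives E+(E*E); reducing first gives (E+E)*E. *)
val shiftTree  = Add (Int 1, Mul (Int 2, Int 3))
val reduceTree = Mul (Add (Int 1, Int 2), Int 3)

(* 5 - 3 - 1: reducing first gives the left-associative tree. *)
val leftAssoc  = Sub (Sub (Int 5, Int 3), Int 1)
val rightAssoc = Sub (Int 5, Sub (Int 3, Int 1))

val results = map eval [shiftTree, reduceTree, leftAssoc, rightAssoc]
```

Each pair of trees comes from the same token stream, so the parser's shift-or-reduce decision alone determines which value the program computes.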

Deal with shift-reduce conflicts
- In this case we need to reduce, because "+" is left associative
- Ways to deal with it:
  - Let ML-Yacc complain
    - its default choice is to shift when it encounters a shift-reduce conflict
    - BAD: programmer intentions unclear; harder to debug other parts of your grammar; generally inelegant
  - Rewrite the grammar to eliminate the ambiguity
    - can be complicated and less clear
  - Use Yacc precedence directives
    - %left, %right, %nonassoc

Precedence and Associativity
- The precedence of a terminal is based on the order in which the associativity declarations appear: later declarations have higher precedence
- The precedence of a rule is the precedence of its rightmost terminal
  - e.g. precedence of (E ::= E + E) == prec(+)
- A shift-reduce conflict is resolved as follows
  - prec(terminal) > prec(rule) ==> shift
  - prec(terminal) < prec(rule) ==> reduce
  - prec(terminal) = prec(rule) ==>
    - assoc(terminal) = left ==> reduce
    - assoc(terminal) = right ==> shift
    - assoc(terminal) = nonassoc ==> report as error
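The resolution rule can be written down as a small function. This is a sketch in Standard ML of the decision table only, not ML-Yacc's actual implementation; the names assoc, action and resolve, and the integer precedence levels 1 and 2, are assumptions made for the example:

```sml
datatype assoc = Left | Right | NonAssoc
datatype action = Shift | Reduce | Error

(* termPrec: precedence of the lookahead terminal.
   rulePrec: precedence of the rule that could be reduced. *)
fun resolve (termPrec : int, termAssoc : assoc, rulePrec : int) : action =
  if termPrec > rulePrec then Shift
  else if termPrec < rulePrec then Reduce
  else
    case termAssoc of
      Left => Reduce
    | Right => Shift
    | NonAssoc => Error

(* Assume "%left PLUS" is declared before "%left MUL",
   modelled here as prec(PLUS) = 1 and prec(MUL) = 2. *)
val onMulAfterPlus  = resolve (2, Left, 1)  (* stack E+E, lookahead MUL  *)
val onPlusAfterPlus = resolve (1, Left, 1)  (* stack E+E, lookahead PLUS *)
val onPlusAfterMul  = resolve (1, Left, 2)  (* stack E*E, lookahead PLUS *)
```

The first call shifts, which yields E+(E*E); the second reduces, which yields (E+E)+E; the third reduces so that the multiplication is grouped before the addition continues.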

    datatype exp = Int of int
                 | Add of exp * exp
                 | Sub of exp * exp
                 | Mul of exp * exp
                 | Div of exp * exp
    %%
    %left PLUS MINUS
    %left MUL DIV      (* declared later, so higher precedence *)
    %%
    exp : NUM              (Int NUM)
        | exp PLUS exp     (Add (exp1, exp2))
        | exp MINUS exp    (Sub (exp1, exp2))
        | exp MUL exp      (Mul (exp1, exp2))
        | exp DIV exp      (Div (exp1, exp2))
        | LPAR exp RPAR    (exp)

Reduce-reduce Conflict
- This kind of conflict is more difficult to deal with
- Example

    sequence  ::= (empty) | maybeword | sequence word
    maybeword ::= (empty) | word

- When we get a "word" from the lexer, there are two ways to reach "sequence":
  - word -> maybeword -> sequence (rule 1)
  - empty -> sequence, then sequence word -> sequence (rule 2)
- So there is more than one way to derive "sequence" from the input "word"

Reduce-reduce Conflict
- A reduce-reduce conflict means that two or more rules apply to the same sequence of input. This usually indicates a serious error in the grammar.
- ML-Yacc reduces by the first rule
- Generally, reduce-reduce conflicts are not allowed in your ML-Yacc file
- We need to fix our grammar:

    sequence ::= (empty) | sequence word
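As a sketch of how the repaired grammar might look as a complete specification, the file below collects the words into a list. The parser name Words, the terminal WORD of string, and the string list result type are assumptions made for illustration; they do not come from the lab files.

```sml
(* Hypothetical specification file, e.g. words.grm *)
%%
%name Words
%term WORD of string | EOF
%nonterm sequence of string list
%pos int
%start sequence
%eop EOF
%noshift EOF
%%
sequence : (* empty *)      ([])
         | sequence WORD    (sequence @ [WORD])
```

With only these two rules, each input has exactly one derivation: the empty rule applies once at the start, and every later word extends the sequence built so far.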

Summary of conflicts
- Shift-reduce conflict
  - resolve with precedence and associativity
  - ML-Yacc shifts by default
- Reduce-reduce conflict
  - ML-Yacc reduces by the first rule
  - Not allowed!

Lab3
- Your job is to finish a parser for the C language
  - Input: a ".c" file
  - Output: "Success!" if the ".c" file is correct
- File description
  - c.lex
  - c.grm
  - main.sml
  - call-main.sml
  - sources.cm
  - lab3.mlb
  - test.c

Using ML-Yacc
- Read the ML-Yacc manual
- Run
  - Once you have finished "c.grm" and "c.lex", run on the command line (using MLton's tools):
      mlyacc c.grm
      mllex c.lex
  - This generates "c.grm.sig", "c.grm.sml", "c.grm.desc" and "c.lex.sml"
- Then compile Lab3
  - Start SML/NJ and run CM.make "sources.cm";
  - or, on the command line: mlton lab3.mlb
- To run lab3
  - In SML/NJ: Main.parse "test.c";
  - or, on the command line: lab3 test.c

"Debug" ML-Yacc File
- When you run mlyacc, you will see error messages if your ML-Yacc file has conflicts. For example:
    mlyacc c.grm
    2 shift/reduce conflicts
- Open the file "c.grm.desc" (this file is generated by mlyacc)
  - The beginning of the file reports the conflicts
  - The rest lists all the states
  - "rule 12" means rule number 12 in your ML-Yacc file, counting from 0

    2 shift/reduce conflicts
    error: state 0: shift/reduce conflict (shift MYSTRUCT, reduce by rule 12)
    error: state 1: shift/reduce conflict (shift MYSTRUCT, reduce by rule 12)

    state 0:

        prog : . structs vdecs preds funcs

        MYSTRUCT    shift 3

        prog        goto 429
        structs     goto 2
        structdec   goto 1

        .           reduce by rule 12

Use ML-Lex with ML-Yacc
- Most of the work in "c.lex" this time can be copied from Lab2
  - You can reuse the regular expressions and lexical rules
- Difference from Lab2
  - You have to define the tokens in "c.grm":
      %term INT of int | EOF
  - Every terminal declared with "%term" in "c.grm" automatically appears as a token-building function in "c.grm.sig":

    signature C_TOKENS =
    sig
      type ('a,'b) token
      type svalue
      val EOF: 'a * 'a -> (svalue,'a) token
      val INT: (int) * 'a * 'a -> (svalue,'a) token
    end
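To show how a lexer action would call these functions, here is a self-contained sketch in Standard ML. The Tokens structure below is a simplified stand-in written for this example (the real one is generated into "c.grm.sml" and has a different representation), and intToken is a hypothetical helper:

```sml
(* Simplified stand-in for the structure that mlyacc generates from
   "%term INT of int | EOF". Not the real generated code. *)
structure Tokens =
struct
  datatype svalue = IntVal of int | NoVal
  datatype ('a, 'b) token = TOKEN of string * 'a * 'b * 'b
  fun EOF (left, right) = TOKEN ("EOF", NoVal, left, right)
  fun INT (n, left, right) = TOKEN ("INT", IntVal n, left, right)
end

(* What a lexer action does: turn matched text and a position into a token.
   A terminal with a value takes (value, left position, right position). *)
fun intToken (text : string, pos : int) =
  case Int.fromString text of
    SOME n => Tokens.INT (n, pos, pos)
  | NONE => raise Fail ("not an integer literal: " ^ text)

val tok = intToken ("42", 7)

(* A terminal without a value takes only the two positions. *)
val eof = Tokens.EOF (8, 8)
```

In "c.lex" the same kind of call would sit inside a rule action, with yytext supplying the matched text.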

Hints
- Read the ML-Yacc manual
- Read the language specification
- Test a lot!