Compiler Principles Fall 2014-2015 Compiler Principles Lecture 6: Parsing part 5 Roman Manevich Ben-Gurion University.

Slides:



Advertisements
Similar presentations
JavaCUP JavaCUP (Construct Useful Parser) is a parser generator
Advertisements

Compiler construction in4020 – lecture 4 Koen Langendoen Delft University of Technology The Netherlands.
Compilation (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky 1 Slides credit: Roman Manevich, Mooly Sagiv, Eran Yahav.
Compiler Principles Fall Compiler Principles Lecture 4: Parsing part 3 Roman Manevich Ben-Gurion University.
Chapter 3 Syntax Analysis
Abstract Syntax Mooly Sagiv html:// 1.
1 JavaCUP JavaCUP (Construct Useful Parser) is a parser generator Produce a parser written in java, itself is also written in Java; There are many parser.
Mooly Sagiv and Roman Manevich School of Computer Science
9/27/2006Prof. Hilfinger, Lecture 141 Syntax-Directed Translation Lecture 14 (adapted from slides by R. Bodik)
Cse321, Programming Languages and Compilers 1 6/12/2015 Lecture #10, Feb. 14, 2007 Modified sets of item construction Rules for building LR parse tables.
Recap Mooly Sagiv. Outline Subjects Studied Questions & Answers.
1 Chapter 5 Compilers Source Code (with macro) Macro Processor Expanded Code Compiler or Assembler obj.
Bottom-Up Syntax Analysis Mooly Sagiv Textbook:Modern Compiler Design Chapter (modified)
ML-YACC David Walker COS 320. Outline Last Week –Introduction to Lexing, CFGs, and Parsing Today: –More parsing: automatic parser generation via ML-Yacc.
Bottom-Up Syntax Analysis Mooly Sagiv html:// Textbook:Modern Compiler Design Chapter
Context-Free Grammars Lecture 7
Compiler Construction Parsing Rina Zviel-Girshin and Ohad Shacham School of Computer Science Tel-Aviv University.
Syntax Analysis Mooly Sagiv html:// Textbook:Modern Compiler Design Chapter 2.2 (Partial) Hashlama 11:00-14:00.
Bottom-Up Syntax Analysis Mooly Sagiv & Greta Yorsh Textbook:Modern Compiler Design Chapter (modified)
Compiler Summary Mooly Sagiv html://
Syntax Analysis Mooly Sagiv Textbook:Modern Compiler Design Chapter 2.2 (Partial)
Bottom-Up Syntax Analysis Mooly Sagiv & Greta Yorsh Textbook:Modern Compiler Design Chapter (modified)
Bottom-Up Syntax Analysis Mooly Sagiv html:// Textbook:Modern Compiler Implementation in C Chapter 3.
Table-driven parsing Parsing performed by a finite state machine. Parsing algorithm is language-independent. FSM driven by table (s) generated automatically.
Chapter 2 A Simple Compiler
Compiler Construction Parsing II Ran Shaham and Ohad Shacham School of Computer Science Tel-Aviv University.
Parser construction tools: YACC
Compilers: Yacc/7 1 Compiler Structures Objective – –describe yacc (actually bison) – –give simple examples of its use , Semester 1,
2.2 A Simple Syntax-Directed Translator Syntax-Directed Translation 2.4 Parsing 2.5 A Translator for Simple Expressions 2.6 Lexical Analysis.
1 Abstract Syntax Tree--motivation The parse tree –contains too much detail e.g. unnecessary terminals such as parentheses –depends heavily on the structure.
CPSC 388 – Compiler Design and Construction Parsers – Context Free Grammars.
LEX and YACC work as a team
LR Parsing Compiler Baojian Hua
Automated Parser Generation (via CUP)CUP 1. High-level structure JFlexjavac Lexer spec Lexical analyzer text tokens.java CUPjavac Parser spec.javaParser.
Lesson 10 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
Lesson 3 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
Joey Paquet, Lecture 12 Review. Joey Paquet, Course Review Compiler architecture –Lexical analysis, syntactic analysis, semantic.
CS412/413 Introduction to Compilers and Translators Spring ’99 Lecture 8: Semantic Analysis and Symbol Tables.
Syntax Analysis Mooly Sagiv Textbook:Modern Compiler Design Chapter 2.2 (Partial) 1.
Compiler Principles Fall Compiler Principles Lecture 0: Local Optimizations Roman Manevich Ben-Gurion University.
CPS 506 Comparative Programming Languages Syntax Specification.
1 Using Yacc. 2 Introduction Grammar –CFG –Recursive Rules Shift/Reduce Parsing –See Figure 3-2. –LALR(1) –What Yacc Cannot Parse It cannot deal with.
CS 614: Theory and Construction of Compilers Lecture 9 Fall 2002 Department of Computer Science University of Alabama Joel Jones.
CS 614: Theory and Construction of Compilers Lecture 7 Fall 2003 Department of Computer Science University of Alabama Joel Jones.
1 JavaCUP JavaCup (Construct Useful Parser) is a parser generator; Produce a parser written in java, itself is also written in Java; There are many parser.
Compiler Principles Winter Compiler Principles Syntax Analysis (Parsing) – Part 3 Mayer Goldberg and Roman Manevich Ben-Gurion University.
Mooly Sagiv and Roman Manevich School of Computer Science
Compiler Principles Fall Compiler Principles Lecture 5: Parsing part 4 Roman Manevich Ben-Gurion University.
1 Introduction to Parsing. 2 Outline l Regular languages revisited l Parser overview Context-free grammars (CFG ’ s) l Derivations.
Syntax Analysis (chapter 4) SLR, LR1, LALR: Improving the parser From the previous examples: => LR0 parsing is rather weak. Cannot handle many languages.
Compilers: Bottom-up/6 1 Compiler Structures Objective – –describe bottom-up (LR) parsing using shift- reduce and parse tables – –explain how LR.
Bottom-up parsing. Bottom-up parsing builds a parse tree from the leaves (terminals) to the start symbol int E T * TE+ T (4) (2) (3) (5) (1) int*+ E 
CS416 Compiler Design1. 2 Course Information Instructor : Dr. Ilyas Cicekli –Office: EA504, –Phone: , – Course Web.
CMSC 330: Organization of Programming Languages Pushdown Automata Parsing.
YACC Primer CS 671 January 29, CS 671 – Spring Yacc Yet Another Compiler Compiler Automatically constructs an LALR(1) parsing table from.
Parser Generation Tools (Yacc and Bison) CS 471 September 24, 2007.
CC410: System Programming Dr. Manal Helal – Fall 2014 – Lecture 12–Compilers.
Compiler Principles Fall Compiler Principles Lecture 5: Parsing part 4 Roman Manevich Ben-Gurion University.
Announcements/Reading
CUP: An LALR Parser Generator for Java
Programming Languages Translator
Textbook:Modern Compiler Design
Java CUP.
Fall Compiler Principles Lecture 4: Parsing part 3
Bottom-Up Syntax Analysis
Fall Compiler Principles Lecture 4: Parsing part 3
CPSC 388 – Compiler Design and Construction
Bison Marcin Zubrowski.
Lecture 4: Syntax Analysis: Botom-Up Parsing Noam Rinetzky
Fall Compiler Principles Lecture 4: Parsing part 3
Presentation transcript:

Compiler Principles Fall Compiler Principles Lecture 6: Parsing part 5 Roman Manevich Ben-Gurion University

Tentative syllabus Front End Scanning Top-down Parsing (LL) Bottom-up Parsing (LR) Attribute Grammars Intermediate Representation Lowering Optimizations Local Optimizations Dataflow Analysis Loop Optimizations Code Generation Register Allocation Instruction Selection 2 mid-termexam

Previously 3 Scanning LL parsing – Recursive descent – FIRST/FOLLOW/NULLABLE – Automata approach LR parsing – LR(0), SLR(0), LR(1), LALR(1)

Agenda 4 Automatic LR parser generation Handling ambiguities

5 Automated parser generation (via CUP)CUP

High-level structure JFlexjavac Lexer spec Lexical analyzer text tokens.java CUPjavac Parser spec.javaParser AST LANG.cup LANG.lex Parser.java sym.java Lexer.java (Token.java) 6

Expression calculator expr  expr + expr | expr - expr | expr * expr | expr / expr | - expr | ( expr ) | number Goals of expression calculator parser: Is a valid expression? What is the meaning (value) of this expression? 7

Syntax analysis with CUP CUPjavac Parser spec.javaParser AST CUP – parser generator Generates an LALR(1) Parser Input: spec file Output: a syntax analyzer Can dump automaton and table tokens 8

CUP spec file Package and import specifications User code components Symbol (terminal and non-terminal) lists – Terminals go to sym.java – Types of AST nodes Precedence declarations The grammar – Semantic actions to construct AST 9

10 Parsing ambiguous grammars

Expression Calculator – 1 st Attempt terminal Integer NUMBER; terminal PLUS, MINUS, MULT, DIV; terminal LPAREN, RPAREN; non terminal Integer expr; expr ::= expr PLUS expr | expr MINUS expr | expr MULT expr | expr DIV expr | MINUS expr | LPAREN expr RPAREN | NUMBER ; Symbol type explained later 11

Ambiguities a + b * c a bc * + a bc + * a + b + c a bc + + a bc

Ambiguities as conflicts for LR(1) a + b + c a bc + + a bc a + b * c a bc * + a bc + *

terminal Integer NUMBER; terminal PLUS,MINUS,MULT,DIV; terminal LPAREN, RPAREN; terminal UMINUS; non terminal Integer expr; precedence left PLUS, MINUS; precedence left DIV, MULT; precedence left UMINUS; expr ::= expr PLUS expr | expr MINUS expr | expr MULT expr | expr DIV expr | MINUS expr %prec UMINUS | LPAREN expr RPAREN | NUMBER ; Expression Calculator – 2 nd Attempt Increasing precedence Contextual precedence 14

Parsing ambiguous grammars using precedence declarations Each terminal assigned with precedence – By default all terminals have lowest precedence – User can assign his own precedence – CUP assigns each production a precedence Precedence of rightmost terminal in production or user-specified contextual precedence On shift/reduce conflict resolve ambiguity by comparing precedence of terminal and production and decides whether to shift or reduce In case of equal precedences left / right help resolve conflicts – left means reduce – right means shift More information on precedence declarations in CUP’s manualprecedence declarations 15

Resolving ambiguity (associativity) a + b + c a bc + + a bc + + precedence left PLUS 16

Resolving ambiguity (op. precedence) a + b * c a bc * + a bc + * precedence left PLUS precedence left MULT 17

Resolving ambiguity (contextual) - a * b a b * - precedence left MULT MINUS expr %prec UMINUS a - b * 18

Resolving ambiguity terminal Integer NUMBER; terminal PLUS,MINUS,MULT,DIV; terminal LPAREN, RPAREN; terminal UMINUS; precedence left PLUS, MINUS; precedence left DIV, MULT; precedence left UMINUS; expr ::= expr PLUS expr | expr MINUS expr | expr MULT expr | expr DIV expr | MINUS expr %prec UMINUS | LPAREN expr RPAREN | NUMBER ; Rule has precedence of UMINUS UMINUS never returned by scanner (used only to define precedence) 19

More CUP directives precedence nonassoc NEQ – Non-associative operators: == != etc. – 1<2<3 identified as an error (semantic error?) start non-terminal – Specifies start non-terminal other than first non-terminal – Can change to test parts of grammar Getting internal representation – Command line options: -dump_grammar -dump_states -dump_tables -dump 20

import java_cup.runtime.*; % %cup %eofval{ return new Symbol(sym.EOF); %eofval} NUMBER=[0-9]+ % ”+” { return new Symbol(sym.PLUS); } ”-” { return new Symbol(sym.MINUS); } ”*” { return new Symbol(sym.MULT); } ”/” { return new Symbol(sym.DIV); } ”(” { return new Symbol(sym.LPAREN); } ”)” { return new Symbol(sym.RPAREN); } {NUMBER} { return new Symbol(sym.NUMBER, new Integer(yytext())); } \n { }. { } Parser gets terminals from the scanner Scanner integration Generated from token declarations in.cup file 21

Recap Package and import specifications and user code components Symbol (terminal and non-terminal) lists – Define building-blocks of the grammar Precedence declarations – May help resolve conflicts The grammar – May introduce conflicts that have to be resolved 22

23 Abstract syntax tree construction

Assigning meaning So far, only validation Add Java code implementing semantic actions expr ::= expr PLUS expr | expr MINUS expr | expr MULT expr | expr DIV expr | MINUS expr %prec UMINUS | LPAREN expr RPAREN | NUMBER ; 24

Symbol labels used to name variables RESULT names the left-hand side symbol non terminal Integer expr; expr ::= expr:e1 PLUS expr:e2 {: RESULT = new Integer(e1.intValue() + e2.intValue()); :} | expr:e1 MINUS expr:e2 {: RESULT = new Integer(e1.intValue() - e2.intValue()); :} | expr:e1 MULT expr:e2 {: RESULT = new Integer(e1.intValue() * e2.intValue()); :} | expr:e1 DIV expr:e2 {: RESULT = new Integer(e1.intValue() / e2.intValue()); :} | MINUS expr:e1 {: RESULT = new Integer(0 - e1.intValue(); :} %prec UMINUS | LPAREN expr:e1 RPAREN {: RESULT = e1; :} | NUMBER:n {: RESULT = n; :} ; Assigning meaning 25

Abstract Syntax Trees More useful representation of syntax tree – Less clutter – Actual level of detail depends on your design Basis for semantic analysis Later annotated with various information – Type information – Computed values Technically – a class hierarchy of abstract syntax tree nodes 26

Parse tree vs. AST + expr 12+3 ()()

AST hierarchy example 28 int_constplusminustimesdivide expr

AST construction AST Nodes constructed during parsing – Stored in push-down stack Bottom-up parser – Grammar rules annotated with actions for AST construction – When node is constructed all children available (already constructed) – Node (RESULT) pushed on stack 29

1 + (2) + (3) expr + (expr) + (3) + expr 12+3 expr + (3) expr ()() expr + (expr) expr expr + (2) + (3) int_const val = 1 plus e1e2 int_const val = 2 int_const val = 3 plus e1e2 expr ::= expr:e1 PLUS expr:e2 {: RESULT = new plus(e1,e2); :} | LPAREN expr:e RPAREN {: RESULT = e; :} | INT_CONST:i {: RESULT = new int_const(…, i); :} AST construction 30

terminal Integer NUMBER; terminal PLUS,MINUS,MULT,DIV,LPAREN,RPAREN,SEMI; terminal UMINUS; non terminal Integer expr; non terminal expr_list, expr_part; precedence left PLUS, MINUS; precedence left DIV, MULT; precedence left UMINUS; expr_list ::= expr_list expr_part | expr_part ; expr_part ::= expr:e {: System.out.println("= " + e); :} SEMI ; expr ::= expr PLUS expr | expr MINUS expr | expr MULT expr | expr DIV expr | MINUS expr %prec UMINUS | LPAREN expr RPAREN | NUMBER ; Example of lists 31 Executed when e is shifted

Midterm materials Loaded exercises with solutions to web-site 32

Next lecture: Intermediate Representation