CS308 Compiler Principles
Introduction to Yacc
Fan Wu
Department of Computer Science and Engineering, Shanghai Jiao Tong University


Introduction to Yacc
Yacc ("yet another compiler compiler") is an LALR parser generator. Yacc generates:
- Tables: built according to the grammar rules.
- Driver routines: written in the C programming language.
- y.output: a report file.

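Using Yacc Together with Lex
[Diagram: the lexical rules go through flex to produce lex.yy.c, which provides yylex(); the grammar rules go through yacc to produce y.tab.c, which provides yyparse(). The input passes through yylex() to become tokens, and yyparse() turns the tokens into parsed input.]
- "yacc -d" also produces y.tab.h.
- "yacc -v" also produces y.output, which describes the states and transitions of the parser (useful for debugging).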

Cooperating with Lex
- The parser invokes yylex() automatically.
- The -d option generates the y.tab.h file.
- The lex input file must include y.tab.h.
- For each token that lex recognizes, a number is returned from the yylex() function.
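To make the token-number contract concrete, here is a minimal hand-written yylex() standing in for the one lex would generate. It is a sketch under assumptions: the token names NUMBER and ID, their numeric values, and the helper set_input() are hypothetical and are not taken from the slides; in a real build the token numbers come from y.tab.h.

```c
#include <ctype.h>
#include <stdlib.h>

/* Stand-ins for what "yacc -d" would write to y.tab.h.
   The names and numeric values are assumptions for illustration. */
enum { NUMBER = 258, ID = 259 };

int yylval;                      /* semantic value of the last token */
static const char *input = "";   /* hypothetical in-memory input buffer */

/* Hypothetical helper: point the scanner at a string. */
void set_input(const char *s) { input = s; }

/* Returns a token number; 0 signals end of input to the parser. */
int yylex(void)
{
    while (*input == ' ' || *input == '\t')
        input++;
    if (*input == '\0')
        return 0;
    if (isdigit((unsigned char)*input)) {
        char *end;
        yylval = (int)strtol(input, &end, 10);
        input = end;
        return NUMBER;
    }
    if (isalpha((unsigned char)*input)) {
        while (isalnum((unsigned char)*input))
            input++;
        return ID;
    }
    /* Single-character token: its character code is its token number. */
    return (unsigned char)*input++;
}
```

Calling set_input("x + 42") and then yylex() repeatedly would yield ID, '+', NUMBER (with yylval set to 42), and finally 0.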

Writing a Yacc Input File
A Yacc input file consists of three sections, separated by %%:
    definitions      (optional)
    %%
    rules            (required)
    %%
    user code        (optional)

Definition Section
- C source code, include files, etc.
- %token declarations.
- Yacc invokes yylex() when a token is required.
- All named terminal symbols should be declared through %token.
- Yacc produces y.tab.h from the %token definitions.

Definitions
- Information about tokens:
  - token names are declared using %token
  - single-character tokens don't have to be declared
  - any name not declared as a token is assumed to be a nonterminal
- Start symbol of the grammar, using %start [optional]
- Operator info: precedence, associativity
- Stuff to be copied verbatim into the output (e.g., declarations, #includes): enclosed in %{ … %}

Rules
Productions that share a left-hand side are written as one yacc rule:
    Grammar productions        Yacc rule
    A → B1 B2 … Bm             A : B1 B2 … Bm
    A → C1 C2 … Cn               | C1 C2 … Cn
    A → D1 D2 … Dk               | D1 D2 … Dk
                                 ;    /* ';' optional, but advised */
- A rule's right-hand side can have arbitrary C code embedded within { … }. E.g.:
    A : B1 { printf("after B1\n"); x = 0; } B2 { x++; } B3
- Left recursion is more efficient than right recursion: write
    A : A x | …
  rather than
    A : x A | …

Conflicts
Conflicts arise when there is more than one way to proceed with parsing. Two types:
- shift-reduce [default action: shift]
- reduce-reduce [default: reduce with the first rule listed]
Removing conflicts:
- specify operator precedence and associativity;
- restructure the grammar; use y.output to identify the reasons for the conflict.

Specifying Operator Properties
- Binary operators: %left, %right, %nonassoc:
    %left '+' '-'
    %left '*' '/'
    %right '^'
- Unary operators: %prec changes the precedence of a rule to be that of the token specified. E.g.:
    %left '+' '-'
    %left '*' '/'
    expr : expr '+' expr
         | '-' expr %prec '*'
         | …
- Operators in the same group have the same precedence.
- Across groups, precedence increases going down.
- The rule for unary '-' has the same (high) precedence as '*'.
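The way these declarations settle a shift-reduce conflict can be sketched as a small decision function: yacc compares the precedence of the rule it could reduce with the precedence of the lookahead token, and falls back on associativity when they are equal. The function and enum names below are hypothetical, and the integer levels simply number the declaration groups from top (1) to bottom.

```c
enum assoc  { ASSOC_LEFT, ASSOC_RIGHT, ASSOC_NONE };
enum action { ACT_SHIFT, ACT_REDUCE, ACT_ERROR };

/* rule_prec: precedence level of the rule that could be reduced
              (that of its last terminal, or of its %prec token).
   tok_prec:  precedence level of the lookahead token.
   tok_assoc: associativity declared for the lookahead token's group. */
enum action resolve(int rule_prec, int tok_prec, enum assoc tok_assoc)
{
    if (tok_prec > rule_prec)
        return ACT_SHIFT;      /* lookahead binds tighter */
    if (tok_prec < rule_prec)
        return ACT_REDUCE;     /* rule binds tighter */
    switch (tok_assoc) {       /* same level: associativity decides */
    case ASSOC_LEFT:  return ACT_REDUCE;
    case ASSOC_RIGHT: return ACT_SHIFT;
    default:          return ACT_ERROR;   /* %nonassoc */
    }
}
```

With '+' and '-' at level 1, '*' and '/' at level 2, and '^' at level 3, the parser holding expr '+' expr and seeing '*' shifts, while the unary minus rule (level 2 through %prec '*') seeing '+' reduces.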

Error Handling
The "token" error is reserved for error handling:
- it can be used in rules;
- it suggests places where errors might be detected and recovery can occur.
Example:
    stmt : IF '(' expr ')' stmt
         | IF '(' error ')' stmt     /* intended to recover from errors in expr */
         | FOR …
         | …

Parser Behavior on Errors
When an error occurs, the parser:
- pops its stack until it enters a state where the token error is legal;
- then behaves as if it saw the token error:
  - performs the action encountered;
  - resets the lookahead token to the token that caused the error.
- If no error rules are specified, processing halts.
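The stack-popping step can be sketched in a few lines of C. This is a simplified model rather than the code yacc generates: the function name and the accepts_error table (nonzero for states in which the token error is legal) are assumptions made for illustration.

```c
#include <stddef.h>

/* Pops parser states until the top state can shift the 'error' token.
   Returns that state, or -1 if the stack empties, in which case the
   parse is abandoned. 'depth' is the number of states on the stack. */
int pop_to_error_state(const int *stack, size_t *depth,
                       const int *accepts_error)
{
    while (*depth > 0) {
        int state = stack[*depth - 1];
        if (accepts_error[state])
            return state;      /* recovery can resume from here */
        (*depth)--;            /* discard this state and keep looking */
    }
    return -1;
}
```

With a state stack of 0, 3, 7, 9 in which only state 3 accepts error, the function discards states 9 and 7 and returns 3.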

Controlling Error Behavior
The parser remains in the error state until three tokens have been successfully read in and shifted:
- this prevents cascaded error messages;
- if an error is detected while the parser is in the error state:
  - no error message is given;
  - the input token causing the error is deleted.
To force the parser to believe that an error has been fully recovered from: yyerrok;
To clear the token that caused the error: yyclearin;
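The three-token rule amounts to a small counter, which the following sketch models. The struct and function names are hypothetical; generated parsers keep a similar counter internally, but this is an illustration of the behavior described above, not their actual code.

```c
/* Simplified model of the parser's error state. */
struct recovery {
    int errflag;    /* tokens still to be shifted before leaving error state */
    int messages;   /* number of error messages actually reported */
};

/* A syntax error was detected. */
void on_error(struct recovery *r)
{
    if (r->errflag == 0)
        r->messages++;      /* report only when not already recovering */
    r->errflag = 3;         /* (re)enter the error state */
}

/* An input token was successfully shifted. */
void on_shift(struct recovery *r)
{
    if (r->errflag > 0)
        r->errflag--;
}

/* Models the effect of yyerrok: leave the error state immediately. */
void on_yyerrok(struct recovery *r)
{
    r->errflag = 0;
}
```

A second error arriving after only one shifted token is silent; after three shifted tokens, or after on_yyerrok(), the next error is reported again.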

Placing error tokens
Some guidelines:
- Close to the start symbol of the grammar:
  - to allow recovery without discarding all input.
- Near terminal symbols:
  - to allow only a small amount of input to be discarded on an error;
  - consider tokens like ')' and ';' that follow nonterminals.
- Without introducing conflicts.

Error Messages
On finding an error, the parser calls a function
    void yyerror(char *s)   /* s points to an error msg */
- user-supplied; prints out the error message.
More informative error messages:
- int yychar: token number of the token causing the error.
- the user program keeps track of line numbers, as well as any additional info desired.

Error Messages: example
    #include <stdio.h>
    #include "y.tab.h"

    extern int yychar, curr_line;

    /* Print the token that caused the error. */
    static void print_tok()
    {
        if (yychar < 255) {
            /* single-character token: print the character itself */
            fprintf(stderr, "%c", yychar);
        } else {
            switch (yychar) {
            case ID: …
            case INTCON: …
            …
            }
        }
    }

    void yyerror(char *s)
    {
        fprintf(stderr, "[line %d]: %s", curr_line, s);
        print_tok();
    }

Debugging the Parser
To trace the shift/reduce actions of the parser:
- when compiling: #define YYDEBUG 1
- at runtime: set yydebug = 1   /* extern int yydebug; */

Adding Semantic Actions
Semantic actions for a rule are placed in its body:
- an action consists of C code enclosed in { … }
- it may be placed anywhere in the rule's right-hand side
Example:
    expr : ID { symTbl_lookup(idname); }
    decl : type_name { tval = … } id_list;

Synthesized Attributes
Each nonterminal can "return" a value:
- The return value for a nonterminal X is "returned" to a rule that has X in its body, e.g.:
    A : … X …      /* receives the value "returned" by X */
    X : …
- This is different from the values returned by the scanner to the parser!

Attribute return values
To access the value returned by the i-th symbol in a rule body, use $i
- an action occurring in a rule body counts as a symbol. E.g.:
    decl : type { tval = $1; } id_list { symtbl_install($3, tval); }
To set the value to be returned by a rule, assign to $$
- by default, the value of a rule is the value of its first symbol, i.e., $1
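One way to picture $i and $$ is as positions on a value stack that the parser maintains alongside its state stack. The sketch below is a simplified model with hypothetical names (vstack, push_value, reduce_add, reduce_default), not the code yacc emits.

```c
#include <assert.h>

#define MAXDEPTH 64
static int vstack[MAXDEPTH];   /* one value per symbol on the parse stack */
static int top = 0;            /* number of values currently stacked */

void push_value(int v)
{
    assert(top < MAXDEPTH);
    vstack[top++] = v;
}

/* Models:  expr : expr '+' expr { $$ = $1 + $3; }  */
int reduce_add(void)
{
    assert(top >= 3);
    int *rhs = &vstack[top - 3];      /* rhs[0] is $1, rhs[2] is $3 */
    int result = rhs[0] + rhs[2];     /* the value assigned to $$ */
    top -= 3;                         /* pop the right-hand side */
    push_value(result);               /* $$ becomes the value of the LHS */
    return result;
}

/* Models a rule with n right-hand-side symbols and no action: $$ = $1. */
int reduce_default(int n)
{
    assert(n >= 1 && top >= n);
    int result = vstack[top - n];     /* $1 */
    top -= n;
    push_value(result);
    return result;
}
```

Pushing 2, a placeholder value for '+', and 3, then calling reduce_add(), leaves the single value 5 on the stack.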

Example
    /* A variable declaration is an identifier followed by
       an optional subscript, e.g., x or x[10] */
    var_decl : ident opt_subscript
                 {
                   if ( symtbl_lookup($1) != NULL )
                     ErrMsg("multiple declarations", $1);
                   else {
                     st_entry = symtbl_install($1);
                     if ( $2 == ARRAY ) {
                       st_entry->base_type = ARRAY;
                       st_entry->element_type = tval;
                     }
                     else {
                       st_entry->base_type = tval;
                       st_entry->element_type = UNDEF;
                     }
                   }
                 }
             ;
    opt_subscript : '[' INTCON ']'   { $$ = ARRAY; }
                  | /* null */       { $$ = INTEGER; }
                  ;

Declaring Return Value Types
The default type for nonterminal return values is int. Return value types need to be declared if nonterminal return values can be of other types:
- Declare the union of the different types that may be returned:
    %union {
      symtbl_ptr st_ptr;
      id_list_ptr list_of_ids;
      tree_node_ptr syntax_tree_ptr;
      int value;
    }
- Specify which union member a particular grammar symbol will return:
    %token <value> INTCON, CHARCON;         /* terminals */
    %type <st_ptr> identifier;              /* nonterminals */
    %type <syntax_tree_ptr> expr, stmt;
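On the C side, the %union declaration becomes a union type (YYSTYPE) for the entries of the value stack, and naming a union member for a symbol lets each $i or $$ select that member. The sketch below shows the idea; the three struct types are opaque, hypothetical stand-ins for the pointer types on the slide, and add_values is an illustrative function, not generated code.

```c
/* Opaque stand-ins for the slide's pointer types (hypothetical). */
typedef struct symtbl_entry *symtbl_ptr;
typedef struct id_list      *id_list_ptr;
typedef struct tree_node    *tree_node_ptr;

/* The type of each value-stack entry, mirroring the %union above. */
typedef union {
    symtbl_ptr    st_ptr;
    id_list_ptr   list_of_ids;
    tree_node_ptr syntax_tree_ptr;
    int           value;
} YYSTYPE;

/* Roughly what an action { $$ = $1 + $3; } turns into when all three
   symbols are declared to use the 'value' member. */
YYSTYPE add_values(const YYSTYPE *rhs)
{
    YYSTYPE result;
    result.value = rhs[0].value + rhs[2].value;   /* $1 + $3 */
    return result;
}
```

Because every stack entry has the same union type, the declared member is what tells the generated code which field an action reads or writes.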

Conflicts
A conflict occurs when the parser has multiple possible actions in some state for a given next token. Two kinds of conflicts:
- shift-reduce conflict: the parser can either keep reading more of the input ("shift action"), or it can mimic a derivation step using the input it has read already ("reduce action").
- reduce-reduce conflict: there is more than one production that can be used for mimicking a derivation step at that point.

Example of a conflict
Grammar rules:
    S → if ( e ) S             /* 1 */
      | if ( e ) S else S      /* 2 */
Input: if ( e1 ) if ( e2 ) S2 else S3
Parser state when input token = 'else':
- Input already seen: if ( e1 ) if ( e2 ) S2
- Choices for continuing:
  1. keep reading input ("shift"): 'else' is part of the innermost if.
     Eventual parse structure: if (e1) { if (e2) S2 else S3 }
  2. mimic a derivation step using S → if ( e ) S ("reduce"): 'else' is part of the outermost if.
     Eventual parse structure: if (e1) { if (e2) S2 } else S3
This is a shift-reduce conflict.

Handling Conflicts
General approach: iterate as necessary:
1. Use "yacc -v" to generate the file y.output.
2. Examine y.output to find parser states with conflicts.
3. For each such state, examine the items to figure out why the conflict is occurring.
4. Transform the grammar to eliminate the conflict:
    Reason for conflict                         Possible grammar transformation
    Ambiguity with operators in expressions     Specify associativity, precedence
    Error action                                Move or eliminate offending error action
    Semantic action                             Move the offending semantic action
    Insufficient lookahead                      "Expand out" the nonterminal involved
    Other…                                      ???…

Resources
- Yacc Manual:
- Doug Brown, John Levine, and Tony Mason, "lex & yacc", second edition, O'Reilly.
- Thomas Niemann, "A Compact Guide to Lex & Yacc".
- Lex/Yacc Win32 port: yacc/lex-yacc.html
- Parser Generator: