Compilers: Yacc/7 1 Compiler Structures 7. Yacc
Objective
- describe yacc (actually bison)
- give simple examples of its use
241-437, Semester 1, 2011-2012

Compilers: Yacc/7 2 Overview
1. What is Yacc?
2. Format of a yacc/bison File
3. Expressions Compiler
4. Bottom-up Parsing Reminder
5. Expression Conflicts
6. Precedence/Associativity in yacc

Compilers: Yacc/7 3
7. Dangling Else Conflict
8. Left and Right Recursion
9. Error Recovery
10. Embedded Actions
11. More Information

Compilers: Yacc/7 4 1. What is Yacc?
Yacc (Yet Another Compiler Compiler) is a tool for translating a context-free grammar into a bottom-up LALR parser
- it creates a parse table like that described in the last chapter
Yacc is used with lex to create compilers.

Compilers: Yacc/7 5
Most people use bison, a much improved version of yacc
- on most modern Unixes, when you call yacc, you're really using bison
bison works with flex (the fast version of lex).

Compilers: Yacc/7 6 Bison and Flex
    $ flex foo.l
    $ bison foo.y
    $ gcc foo.tab.c -o foo
    $ ./foo < program.txt
[Diagram: flex turns foo.l (a flex file) into lex.yy.c; bison turns foo.y (a bison file) into foo.tab.c, which #includes lex.yy.c; the C compiler turns foo.tab.c into foo, a C executable that reads a source program and produces parsed output.]

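Compilers: Yacc/7 7 Compiler Components (in foo)
[Diagram: the source program feeds lex.yy.c, the Lexical Analyzer (using chars), which reports lexical errors; foo.tab.c, the Syntax Analyzer (using tokens), reports syntax errors and produces the parsed output.]
1. The syntax analyzer gets the next token by calling yylex()
2. The lexical analyzer gets chars to make a token
3. The lexical analyzer returns the token, token value, token type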

Compilers: Yacc/7 8 Inside foo.tab.c
[Diagram: the LALR parser reads the input tokens a1 a2 ... ai ... an $, keeps a stack X0 s0 ... Xm-1 sm-1 Xm sm, consults the parse table (actions and gotos), and produces the parsed output.]
- X is a terminal or non-terminal, s = state
- bison creates the parse table based on your grammar

Compilers: Yacc/7 9 2. Format of a yacc/bison File
    declarations: C data and yacc definitions (or nothing)
    %%
    Grammar rules (with actions)
    %%
    #include "lex.yy.c"
    C functions, including main()

Compilers: Yacc/7 10 Declarations
C data is put between %{ and %}
The yacc definitions list the tokens (terminals) used in the grammar:
    %token terminal1 terminal2...
Other yacc definitions:
- %left and %right for associativity
- %prec for precedence

Compilers: Yacc/7 11
Precedence example: 2 + 3 * 5
- does it mean (2 + 3) * 5 or 2 + (3 * 5) ?
Associativity example: 1 - 1 - 1
- does it mean (1 - 1) - 1 (left) or 1 - (1 - 1) (right) ?

Compilers: Yacc/7 12 Rules
Rule format:
    nonterminal : body 1 {action 1}
                | body 2 {action 2}
                ...
                | body n {action n}
                ;
Actions are optional; they are C code. Actions are usually at the end of a body, but can be placed anywhere.
The grammar part is the same as:
    nonterminal -> body1 | body2 | ... | bodyN

Compilers: Yacc/7 13 3. Expressions Compiler
    $ flex expr.l
    $ bison expr.y
    $ gcc expr.tab.c -o exprEval
[Diagram: flex turns expr.l (a flex file) into lex.yy.c; bison turns expr.y (a bison file) into expr.tab.c, which #includes lex.yy.c; gcc compiles expr.tab.c into exprEval, a C executable.]

Compilers: Yacc/7 14 Usage
    $ ./exprEval
    …
    Value = …
    … (5 * 2)
    Value = -8
    1 / 3
    Value = 0
    $
The expression lines are typed by the user; ctrl-D ends the session.

Compilers: Yacc/7 15 expr.l
    %%
    [-+*/()\n]  { return *yytext; }
    [0-9]+      { yylval = atoi(yytext); return(NUMBER); }
    [ \t]       ;   /* skip whitespace */
    %%
    int yywrap(void) { return 1; }
- there is no main() function
- RE actions usually end with a return; the token is passed to the syntax analyser

Compilers: Yacc/7 16 Lex File Format Reminder
A lex program has three sections:
    REs and/or C code
    %%
    RE/action rules
    %%
    C functions

Compilers: Yacc/7 17 expr.y
    %token NUMBER
    %%
    exprs: expr '\n'        { printf("Value = %d\n", $1); }
         | exprs expr '\n'  { printf("Value = %d\n", $2); }
         ;
    expr: expr '+' term     { $$ = $1 + $3; }
        | expr '-' term     { $$ = $1 - $3; }
        | term              { $$ = $1; }
        ;
- %token NUMBER is the declarations part; the rules follow the %%
- $$, $1, $2 and $3 are attributes

Compilers: Yacc/7 18 expr.y (more rules)
    term: term '*' factor   { $$ = $1 * $3; }
        | term '/' factor   { $$ = $1 / $3; }   /* integer division */
        | factor
        ;
    factor: '(' expr ')'    { $$ = $2; }
          | NUMBER
          ;

Compilers: Yacc/7 19 expr.y (C code)
    %%
    #include "lex.yy.c"
    int yyerror(char *s)
    {
        fprintf(stderr, "%s\n", s);
        return 0;
    }
    int main(void)
    {
        yyparse();    // the syntax analyzer
        return 0;
    }

Compilers: Yacc/7 20 Yacc Actions
yacc actions (the C code) can use attributes (variables).
Each terminal/non-terminal in a body has an attribute, which contains its return value.

Compilers: Yacc/7 21 Attributes
An attribute is $n, where n is the position of the terminal/non-terminal in the body, starting at 1
- $1 = first terminal/non-terminal of the body
- $2 = second one
- etc.
- $$ = return value for the rule; the default value for $$ is the $1 value

Compilers: Yacc/7 22 Evaluation in yacc
Input: 3 * 5 + 4\n
    Stack      Input       Action               Rule
    $          3*5+4\n$    shift
    $ 3        *5+4\n$     reduce F -> num      $$ = $1 (implicit)
    $ F        *5+4\n$     reduce T -> F        $$ = $1 (implicit)
    $ T        *5+4\n$     shift
    $ T *      5+4\n$      shift
    $ T * 5    +4\n$       reduce F -> num      $$ = $1 (implicit)
    $ T * F    +4\n$       reduce T -> T * F    $$ = $1 * $3
    $ T        +4\n$       reduce E -> T        $$ = $1 (implicit)
    $ E        +4\n$       shift
    $ E +      4\n$        shift
    $ E + 4    \n$         reduce F -> num      $$ = $1 (implicit)
    $ E + F    \n$         reduce T -> F        $$ = $1 (implicit)
    $ E + T    \n$         reduce E -> E + T    $$ = $1 + $3
    $ E        \n$         shift
    $ E \n     $           reduce Es -> E \n    printf $1
    $ Es       $           accept

Compilers: Yacc/7 23 4. Bottom-up Parsing Reminder
Simple expressions grammar:
    E => E '+' E    // rule r1
    E => E '*' E    // rule r2
    E => id         // rule r3

Compilers: Yacc/7 24 Parsing "x + y * z"
    1.  . x + y * z      // shift
    2.  x . + y * z      // reduce(r3)
    3.  E . + y * z      // shift
    4.  E + . y * z      // shift
    5.  E + y . * z      // reduce(r3)
    6.  E + E . * z      // shift
    7.  E + E * . z      // shift
    8.  E + E * z .      // reduce(r3)
    9.  E + E * E .      // reduce(r2)
    10. E + E .          // reduce(r1)
    11. E .              // accept

Compilers: Yacc/7 25 Shift/Reduce Conflict
At step 6, a shift or a reduce is possible. The reduce alternative is:
    6. E + E . * z    // reduce (r1)
    7. E . * z
    :
What should be done?
- by default, yacc (bison) shifts

Compilers: Yacc/7 26 Reduce/Reduce Conflict
Modify the grammar to include:
    E => T     // new rule r3
    E => id    // rule r4
    T => id    // rule r5

Compilers: Yacc/7 27
Consider step 2:
    x . + y * z
There are two ways to reduce:
    E . + y * z    // reduce (r4)
or
    T . + y * z    // reduce (r5)
What should be done?
- by default, yacc (bison) reduces using the first possible rule (i.e. rule r4)

Compilers: Yacc/7 28 Common Conflicts
The two most common shift/reduce problems in programming languages are:
- expression precedence
- dangling else
yacc has features for fixing both of these.
Reduce/reduce problems are usually due to errors in your grammar.

Compilers: Yacc/7 29 Debugging Conflicts
bison can generate extra conflict information, which can help you debug your grammar.
- use the -v option

Compilers: Yacc/7 30 5. Expression Conflicts
In shiftE.y:
    %token NUMBER
    %%
    expr: expr '+' expr
        | expr '*' expr
        | '(' expr ')'
        | NUMBER
        ;
There is a shift/reduce conflict here, as in the previous example.

Compilers: Yacc/7 31 shiftE.y (C code)
    %%
    #include "lex.yy.c"
    int yyerror(char *s)
    {
        fprintf(stderr, "%s\n", s);
        return 0;
    }
    int main(void)
    {
        yyparse();
        return 0;
    }

Compilers: Yacc/7 32 Example
When the parsing state is:
    expr '+' expr . '*' z
should bison shift:
    expr '+' expr '*' . z
or reduce?:
    expr . '*' z    // using rule 1

Compilers: Yacc/7 33 Using -v
    $ bison shiftE.y
    shiftE.y: conflicts: 4 shift/reduce
    $ bison -v shiftE.y
    shiftE.y: conflicts: 4 shift/reduce
- the -v option creates a shiftE.output file with extra conflict information

Compilers: Yacc/7 34 Inside shiftE.output
    State 9 conflicts: 2 shift/reduce
    State 10 conflicts: 2 shift/reduce
    Grammar
        0 $accept: expr $end
        1 expr: expr '+' expr
        2     | expr '*' expr
        3     | '(' expr ')'
        4     | NUMBER
    :    // many state blocks
- states 9 and 10 are the problems
- the rules are numbered

Compilers: Yacc/7 35
    state 9
        1 expr: expr . '+' expr
        1     | expr '+' expr .
        2     | expr . '*' expr
        '+'       shift, and go to state 6
        '*'       shift, and go to state 7
        '+'       [reduce using rule 1 (expr)]
        '*'       [reduce using rule 1 (expr)]
        $default  reduce using rule 1 (expr)
- the numbered items are the kinds of parsing states bison is looking at
- the shift lines are what bison does
- the bracketed reduce lines are what it could do instead

Compilers: Yacc/7 36
    state 10
        1 expr: expr . '+' expr
        2     | expr . '*' expr
        2     | expr '*' expr .
        '+'       shift, and go to state 6
        '*'       shift, and go to state 7
        '+'       [reduce using rule 2 (expr)]
        '*'       [reduce using rule 2 (expr)]
        $default  reduce using rule 2 (expr)
- the numbered items are the kinds of parsing states bison is looking at
- the shift lines are what bison does
- the bracketed reduce lines are what it could do instead

Compilers: Yacc/7 37 What causes Expression Conflicts?
The problems are the precedence and associativity of the operators:
- does 2 + 3 * 5 mean (2 + 3) * 5 or 2 + (3 * 5) ?    // should be 2nd
- does 1 - 1 - 1 mean (1 - 1) - 1 or 1 - (1 - 1) ?    // should be 1st
* should have higher precedence than +, and - should be left associative.

Compilers: Yacc/7 38 6. Precedence/Associativity in yacc
The declarations section can contain associativity and precedence settings for tokens:
- %left, %right, %nonassoc
- precedence is given by the order of the lines (later lines bind more tightly)
Example:
    %left '+' '-'
    %left '*' '/'
All are left associative, with '*' and '/' having higher precedence than '+' and '-'.

Compilers: Yacc/7 39 Expressions Variables Compiler
    $ flex exprVars.l
    $ bison exprVars.y
    $ gcc exprVars.tab.c -o exprVarsEval
[Diagram: flex turns exprVars.l (a flex file) into lex.yy.c; bison turns exprVars.y (a bison file) into exprVars.tab.c, which #includes lex.yy.c; gcc compiles exprVars.tab.c into exprVarsEval, a C executable.]

Compilers: Yacc/7 40 Usage
    $ ./exprVarsEval
    … * 3
    Value = …
    …
    Value = -1
    a = 3 * 4
    a
    Value = 12
    b = (3 - 6) * a
    b
    Value = -36
    $
The expression lines are typed by the user; ctrl-D ends the session.

Compilers: Yacc/7 41 exprVars.l
    /* Added: RE vars, token names, VAR token, assignment, error msgs */
    digits  [0-9]+
    letter  [a-z]
    %%
    \n    return('\n');
    \=    return(ASSIGN);
    \+    return(PLUS);
    \-    return(MINUS);
    \*    return(TIMES);
    \/    return(DIV);
    \(    return(LPAREN);
    \)    return(RPAREN);
The token names are defined in the yacc file.

Compilers: Yacc/7 42 exprVars.l (continued)
    {letter}  { yylval = *yytext - 'a'; return(VAR); }
    {digits}  { yylval = atoi(yytext); return(NUMBER); }
    [ \t]     ;   /* skip whitespace */
    .         yyerror("Invalid char");   /* reject everything else */
    %%
    int yywrap(void) { return 1; }

Compilers: Yacc/7 43 exprVars.y
    /* Added: token names, assoc/precedence ops, changed grammar rules,
       vars and assignment. */
    %token VAR NUMBER ASSIGN PLUS MINUS TIMES DIV LPAREN RPAREN
    %left PLUS MINUS
    %left TIMES DIV
    %{
    int symbol[26];    // stores var's values
    %}
    %%

Compilers: Yacc/7 44 exprVars.y (rules)
    program: program statement '\n'
           |
           ;
    statement: expr              { printf("Value = %d\n", $1); }
             | VAR ASSIGN expr   { symbol[$1] = $3; }
             ;
    expr: NUMBER
        | VAR                    { $$ = symbol[$1]; }
        | expr PLUS expr         { $$ = $1 + $3; }
        | expr MINUS expr        { $$ = $1 - $3; }
        | expr TIMES expr        { $$ = $1 * $3; }
        | expr DIV expr          { $$ = $1 / $3; }   /* integer division */
        | LPAREN expr RPAREN     { $$ = $2; }
        ;
    %%

Compilers: Yacc/7 45 exprVars.y (C code)
    #include "lex.yy.c"
    int yyerror(char *s)
    {
        fprintf(stderr, "%s\n", s);
        return 0;
    }
    int main(void)
    {
        yyparse();
        return 0;
    }

Compilers: Yacc/7 46 7. Dangling Else Conflict
In iffy.y:
    %token IF ELSE variable
    %%
    stmt: expr
        | if_stmt
        ;
    if_stmt: IF expr stmt
           | IF expr stmt ELSE stmt
           ;
    expr: variable
        ;
Running bison:
    $ bison -v iffy.y
    iffy.y: conflicts: 1 shift/reduce

Compilers: Yacc/7 47 Shift or Reduce?
Current state:
- IF expr IF expr stmt . ELSE stmt
Shift choice:
- IF expr IF expr stmt . ELSE stmt
- IF expr IF expr stmt ELSE . stmt
- IF expr IF expr stmt ELSE stmt .
- IF expr stmt .
The ELSE is paired with the second IF:
    if (x < 5)
        if (x < 3)
            y = a - b;
        else
            y = b - a;

Compilers: Yacc/7 48
Reduce option:
- IF expr IF expr stmt . ELSE stmt
- IF expr stmt . ELSE stmt
- IF expr stmt ELSE . stmt
- IF expr stmt ELSE stmt .
The ELSE is paired with the first IF:
    if (x < 5)
        if (x < 3)
            y = a - b;
    else
        y = b - a;

Compilers: Yacc/7 49 Inside iffy.output
    State 8 conflicts: 1 shift/reduce
    Grammar
        0 $accept: stmt $end
        1 stmt: expr
        2     | if_stmt
        3 if_stmt: IF expr stmt
        4        | IF expr stmt ELSE stmt
        5 expr: variable
    :    // many state blocks

Compilers: Yacc/7 50
    state 8
        3 if_stmt: IF expr stmt .
        4        | IF expr stmt . ELSE stmt
        ELSE      shift, and go to state 9
        ELSE      [reduce using rule 3 (if_stmt)]
        $default  reduce using rule 3 (if_stmt)
- the numbered items are the kinds of parsing states bison is looking at
- the shift line is what bison does
- the bracketed reduce line is what it could do instead

Compilers: Yacc/7 51 8. Left and Right Recursion
A left recursive rule:
    list: item
        | list ',' item
        ;
A right recursive rule:
    list: item
        | item ',' list
        ;
Left recursion keeps the parse stack smaller, so may be a better choice
- this is the opposite of top-down parsing

Compilers: Yacc/7 52 9. Error Recovery
When an error occurs, yacc/bison calls yyerror() and then terminates.
A better approach is to call yyerror(), then try to continue
- this can be done by using the keyword error in the grammar rules

Compilers: Yacc/7 53 Example
If there's an error in the stmt rule, then skip the rest of the input tokens until ';' or '}' is seen, then continue as before:
    stmt: ';'
        | expr ';'
        | VAR '=' expr ';'
        | '{' stmt_list '}'
        | error ';'
        | error '}'
        ;

Compilers: Yacc/7 54 10. Embedded Actions
Actions can be placed anywhere in a rule, not just at the end:
    listPair: item1 { do_item1($1); } item2 { do_item2($3); }
- the action variable in the second action block is $3 since the first action block is counted as part of the rule

Compilers: Yacc/7 55 11. More Information
Lex and Yacc, by Levine, Mason, and Brown; O'Reilly, 2nd edition (in our library)
On UNIX:
- man yacc
- info yacc

Compilers: Yacc/7 56
A Compact Guide to Lex & Yacc, by Tom Niemann
- with several yacc calculator examples, which I'll be discussing in the next few chapters
The Lex & Yacc Page
- documentation and tools

Compilers: Yacc/7 57
Compiler Construction using Flex and Bison, by Anthony A. Aaby
- in the "Useful Info" subdirectory of the course website