Compiler Structures 7. Yacc Objectives , Semester 2,


1 Compiler Structures 7. Yacc
242-437, Semester 2, 2018-2019
Objectives: describe yacc (actually bison); give simple examples of its use

2 Overview
1. What is Yacc?
2. Format of a yacc/bison File
3. Expressions Compiler
4. Bottom-up Parsing Reminder
5. Expression Conflicts
6. Precedence/Associativity in yacc
continued

3 Overview (continued)
7. Dangling Else Conflict
8. Left and Right Recursion
9. Error Recovery
10. Embedded Actions
11. More Information

4 1. What is Yacc?
Yacc (Yet Another Compiler Compiler) is a tool for translating a context-free grammar into a bottom-up LALR parser.
It creates a parse table like that described in the last chapter.
Yacc is used with lex to create compilers.
continued

5 Most people use bison, a much improved version of yacc
On most modern Unixes, when you call yacc, you're really using bison.
bison works with flex (the fast version of lex).

6 Bison and Flex
foo.l (a flex file) --flex--> lex.yy.c
foo.y (a bison file) --bison--> foo.tab.c, which #includes lex.yy.c
foo.tab.c --C compiler--> foo (the executable)
source program --foo--> parsed output

$ flex foo.l
$ bison foo.y
$ gcc foo.tab.c -o foo
$ ./foo < program.txt

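7 Compiler Components (in foo)
Source Program -> lex.yy.c, Lexical Analyzer (using chars) -> foo.tab.c, Syntax Analyzer (using tokens) -> parsed output
1. The syntax analyzer gets the next token by calling yylex().
2. The lexical analyzer gets chars to make a token.
3. The lexical analyzer returns the token, token value, and token type.
The lexical analyzer reports lexical errors; the syntax analyzer reports syntax errors.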

8 Inside foo.tab.c
LALR Parser
input tokens: a1 a2 … ai … an $
stack (top first): Xm sm, Xm-1 sm-1, ..., X0 s0
Parse table (actions and gotos): bison creates this based on your grammar
output: parsed output
Each X is a terminal or non-terminal; each s is a state.

9 2. Format of a yacc/bison File
declarations: C data and yacc definitions (or nothing)
%%
Grammar rules (with actions)
%%
#include "lex.yy.c"
C functions, including main()

10 Declarations
C data is put between %{ and %}.
The yacc definitions list the tokens (terminals) used in the grammar:
    %token terminal1 terminal2 ...
Other yacc definitions:
    %left and %right for associativity
    %prec for precedence (written inside a rule to override its precedence)

11 Precedence example: 2 + 3 * 5
does it mean (2 + 3) * 5 or 2 + (3 * 5)?
Associativity example: 1 - 1 - 1
does it mean (1 - 1) - 1 (left) or 1 - (1 - 1) (right)?

12 Rules
Rule format:
    nonterminal : body 1 {action 1}
                | body 2 {action 2}
                . . .
                | body n {action n}
                ;
The grammar part is the same as: nonterminal → body1 | body2 | ... | bodyN
Actions are optional; they are C code.
Actions are usually at the end of a body, but can be placed anywhere.

13 3. Expressions Compiler
expr.l (a flex file) --flex--> lex.yy.c
expr.y (a bison file) --bison--> expr.tab.c, which #includes lex.yy.c
expr.tab.c --gcc--> exprEval (the executable)

$ flex expr.l
$ bison expr.y
$ gcc expr.tab.c -o exprEval

14 Usage
$ ./exprEval
Value =
(5 * 2)
Value = -8
1 / 3
Value = 0
$
I typed the expression lines, then ctrl-D to end the input.

15 expr.l
%%
[-+*/()\n]   { return *yytext; }
[0-9]+       { yylval = atoi(yytext); return(NUMBER); }
[ \t]        ;   /* skip whitespace */
%%
int yywrap(void) { return 1; }

RE actions usually end with a return. The token is passed to the syntax analyser.
There is no main() function in the flex file.
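For comparison, the scanner described by these rules behaves roughly like the hand-written C function below. This is only a sketch of the idea, not what flex generates: the names next_token, token_value and NUMBER_TOK, and the string-cursor interface, are assumptions made for illustration (the real function is yylex(), and the attribute goes into yylval).

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token code; bison defines NUMBER itself in expr.tab.c. */
enum { NUMBER_TOK = 258 };

/* Plays the role of yylval: the attribute of the last NUMBER token. */
static int token_value;

/* Return the next token from *src and advance the cursor.
   Single-character tokens are returned as their own character code,
   NUMBER_TOK is returned for a run of digits, and 0 at end of input. */
static int next_token(const char **src)
{
    assert(src != NULL && *src != NULL);
    const char *p = *src;

    while (*p == ' ' || *p == '\t')      /* skip whitespace */
        p++;

    if (*p == '\0') {                    /* end of input */
        *src = p;
        return 0;
    }

    if (isdigit((unsigned char)*p)) {    /* one or more digits: a NUMBER */
        token_value = 0;
        while (isdigit((unsigned char)*p))
            token_value = token_value * 10 + (*p++ - '0');
        *src = p;
        return NUMBER_TOK;
    }

    /* Operators, parentheses and newline: return the character itself.
       Unlike flex, this sketch also passes unknown characters through. */
    *src = p + 1;
    return (unsigned char)*p;
}
```

Calling next_token() repeatedly on the string "12 + 3\n" yields NUMBER_TOK (with token_value 12), '+', NUMBER_TOK (with token_value 3), '\n', and finally 0.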

16 Lex File Format Reminder
A lex program has three sections:
    REs and/or C code
    %%
    RE/action rules
    %%
    C functions

17 expr.y
%{
#include <stdio.h>
int yylex(void);
int yyerror(char *s);
%}
%token NUMBER                                            /* declarations */
%%
exprs: expr '\n'         { printf("Value = %d\n", $1); }  /* $1, $2 are attributes */
     | exprs expr '\n'   { printf("Value = %d\n", $2); }
     ;
expr: expr '+' term      { $$ = $1 + $3; }                /* rules */
    | expr '-' term      { $$ = $1 - $3; }
    | term               { $$ = $1; }
    ;
continued

18 more rules
term: term '*' factor    { $$ = $1 * $3; }
    | term '/' factor    { $$ = $1 / $3; }   /* integer division */
    | factor
    ;
factor: '(' expr ')'     { $$ = $2; }
      | NUMBER
      ;
continued

19 c code
%%
#include "lex.yy.c"

int yyerror(char *s)
{
    fprintf(stderr, "%s\n", s);
    return 0;
}

int main(void)
{
    yyparse();   // the syntax analyzer
    return 0;
}

20 Yacc Actions
yacc actions (the C code) can use attributes (variables).
Each terminal/non-terminal in a body has an attribute, which contains its return value.

21 Attributes
An attribute is $n, where n is the position of the terminal/non-terminal in the body, starting at 1:
    $1 = first terminal/non-terminal of the body
    $2 = second one
    etc.
    $$ = return value for the rule
The default value for $$ is the $1 value.

22 Evaluation in yacc
Input: 3 * 5 + 4\n
(Es, E, T, F stand for exprs, expr, term, factor; num is NUMBER.)

Stack      Input       Action             Rule
$          3*5+4\n$    shift
$ 3        *5+4\n$     reduce F → num     $$ = $1 (implicit)
$ F        *5+4\n$     reduce T → F       $$ = $1 (implicit)
$ T        *5+4\n$     shift
$ T *      5+4\n$      shift
$ T * 5    +4\n$       reduce F → num     $$ = $1 (implicit)
$ T * F    +4\n$       reduce T → T * F   $$ = $1 * $3
$ T        +4\n$       reduce E → T       $$ = $1 (implicit)
$ E        +4\n$       shift
$ E +      4\n$        shift
$ E + 4    \n$         reduce F → num     $$ = $1 (implicit)
$ E + F    \n$         reduce T → F       $$ = $1 (implicit)
$ E + T    \n$         reduce E → E + T   $$ = $1 + $3
$ E        \n$         shift
$ E \n     $           reduce Es → E \n   printf $1
$ Es       $           accept
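The attribute calculations in this trace can be replayed with a small value stack. The C sketch below is an illustration under stated assumptions, not bison's generated code: shift, reduce, attr and replay_trace are hypothetical helper names, and operator tokens are given a dummy attribute of 0.

```c
#include <assert.h>

#define MAX_DEPTH 16

static int vals[MAX_DEPTH];  /* attribute of each symbol on the parse stack */
static int depth = 0;        /* number of symbols currently on the stack    */

/* shift: push a token's attribute (operators carry a dummy 0) */
static void shift(int value)
{
    assert(depth < MAX_DEPTH);
    vals[depth++] = value;
}

/* $k for a body of length n whose last symbol is on top of the stack */
static int attr(int n, int k)
{
    assert(k >= 1 && k <= n && n <= depth);
    return vals[depth - n + (k - 1)];
}

/* reduce: pop the n body symbols and push the rule's $$ in their place */
static void reduce(int n, int result)
{
    assert(n >= 1 && n <= depth);
    depth -= n;
    vals[depth++] = result;
}

/* Replay the shifts and reduces for "3 * 5 + 4", up to the final expr. */
static int replay_trace(void)
{
    depth = 0;
    shift(3);                              /* num                   */
    reduce(1, attr(1, 1));                 /* F -> num   ($$ = $1)  */
    reduce(1, attr(1, 1));                 /* T -> F     ($$ = $1)  */
    shift(0);                              /* times operator        */
    shift(5);                              /* num                   */
    reduce(1, attr(1, 1));                 /* F -> num              */
    reduce(3, attr(3, 1) * attr(3, 3));    /* T -> T times F        */
    reduce(1, attr(1, 1));                 /* E -> T                */
    shift(0);                              /* plus operator         */
    shift(4);                              /* num                   */
    reduce(1, attr(1, 1));                 /* F -> num              */
    reduce(1, attr(1, 1));                 /* T -> F                */
    reduce(3, attr(3, 1) + attr(3, 3));    /* E -> E plus T         */
    return vals[depth - 1];                /* attribute of E        */
}
```

replay_trace() returns 19, the value that the exprs rule would print for this input.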

23 4. Bottom-up Parsing Reminder
Simple expressions grammar:
    E => E '+' E   // rule r1
    E => E '*' E   // rule r2
    E => id        // rule r3

24 Parsing "x + y * z"
     1. . x + y * z       // shift
     2. x . + y * z       // reduce(r3)
     3. E . + y * z       // shift
     4. E + . y * z       // shift
     5. E + y . * z       // reduce(r3)
     6. E + E . * z       // shift
     7. E + E * . z       // shift
     8. E + E * z .       // reduce(r3)
     9. E + E * E .       // reduce(r2)
    10. E + E .           // reduce(r1)
    11. E .               // accept

25 Shift/Reduce Conflict
At step 6, a shift or a reduce is possible:
    6. E + E . * z   // reduce (r1)
    7. E . * z
    :
What should be done? By default, yacc (bison) shifts.

26 Reduce/Reduce Conflict
Modify the grammar to include:
    E => T    // new rule r3
    E => id   // rule r4
    T => id   // rule r5
continued

27 There are two ways to reduce
Consider step 2: x . + y * z
There are two ways to reduce:
    E . + y * z   // reduce (r4)
or
    T . + y * z   // reduce (r5)
What should be done? By default, yacc (bison) reduces using the first possible rule (i.e. rule r4).

28 Common Conflicts
The two most common shift/reduce problems in programming languages are:
    expression precedence
    dangling else
yacc has features for fixing both of these.
Reduce/reduce problems are usually due to errors in your grammar.

29 Debugging Conflicts
bison can generate extra conflict information, which can help you debug your grammar: use the -v option.

30 5. Expression Conflicts
in shiftE.y:
%token NUMBER
%%
expr: expr '+' expr      /* shift/reduce here, as in previous example */
    | expr '*' expr
    | '(' expr ')'
    | NUMBER
    ;
continued

31 shiftE.y (continued)
%%
#include "lex.yy.c"

int yyerror(char *s)
{
    fprintf(stderr, "%s\n", s);
    return 0;
}

int main(void)
{
    yyparse();
    return 0;
}

32 Example
When the parsing state is:
    expr '+' expr . '*' z
should bison shift:
    expr '+' expr '*' . z
or reduce (using rule 1)?
    expr . '*' z

33 Using -v
$ bison shiftE.y
shiftE.y: conflicts: 4 shift/reduce
$ bison -v shiftE.y
The -v option creates a shiftE.output file with extra conflict information.

34 Inside shiftE.output
State 9 conflicts: 2 shift/reduce
State 10 conflicts: 2 shift/reduce

Grammar
    0 $accept: expr $end
    1 expr: expr '+' expr
    2     | expr '*' expr
    3     | '(' expr ')'
    4     | NUMBER
: // many state blocks

States 9 and 10 are the problems. The rules are numbered.
continued

35 state 9
When bison is looking at these kinds of parsing states:
    1 expr: expr . '+' expr
    1     | expr '+' expr .
    2     | expr . '*' expr

    '+'       shift, and go to state 6          (bison does this)
    '*'       shift, and go to state 7          (bison does this)
    '+'       [reduce using rule 1 (expr)]      (but it could do this)
    '*'       [reduce using rule 1 (expr)]      (but it could do this)
    $default  reduce using rule 1 (expr)
continued

36 state 10
When bison is looking at these kinds of parsing states:
    1 expr: expr . '+' expr
    2     | expr . '*' expr
    2     | expr '*' expr .

    '+'       shift, and go to state 6          (bison does this)
    '*'       shift, and go to state 7          (bison does this)
    '+'       [reduce using rule 2 (expr)]      (but it could do this)
    '*'       [reduce using rule 2 (expr)]      (but it could do this)
    $default  reduce using rule 2 (expr)

37 What causes Expression Conflicts?
The problems are the precedence and associativity of the operators:
    does 2 + 3 * 5 mean (2 + 3) * 5 or 2 + (3 * 5)?   // should be 2nd
    does 1 - 1 - 1 mean (1 - 1) - 1 or 1 - (1 - 1)?   // should be 1st
* should have higher precedence than +, and - should be left associative.

38 6. Precedence/Associativity in yacc
The declarations section can contain associativity and precedence settings for tokens:
    %left, %right, %nonassoc
Precedence is given by the order of the lines: later lines have higher precedence.
Example:
    %left '+' '-'
    %left '*' '/'
All left associative, with '*' and '/' higher precedence than '+' and '-'.
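With such declarations, a shift/reduce conflict is settled by comparing the precedence of the rule that could be reduced with that of the lookahead token: reduce if the rule is higher, shift if the token is higher, and reduce on a tie when the operator is left associative. The C sketch below applies that decision in a tiny evaluator. It is an illustration only, assuming unsigned integer operands and no parentheses; prec, apply and eval are hypothetical names, not part of yacc.

```c
#include <assert.h>
#include <ctype.h>

/* Precedence level given by the order of the %left lines:
   '+' '-' are level 1, '*' '/' are level 2, anything else is 0. */
static int prec(char op)
{
    switch (op) {
    case '+': case '-': return 1;
    case '*': case '/': return 2;
    default:            return 0;
    }
}

static int apply(int a, char op, int b)
{
    switch (op) {
    case '+': return a + b;
    case '-': return a - b;
    case '*': return a * b;
    default:  return a / b;   /* integer division, as in the slides */
    }
}

/* Evaluate unsigned integers joined by + - * / (no parentheses). */
static int eval(const char *s)
{
    int  vals[32]; int nv = 0;   /* operand stack  */
    char ops[32];  int no = 0;   /* operator stack */

    for (;;) {
        while (*s == ' ') s++;
        assert(isdigit((unsigned char)*s) && nv < 32);
        int n = 0;
        while (isdigit((unsigned char)*s))
            n = n * 10 + (*s++ - '0');
        vals[nv++] = n;
        while (*s == ' ') s++;

        char look = *s;   /* lookahead: an operator or end of input */

        /* Reduce while the operator on the stack has higher precedence
           than the lookahead, or equal precedence (left associative). */
        while (no > 0 && prec(ops[no - 1]) >= prec(look)) {
            int b = vals[--nv];
            int a = vals[--nv];
            vals[nv++] = apply(a, ops[--no], b);
        }
        if (look == '\0')
            break;
        assert(prec(look) > 0 && no < 32);
        ops[no++] = look;   /* otherwise shift the operator */
        s++;
    }
    return vals[0];
}
```

Here eval("2 + 3 * 5") gives 17 because '*' outranks '+' and is shifted, while eval("1 - 1 - 1") gives -1 because the tie between the two '-' operators is resolved by reducing first.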

39 Expressions Variables Compiler
exprVars.l (a flex file) --flex--> lex.yy.c
exprVars.y (a bison file) --bison--> exprVars.tab.c, which #includes lex.yy.c
exprVars.tab.c --gcc--> exprVarsEval (the executable)

$ flex exprVars.l
$ bison exprVars.y
$ gcc exprVars.tab.c -o exprVarsEval

40 Usage
$ ./exprVarsEval
* 3
Value =
Value = -1
a = 3 * 4
a
Value = 12
b = (3 - 6) * a
b
Value = -36
$
I typed the expression lines, then ctrl-D to end the input.

41 exprVars.l
/* Added: RE vars, token names, VAR token, assignment, error msgs */
digits  [0-9]+
letter  [a-z]
%%
\n      return('\n');
\=      return(ASSIGN);     /* the token names are defined in the yacc file */
\+      return(PLUS);
\-      return(MINUS);
\*      return(TIMES);
\/      return(DIV);
\(      return(LPAREN);
\)      return(RPAREN);
continued

42 exprVars.l (continued)
{letter}  { yylval = *yytext - 'a'; return(VAR); }
{digits}  { yylval = atoi(yytext); return(NUMBER); }
[ \t]     ;                          /* skip whitespace */
.         yyerror("Invalid char");   /* reject everything else */
%%
int yywrap(void) { return 1; }

43 exprVars.y
/* Added: token names, assoc/precedence ops, changed grammar rules, vars and assignment. */
%token VAR NUMBER ASSIGN PLUS MINUS TIMES DIV LPAREN RPAREN
%left PLUS MINUS
%left TIMES DIV
%{
#include <stdio.h>
int yylex(void);
int yyerror(char *s);
int symbol[26];   // stores var's values
%}
%%
continued

44 exprVars.y (continued)
program: program statement '\n'
       |
       ;
statement: expr               { printf("Value = %d\n", $1); }
         | VAR ASSIGN expr    { symbol[$1] = $3; }
         ;
expr: NUMBER
    | VAR                     { $$ = symbol[$1]; }
    | expr PLUS expr          { $$ = $1 + $3; }
    | expr MINUS expr         { $$ = $1 - $3; }
    | expr TIMES expr         { $$ = $1 * $3; }
    | expr DIV expr           { $$ = $1 / $3; }   /* integer division */
    | LPAREN expr RPAREN      { $$ = $2; }
    ;
%%
continued

45 exprVars.y (continued)
#include "lex.yy.c"

int yyerror(char *s)
{
    fprintf(stderr, "%s\n", s);
    return 0;
}

int main(void)
{
    yyparse();
    return 0;
}

46 7. Dangling Else Conflict
in iffy.y:
%token IF ELSE variable
%%
stmt: expr
    | if_stmt
    ;
if_stmt: IF expr stmt
       | IF expr stmt ELSE stmt
       ;
expr: variable
    ;

$ bison -v iffy.y
iffy.y: conflicts: 1 shift/reduce

47 Shift or Reduce?
if (x < 5) if (x < 3) y = a - b; else y = b - a;
Current state:
    IF expr IF expr stmt . ELSE stmt
Shift choice:
    IF expr IF expr stmt ELSE . stmt
    IF expr IF expr stmt ELSE stmt .
    IF expr stmt .
The ELSE is paired with the second IF.
continued

48 Reduce option
if (x < 5) if (x < 3) y = a - b; else y = b - a;
Reduce option:
    IF expr IF expr stmt . ELSE stmt
    IF expr stmt . ELSE stmt
    IF expr stmt ELSE . stmt
    IF expr stmt ELSE stmt .
The ELSE is paired with the first IF.

49 Inside iffy.output
State 8 conflicts: 1 shift/reduce

Grammar
    0 $accept: stmt $end
    1 stmt: expr
    2     | if_stmt
    3 if_stmt: IF expr stmt
    4        | IF expr stmt ELSE stmt
    5 expr: variable
: // many state blocks
continued

50 state 8
When bison is looking at these kinds of parsing states:
    3 if_stmt: IF expr stmt .
    4        | IF expr stmt . ELSE stmt

    ELSE      shift, and go to state 9            (bison does this)
    ELSE      [reduce using rule 3 (if_stmt)]     (but it could do this)
    $default  reduce using rule 3 (if_stmt)
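C itself resolves the dangling else in the same way as bison's default shift: the else is attached to the nearest if. The short function below, a sketch whose name nested and whose initial value 0 are assumptions for illustration, wraps the fragment from the slides so the pairing can be observed.

```c
#include <assert.h>

/* The slide's fragment. C attaches the else to the inner if (x < 3),
   which corresponds to the shift choice. */
static int nested(int x, int a, int b)
{
    int y = 0;               /* stays 0 if neither assignment runs */
    if (x < 5)
        if (x < 3)
            y = a - b;
        else
            y = b - a;
    return y;
}
```

With a = 10 and b = 4, nested(2, 10, 4) is 6 and nested(4, 10, 4) is -6, while nested(7, 10, 4) stays 0. If the else were paired with the first if (the reduce option), the last call would instead give -6.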

51 8. Left and Right Recursion
A left recursive rule:
    list: item
        | list ',' item
        ;
A right recursive rule:
    list: item
        | item ',' list
        ;
Left recursion keeps the parse stack smaller, so may be a better choice.
This is the opposite of top-down parsing, which cannot handle left recursion.
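The difference in stack use can be checked by counting symbols. The C sketch below does not run a real parser; it only counts the shifts and reduces a bottom-up parser would perform on n comma-separated items under each rule, and the function names are hypothetical.

```c
#include <assert.h>

/* Maximum parse-stack depth for n items using
       list: item | list ',' item ;
   Each new item can be reduced into the list as soon as it is shifted. */
static int max_depth_left(int n)
{
    assert(n >= 1);
    int depth = 0, max = 0;
    for (int i = 0; i < n; i++) {
        if (i > 0)
            depth++;                 /* shift ','                    */
        depth++;                     /* shift item                   */
        if (depth > max)
            max = depth;
        depth = 1;                   /* reduce: one list remains     */
    }
    return max;
}

/* Maximum parse-stack depth for the same input using
       list: item | item ',' list ;
   Nothing can be reduced until the last item has been shifted. */
static int max_depth_right(int n)
{
    assert(n >= 1);
    int depth = 0, max = 0;
    for (int i = 0; i < n; i++) {
        if (i > 0)
            depth++;                 /* shift ','                    */
        depth++;                     /* shift item                   */
        if (depth > max)
            max = depth;
    }
    /* The last item becomes a list (depth unchanged), then each
       "item ',' list" is reduced to a list, removing two symbols. */
    for (int i = 1; i < n; i++)
        depth -= 2;
    assert(depth == 1);
    return max;
}
```

For a list of 100 items, max_depth_left(100) is 3 whereas max_depth_right(100) is 199, so the right recursive stack grows with the length of the input.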

52 9. Error Recovery
When an error occurs, yacc/bison calls yyerror() and then terminates.
A better approach is to call yyerror(), then try to continue.
This can be done by using the keyword error in the grammar rules.

53 Example
If there's an error in the stmt rule, then skip the rest of the input tokens until ';' or '}' is seen, then continue as before:
    stmt: ';'
        | expr ';'
        | VAR '=' expr ';'
        | '{' stmt_list '}'
        | error ';'
        | error '}'
        ;
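The token-skipping part of this recovery can be pictured with the C sketch below. It is a simplification written for illustration, not bison's algorithm: the real parser also pops its stack back to a state where the error token is allowed. The function name resync and the token-array interface are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Starting at position pos in an array of n tokens, discard tokens until
   a synchronising token (';' or '}') has been consumed. Returns the index
   of the token after it, or n if the input runs out first. */
static size_t resync(const int *tokens, size_t n, size_t pos)
{
    assert(tokens != NULL && pos <= n);
    while (pos < n) {
        int t = tokens[pos++];
        if (t == ';' || t == '}')
            break;               /* parsing can continue from here */
    }
    return pos;
}
```

For the token sequence x = + ; y = 1 ; an error detected at the '+' (index 2) resumes at index 4, the start of the next statement.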

54 10. Embedded Actions
Actions can be placed anywhere in a rule, not just at the end:
    listPair: item1 { do_item1($1); } item2 { do_item2($3); }
            ;
The action variable in the second action block is $3, since the first action block is counted as part of the rule.

55 11. More Information
Lex and Yacc by Levine, Mason, and Brown; O'Reilly; 2nd edition (in our library)
On UNIX: man yacc, info yacc
continued

56 A Compact Guide to Lex & Yacc by Tom Niemann
http://epaperpress
with several yacc calculator examples, which I'll be discussing in the next few chapters
The Lex & Yacc Page: documentation and tools
continued

57 Compiler Construction using Flex and Bison by Anthony A. Aaby
in the "Useful Info" subdirectory of the course website

