
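Application: Yacc
Yacc is a parser generator: it takes a context-free grammar as input and produces an LR parser (more precisely, an LALR(1) parser).
A Yacc input file has three sections, separated by lines containing only %%:
    ... definitions ...
    %%
    ... production rules ...
    %%
    ... user-defined functions ...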

Definitions
Anything enclosed between %{ and %} in this section is copied verbatim into y.tab.c (the C source for the parser). All #include and #define directives, all variable declarations, all function declarations and any comments should be placed here.
Every terminal symbol used in the grammar must be declared in this section as a token, e.g.
    %token VERB_T
    %token NOUN_T
Convention: tokens are written in upper case, ending in "_T", while non-terminals are written in mixed case. Non-terminals do not need to be declared in advance.

Productions
To specify the productions A -> a b c | e f g we write
    A : a b c | e f g ;
Use plenty of white space for readability:
    A_nt : A_T B_T C_T
         | E_T F_T G_T
         ;
Comments can be included in C syntax:
    /* A rewrites to abc or efg */

Functions Section
This section contains the user-defined main() routine, plus any other required functions. It is usual to include:
    lexerr() - called if the lexical analyser finds an undefined token. The default case in the lexical analyser must therefore call this function.
    yyerror(char*) - called if the parser cannot recognise the syntax of part of the input. The parser passes a string describing the type of error.
When the lexical analyser is generated by Lex, the current input line number is held in yylineno and the text of the last token read is held in yytext.

Running Yacc
    % yacc -d -v my_prog.y
    % cc y.tab.c -ly
The -d option creates the file "y.tab.h", which contains a #define statement for each terminal declared. Place #include "y.tab.h" between the %{ and %} to use the tokens in the functions section. The -v option creates the file "y.output", which describes the generated parser and is useful for debugging.
We can use Lex to create the lexical analyser. If so, we should also place #include "y.tab.h" in Lex's definitions section, and we must link the parser and lexer together with both libraries (-ly and -ll).

Errors
Yacc cannot accept ambiguous grammars, nor can it accept grammars requiring two or more symbols of lookahead. The two most common error messages are:
    shift-reduce conflict
    reduce-reduce conflict
In the first case the parser has a choice between shifting the next symbol from the input and reducing the symbols currently on top of the stack. In the second case the parser has a choice between two or more rules by which to reduce the stack.

Example Errors
    Expr : INT_T | Expr '+' Expr ;
causes a shift-reduce conflict, because INT_T + INT_T + INT_T can be parsed in two ways.
    Animal : Dog | Cat ;
    Dog : FRED_T ;
    Cat : FRED_T ;
causes a reduce-reduce conflict, because FRED_T can be parsed in two ways.

Errors (cont.)
Do not leave conflicts uncorrected. A parser will still be generated, but it may produce unexpected results. Study the file "y.output" to find out where the conflicts occur. It is very unlikely that the language you are trying to define is unsuitable for Yacc: the SUN C compiler and the Berkeley PASCAL compiler were both written using Yacc. You should be able to change your grammar rules to get an unambiguous grammar.

Example Yacc Script
    S   -> NP VP
    NP  -> Det NP1 | PN
    NP1 -> Adj NP1 | N
    Det -> a | the
    PN  -> peter | paul | mary
    Adj -> large | grey
    N   -> dog | cat | male | female
    VP  -> V NP
    V   -> is | likes | hates
We want to write a Yacc script which will handle files containing multiple sentences from this grammar. Each sentence is terminated by a ".". Change the first production to
    S -> NP VP .
and add
    D -> S D | S

The Lex Script
%{
/* simple part of speech lexer */
#include "y.tab.h"
void lexerr();   /* defined in the Yacc functions section */
%}
L [a-zA-Z]
%%
[ \t\n]+                ; /* ignore space */
is|likes|hates          return VERB_T;
a|the                   return DET_T;
dog|cat|male|female     return NOUN_T;
peter|paul|mary         return PROPER_T;
large|grey              return ADJ_T;
\.                      return PERIOD_T;
{L}+                    lexerr();
.                       lexerr();
%%

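Yacc Definitions
%{
/* simple natural language grammar */
#include <stdio.h>
#include <stdlib.h>   /* for exit() */
#include "y.tab.h"
extern int yyleng;
extern char yytext[];
extern int yylineno;
extern int yyval;
extern int yyparse();
extern int yylex();   /* supplied by the Lex script */
void yyerror(char *s);
void lexerr();
%}
%token DET_T
%token NOUN_T
%token PROPER_T
%token VERB_T
%token ADJ_T
%token PERIOD_T
%%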

Yacc rules
/* a document is a sentence followed by a document, or is empty */
Doc : Sent Doc
    | /* empty */
    ;
Sent : NounPhrase VerbPhrase PERIOD_T ;
NounPhrase : DET_T NounPhraseUn
           | PROPER_T
           ;
NounPhraseUn : ADJ_T NounPhraseUn
             | NOUN_T
             ;
VerbPhrase : VERB_T NounPhrase ;
%%

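User-defined functions
/* called by the lexer when it meets input that is not a valid token */
void lexerr()
{
    printf("Invalid input '%s' at line %i\n", yytext, yylineno);
    exit(1);
}
/* called by the parser with a message describing the syntax error */
void yyerror(char *s)
{
    (void)fprintf(stderr, "%s at line %i, last token: %s\n", s, yylineno, yytext);
}
int main(void)
{
    /* yyparse() returns 0 when the whole input matches the grammar */
    if (yyparse() == 0)
        printf("Parse OK\n");
    else
        printf("Parse Failed\n");
    return 0;
}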

Running the example
    % yacc -d -v parser.y
    % cc -c y.tab.c
    % lex parser.l
    % cc -c lex.yy.c
    % cc y.tab.o lex.yy.o -o parser -ly -ll
file1:
    peter is a large grey cat.
    the dog is a female.
    paul is peter.
file2:
    the cat is mary.
    a dogcat is a male.
file3:
    peter is male.
    mary is a female.
Results:
    % parser < file1
    Parse OK
    % parser < file2
    Invalid input 'dogcat' at line 2
    % parser < file3
    syntax error at line 1, last token: male