Applications of Context-Free Grammars (CFG) Parsers. The YACC Parser-Generator. by: Saleh Al-shomrani.



2 (1) Parsers
Parsers are programs that create parse trees from source programs. Many aspects of a programming language have a structure that can be described by regular expressions (REs); identifiers, for example, can be described by an RE and recognized by a lex-generated analyzer. However, some very important aspects of programming languages cannot be represented by REs: typical languages use parentheses and/or brackets in a nested and balanced fashion.

Example #1: The grammar G_bal = ({B}, {(, )}, P, B), where P consists of:

    B -> BB | (B) | ε

Example #2: A grammar that generates the possible sequences of if and else in C (represented as i and e, respectively) is:

    S -> SS | iS | iSeS | ε

Q: Which of the following strings can be generated by the above grammar, and why: ieie, iie, ei, iei, ieeii? How about iieie?

3 The answer for the last one is yes, because iieie corresponds to a C program whose structure is like:

    if (Condition) {
        …
        if (Condition) Statement; else Statement;
        …
        if (Condition) Statement; else Statement;
        …
    }
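The remaining strings in the question can be checked mechanically. The sketch below is one possible test, not part of the original slides: it relies on the observation that a string over {i, e} is derivable from S -> SS | iS | iSeS | ε exactly when no prefix contains more e's than i's, since every else must attach to an earlier if that is still unmatched. The function name derivable_if_else is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true when s, a string over {i, e}, can be derived from
 *     S -> SS | iS | iSeS | ε
 * Assumption: this is equivalent to "no prefix has more e's than i's". */
bool derivable_if_else(const char *s)
{
    assert(s != NULL);
    int unmatched_ifs = 0;          /* i's that have not been paired with an e yet */
    for (size_t k = 0; s[k] != '\0'; k++) {
        if (s[k] == 'i') {
            unmatched_ifs++;
        } else if (s[k] == 'e') {
            if (unmatched_ifs == 0)
                return false;       /* an else with no if to attach to */
            unmatched_ifs--;
        } else {
            return false;           /* symbol outside the alphabet {i, e} */
        }
    }
    return true;                    /* leftover i's are fine: the else is optional */
}
```

Under that assumption, ieie, iie, iei and iieie are accepted, while ei and ieeii are rejected because each contains an else that has no unmatched if before it.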

4 Compilation Sequence

    Source code:     a = b + c * d
        -> Lexical Analyzer ->
    Tokens:          id1 = id2 + id3 * id4
        -> Syntax Analyzer ->
    Syntax tree:     = ( id1, + ( id2, * ( id3, id4 ) ) )
        -> Code Generator ->
    Generated code:  load  id3
                     mul   id4
                     add   id2
                     store id1

5 (2) The YACC Parser-Generator
Yacc and lex are closely related, so it is no surprise that the two program generators are often used in combination. The structure of a yacc program closely resembles the structure of a lex program. A yacc program has the following structure:

    declarations
    %%
    rules
    %%
    programs

A yacc program describes the production rules of a context-free grammar and usually has a ".y" suffix. Yacc generates a procedure yyparse() that processes a stream of tokens produced by yylex() and attempts to match a sentence in the specified language. Notice that the parser, yyparse(), calls the scanner whenever it needs the next token. The scanner is called yylex(); it may or may not be generated by lex. The output of yacc is placed in a file called y.tab.c, unless otherwise specified.
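Because the scanner need not come from lex, a hand-written yylex() is enough as long as it honors the same contract: return one token code per call and 0 at end of input. The following is a minimal sketch under stated assumptions, not the scanner used in these slides. The token value 258 for TK_ID stands in for the definition yacc would normally write to y.tab.h, and scan_from() with its string buffer is a hypothetical input source.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Normally generated by yacc into y.tab.h; the value here is assumed. */
enum { TK_ID = 258 };

/* Hypothetical input source: the scanner reads from this string. */
static const char *scan_input = "";

void scan_from(const char *text)
{
    assert(text != NULL);
    scan_input = text;
}

/* Contract expected by yyparse(): next token code, or 0 at end of input. */
int yylex(void)
{
    while (isspace((unsigned char)*scan_input))
        scan_input++;                        /* skip blanks, tabs, newlines */
    if (*scan_input == '\0')
        return 0;                            /* end of input */
    if (isalpha((unsigned char)*scan_input) || *scan_input == '_') {
        while (isalnum((unsigned char)*scan_input) || *scan_input == '_')
            scan_input++;                    /* consume the whole identifier */
        return TK_ID;
    }
    return (unsigned char)*scan_input++;     /* single-character token such as '{' */
}
```

Named tokens get codes above the character range so that a literal such as '{' can be returned as its own character value, which is what the grammar rules refer to when they quote a character.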

6 (Building a compiler with Lex/Yacc)

    bas.y -> yacc -> y.tab.c (yyparse), y.tab.h
    bas.l -> lex  -> lex.yy.c (yylex)
    y.tab.c, lex.yy.c -> cc -> bas.exe
    source -> bas.exe -> compiled output

Commands to create our compiler, bas.exe, are:

    yacc -d bas.y                     # create y.tab.h, y.tab.c
    lex bas.l                         # create lex.yy.c
    cc lex.yy.c y.tab.c -o bas.exe    # compile/link

7 You may use a lex-generated version of yylex() by simply including the statement

    #include "lex.yy.c"

in the program section of the yacc definition file. The declaration section contains declaration statements such as

    %token TK_ID
    %start set

The heart of a yacc program is the rules section. This section describes the grammar productions and the actions to perform when those productions are recognized. For example, a typical grammar rule might be the following:

    set : '{' list_of_ids '}' ;

Here set and list_of_ids are variables (nonterminals), and '{' and '}' are terminals. (The semicolon marks the end of the productions for set.) Similarly, we might define

    list_of_ids : TK_ID
                | TK_ID ',' list_of_ids
                ;

8 where list_of_ids is a variable and TK_ID and ',' are tokens. Notice that alternation is denoted by "|" in our grammar rules.

Yacc works with attribute grammars, i.e., grammars in which every nonterminal and terminal may have an associated attribute or value. In yacc actions, these attributes may be read and/or set when needed. The attribute of the variable (nonterminal) on the left-hand side is denoted by $$. The attributes of the other elements in a production are accessed by their position: $1, $2, $3, and so on. For example, if the variable EXPR has an integer attribute, then the following production rule and action are appropriate:

    EXPR : EXPR '+' EXPR { $$ = $1 + $3; } ;

Example 1: A yacc program (exam-y.y) paired with a lex program (exam-l.l) that removes comments, tabs, newlines, etc. and returns '{', '}', ',', ';', '=', TKID, and TK_COLORS as tokens.

9

    /* exam-y.y: Use strings and sets as yacc types */
    %{
    #include <stdlib.h>   /* malloc() */
    struct color_list {
        char *my_color;
        struct color_list *next;
    };
    %}
    %union {
        char *t_val;
        struct color_list *color_set;
    }
    /* These are the token types that exam-l.l returns */
    %token TKID
    %token TK_COLORS
    /* The token TKID returns the type t_val. */
    %type <t_val> TKID
    %type <color_set> color_def list_of_ids
    %start color_def

10 Here are the production rules:

    %%
    color_def   : TK_COLORS '=' '{' list_of_ids '}' ';'
                    { print_set($4); }
                ;
    list_of_ids : TKID
                    {
                        struct color_list *set1;
                        set1 = (struct color_list *)malloc(sizeof(struct color_list));
                        set1->next = NULL;
                        set1->my_color = $1;
                        $$ = set1;
                    }
                | TKID ',' list_of_ids
                    {
                        struct color_list *set1;
                        set1 = (struct color_list *)malloc(sizeof(struct color_list));
                        set1->next = $3;
                        set1->my_color = $1;
                        $$ = set1;
                    }
                ;
    %%
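Both list_of_ids actions do the same thing: allocate one node, store the TKID string ($1) in it, and link it in front of the list built so far ($3, or NULL for a single TKID). Because the rule is right-recursive, the last identifier is reduced first, so the finished list is in input order. The sketch below restates that in plain C. The helper prepend_color() is a hypothetical name, and the body of print_set() is an assumption, since the slides call print_set() but never show it.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

struct color_list {
    char *my_color;
    struct color_list *next;
};

/* Hypothetical helper: what both list_of_ids actions do with $1 and $3. */
struct color_list *prepend_color(char *name, struct color_list *rest)
{
    assert(name != NULL);
    struct color_list *node = malloc(sizeof *node);
    if (node == NULL)
        abort();                 /* out of memory: give up in this sketch */
    node->my_color = name;       /* $1: the TKID string */
    node->next = rest;           /* $3, or NULL for the single-TKID rule */
    return node;                 /* the value an action would assign to $$ */
}

/* Assumed body for print_set(); the original does not show one. */
void print_set(const struct color_list *set)
{
    printf("{");
    for (const struct color_list *p = set; p != NULL; p = p->next)
        printf("%s%s", p->my_color, p->next != NULL ? ", " : "");
    printf("}\n");
}
```

For the input red, green, blue the parser would effectively evaluate prepend_color("red", prepend_color("green", prepend_color("blue", NULL))), with the innermost call happening first.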

11, 12

    /* exam-l.l: Here is a lex program that removes comments, tabs, newlines, etc. */
    /* It returns '{', '}', ',', ';', '=', TKID, and TK_COLORS as tokens. */
    /* ACC() and linecount are used below; their definitions are not shown. */
    %%
    [ \t\n\f]   { ACC(yytext[0]); /* Remove tabs/spaces/newlines */ }
    \/\*        {
                    int c;          /* int rather than char, so EOF can be detected */
                    int line_cur;
                    line_cur = linecount;
                    while (1) {
                        if ((c = input()) == EOF) {
                            /* If this is the case, there is an error: an unterminated comment. */
                            printf("Detected unterminated comment starting on line %d \n", line_cur);
                            return(0);
                        }
                        ACC(c);
                        if (c == '*') {
                            if ((c = input()) == '/') {
                                break;
                            } else {
                                unput(c);
                            }
                        }
                    }
                }
    colors      { printf("%s\n", yytext); return TK_COLORS; }
    [a-zA-Z_.][a-zA-Z_.0-9]* {
                    /* Copy yytext to yylval.t_val */
                    yylval.t_val = strdup(yytext);
                    return TKID;
                }
    [{},;=]     { return yytext[0]; }
    .           {
                    printf("Illegal Character %s on line %d\n", yytext, linecount);
                    printf("Ignored \n");
                    ACC(yytext[0]);
                }
    %%

13 Input:

    /* This is a test of a colors file. */
    colors = {red, green, blue, white};
    /* End of test. */

Output:

    Reading a colors definition
    Equals
    Beginning of set
    Color = red
    Separator
    Color = green
    Separator
    Color = blue
    Separator
    Color = white
    Semicolon

14 Example 2: This is a yacc program that acts as an interpreter for a simple language called SSET, which manipulates STRINGS and SETS of STRINGS. It has two classes:
(1) set_list, which represents sets and their operations: searching, storing, set union, set intersection, set difference, and printing the contents of a set.
(2) symtab, which represents a symbol table that stores information about each variable, such as its name, type, and values.
Here are the production rules from the YACC file, with the longer actions elided as { … } (too long to fit here).

15

    %%
    Sset        : declaration program
                |                                { $$ = NULL; }
                ;
    declaration : TK_SET TK_ID setdeclar ';'     { … }
                | TK_STRING TK_ID strdeclar ';'  { … }
                ;
    setdeclar   : ',' TK_ID setdeclar
                |                                { $$ = NULL; }
                ;
    program     : declaration program            { $$ = $1; }
                | statement program
                |                                { /* if lambda */ $$ = NULL; }
                ;
    statement   : TK_ID '=' simp_exp             { … }
                | TK_DISPLAY TK_STR_CONST ';'    { … }
                | TK_DISPLAY TK_ID ';'           { … }
                ;
    simp_exp    : TK_ID ';'                      { … }
                | TK_STR_CONST ';'               { … }
                | set_def                        { $$ = $1; }
                | bin_exp                        { $$ = $1; }
                ;
    set_def     : '{' list_of_ids '}' ';'        { $$ = $2; }
                ;
    list_of_ids : TK_ID                          { … }
                | TK_STR_CONST                   { … }
                | TK_ID ',' list_of_ids          { … }
                | TK_STR_CONST ',' list_of_ids
                |                                { /* if lambda */ $$ = NULL; }
                ;
    bin_exp     : bin1 ';'                       { $$ = $1; }
                | bin2 ';'                       { $$ = $1; }
                | bin3 ';'                       { $$ = $1; }
                | bin4 ';'                       { $$ = $1; }
                ;
    bin1        : TK_ID '+' TK_ID                { … }
                | TK_ID '+'                      { … }
                ;
    bin2        : TK_ID '*' TK_ID                { … }
                | TK_ID '*' '{' TK_ID '}'        { … }
                ;
    bin3        : TK_ID '-' TK_ID                { … }
                | TK_ID '-' '{' TK_ID '}'        { … }
                ;
    bin4        : '(' bin1 ')'
                | TK_ID                          { … }
                ;
    %%

16 Input:

    SET s1;
    STRING s2;
    s2 = "John";
    s1 = {s2,"Paul","Ringo", "George"};
    DISPLAY "The Beatles ---- ";
    DISPLAY s1;
    s1 = s1 - {s2};
    DISPLAY s1;
    s1={};
    DISPLAY s1;

Output:

    The Beatles ---- {John, Paul, Ringo, George}
    {Paul, Ringo, George}
    { }
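The statement s1 = s1 - {s2} in the input above is a set difference. The slides do not show how set_list implements it, so the following is only a sketch of the operation in plain C over arrays of strings. The names set_contains() and set_difference() and the array representation are assumptions made for illustration, not the original class.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: nonzero when name occurs among the first n strings of set. */
static int set_contains(const char *const *set, size_t n, const char *name)
{
    for (size_t k = 0; k < n; k++)
        if (strcmp(set[k], name) == 0)
            return 1;
    return 0;
}

/* Copies into out every member of a that is not in b, keeping a's order.
 * out must have room for na pointers; returns how many were kept. */
size_t set_difference(const char *const *a, size_t na,
                      const char *const *b, size_t nb,
                      const char **out)
{
    assert(a != NULL && out != NULL);
    size_t kept = 0;
    for (size_t k = 0; k < na; k++)
        if (!set_contains(b, nb, a[k]))
            out[kept++] = a[k];
    return kept;
}
```

Called with {John, Paul, Ringo, George} as a and {John} as b, the function keeps Paul, Ringo and George, which matches the second set printed in the output above.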

17 Compilation (Makefile):

    CFLAG = -g

    sset: y.tab.o
    	g++ -g -o sset y.tab.o

    y.tab.o: y.tab.c lex.yy.c
    	g++ -c $(CFLAG) y.tab.c

    y.tab.c: start.y
    	yacc start.y

    lex.yy.c: start.l
    	flex start.l