Sung-Dong Kim, Dept. of Computer Engineering, Hansung University


LEX & Yacc

LEX
  RE + action  ->  Lex  ->  Scanner (C code)
  Input: tiny.l
  Output: lex.yy.c or lexyy.c
    Procedure yylex
      Table-driven implementation of a DFA
      Similar to "getToken"
(2011-1) Compiler
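To make "table-driven implementation of a DFA" concrete, here is a minimal C sketch for the single pattern [0-9]+. The state names, the character classes, and the function `match_number_prefix` are hypothetical and chosen only for illustration; the tables that Lex actually writes into lex.yy.c are larger and laid out differently.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical DFA for [0-9]+ :
   START = initial state, IN_NUM = accepting state, DEAD = no match possible. */
enum { START = 0, IN_NUM = 1, DEAD = 2, NUM_STATES = 3 };
enum { CLS_DIGIT = 0, CLS_OTHER = 1, NUM_CLASSES = 2 };

/* transition[state][character class] gives the next state. */
static const int transition[NUM_STATES][NUM_CLASSES] = {
    /* START  */ { IN_NUM, DEAD },
    /* IN_NUM */ { IN_NUM, DEAD },
    /* DEAD   */ { DEAD,   DEAD },
};
static const int accepting[NUM_STATES] = { 0, 1, 0 };

/* Returns the length of the longest prefix of s that matches [0-9]+,
   or 0 if no prefix matches. */
size_t match_number_prefix(const char *s)
{
    int state = START;
    size_t i = 0, last_accept = 0;

    while (s[i] != '\0' && state != DEAD) {
        int cls = isdigit((unsigned char)s[i]) ? CLS_DIGIT : CLS_OTHER;
        state = transition[state][cls];
        i++;
        if (accepting[state])
            last_accept = i;   /* remember the longest match seen so far */
    }
    return last_accept;
}
```

The loop itself never mentions digits or any particular pattern; changing the recognized language only means changing the two tables, which is the part a generator such as Lex can produce automatically.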

LEX Convention (1)
  Metacharacters
  Quotes: the quoted characters are matched literally
    For non-metacharacters: "if" = if
    For metacharacters: "("
  Backslash
    \(\* = "(*"
    \n, \t
  (aa|bb)(a|b)*c? = ("aa"|"bb")("a"|"b")*"c"?

LEX Convention (2)
  [...]: any one of the enclosed characters
    [abxz]: any one of the characters a, b, x, z
    (aa|bb)(a|b)*c? = (aa|bb)[ab]*c?
  Hyphen
    Ranges of characters
    [0-9]
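The equivalence of the alternation form (a|b)* and the character-class form [ab]* can be checked outside Lex. The sketch below assumes a POSIX system and uses the `regcomp`/`regexec` functions from `<regex.h>` with extended syntax; this is not Lex's own matcher, and POSIX syntax lacks Lex's quote and {name} conventions. The helper name `full_match` is hypothetical, and the caller supplies the `^` and `$` anchors.

```c
#include <regex.h>
#include <stddef.h>

/* Returns 1 if text matches pattern, 0 if it does not,
   and -1 if the pattern fails to compile.
   The pattern uses POSIX extended syntax, not Lex syntax. */
int full_match(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0;
}
```

For example, `full_match("^(aa|bb)(a|b)*c?$", s)` and `full_match("^(aa|bb)[ab]*c?$", s)` should agree for every string `s`: both accept "aaabc" and "bb", and both reject "ab".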

LEX Convention (3)
  . (period)
    Represents a set of characters
    Any character except a newline
  ^
    Complementary sets
    [^0-9abc]: any character that is not a digit and is not one of the letters a, b, c

LEX Convention (4)
  Square brackets
    Most of the metacharacters lose their special status
    [-+] == ("+"|"-")
    [+-]: read as a range: all characters from "+" on
    [."?]: any of the three characters ., ", ?
    [\^\\]: ^ or \

LEX Convention (5)
  Curly brackets
    Names of regular expressions
  Regular definitions
    nat = [0-9]+
    signedNat = ("+"|"-")? nat
  The same definitions in Lex
    nat        [0-9]+
    signedNat  ("+"|"-")?{nat}

Format of LEX Input (1)
  Input file = regular expressions + C code
  Definitions
    Any C code that must be inserted external to any function: %{ … %}
    Names of regular expressions
  Rules
    Regular expressions + C code (actions)
  Auxiliary routines (optional)
    C code + main program (if needed)

Format of LEX Input (2)
  Layout
    {definitions}
    %%
    {rules}
    %%
    {auxiliary routines}

Example 1: scanner that adds line numbers to text
    %{
    /* a Lex program that adds line numbers
       to lines of text, printing the new text
       to the standard output */
    #include <stdio.h>
    int lineno = 1;
    %}
    line .*\n
    %%
    {line} { printf("%5d %s", lineno++, yytext); }
    %%
    int main(void)
    { yylex(); return 0; }

Example 2: prints the count of # of replacements
    %{
    /* a Lex program that changes all numbers
       from decimal to hexadecimal notation,
       printing a summary statistic to stderr */
    #include <stdlib.h>
    #include <stdio.h>
    int count = 0;
    %}
    digit [0-9]
    number {digit}+
    %%
    {number} { int n = atoi(yytext);
               printf("%x", n);
               if (n > 9) count++; }
    %%
    int main(void)
    { yylex();
      fprintf(stderr, "number of replacements = %d", count);
      return 0;
    }
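For comparison, the behaviour of Example 2 can be imitated by hand in plain C, which shows how much bookkeeping the generated scanner takes over. This is only a sketch under stated assumptions: the function name `dec_to_hex` and its buffer interface are hypothetical, it works on a string instead of yyin, and it uses `strtoul` where the slide uses `atoi`.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Copies `in` to `out` (capacity `cap` bytes), rewriting every run of
   decimal digits in hexadecimal. Returns the number of rewritten values
   greater than 9, mirroring the `count` variable of the Lex program. */
int dec_to_hex(const char *in, char *out, size_t cap)
{
    int count = 0;
    size_t used = 0;

    if (cap == 0)
        return 0;
    out[0] = '\0';
    while (*in != '\0' && used + 1 < cap) {
        if (isdigit((unsigned char)*in)) {
            char *end;
            unsigned long n = strtoul(in, &end, 10);   /* like {number} */
            int w = snprintf(out + used, cap - used, "%lx", n);
            if (w < 0 || (size_t)w >= cap - used) {
                out[used] = '\0';                      /* no room: stop */
                break;
            }
            used += (size_t)w;
            if (n > 9)
                count++;
            in = end;
        } else {
            out[used++] = *in++;   /* default action: copy the character */
            out[used] = '\0';
        }
    }
    return count;
}
```

Called on the input "a 10 b 7 c 255" with a sufficiently large buffer, the function produces "a a b 7 c ff" and returns 2, since 7 is printed unchanged and is not counted.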

Example 3: prints all input lines that begin or end with the letter 'a'
    %{
    /* Selects only lines that end or
       begin with the letter 'a'.
       Deletes everything else. */
    #include <stdio.h>
    %}
    ends_with_a .*a\n
    begins_with_a a.*\n
    %%
    {ends_with_a} ECHO;
    {begins_with_a} ECHO;
    .*\n ;
    %%
    int main(void)
    { yylex(); return 0; }

Summary (1)
  Ambiguity resolution
    Principle of the longest substring: the rule that matches the longest string wins
    Substrings of equal length: the rule listed first wins
    No match: copy the next character to the output and continue
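The two disambiguation rules can be sketched in C as a loop over candidate matchers. Everything named here is hypothetical (`match_if`, `match_identifier`, `pick_rule`, and the two-rule table); a real Lex scanner reaches the same decision with a single DFA pass instead of trying each rule in turn.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef size_t (*matcher_fn)(const char *s);

/* Hypothetical rule 0: the keyword "if". Returns the match length or 0. */
static size_t match_if(const char *s)
{
    return strncmp(s, "if", 2) == 0 ? 2 : 0;
}

/* Hypothetical rule 1: an identifier, [a-z]+. */
static size_t match_identifier(const char *s)
{
    size_t n = 0;
    while (islower((unsigned char)s[n]))
        n++;
    return n;
}

/* Rules in the order they would appear in the rules section. */
static const matcher_fn rules[] = { match_if, match_identifier };

/* Returns the index of the winning rule and stores the match length in
   *len, or returns -1 (with *len == 0) when no rule matches. */
int pick_rule(const char *s, size_t *len)
{
    int best = -1;
    size_t best_len = 0;
    size_t i;

    for (i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        size_t n = rules[i](s);
        if (n > best_len) {   /* strictly longer only: earlier rule keeps ties */
            best = (int)i;
            best_len = n;
        }
    }
    *len = best_len;
    return best;
}
```

On "if(" both rules match two characters, so rule 0 wins because it is listed first; on "ifx" the identifier rule matches three characters and wins by length. A return value of -1 corresponds to the no-match case, where the scanner would copy one character and continue.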

Summary (2)
  Insertion of C code
    %{ … %}: exact copy
    Auxiliary procedure section: exact copy at the end
    Any code following a RE (action): at the appropriate place in yylex

Lex Internal Names
  lex.yy.c or lexyy.c   Lex output file name
  yylex                 Lex scanning routine
  yytext                String matched on current action
  yyin                  Lex input file (default: stdin)
  yyout                 Lex output file (default: stdout)
  input                 Lex buffered input routine
  ECHO                  Lex default action (print yytext to yyout)

LEX for TINY
    %{
    #include "globals.h"
    #include "util.h"
    #include "scan.h"
    /* lexeme of identifier or reserved word */
    char tokenString[MAXTOKENLEN+1];
    %}
    digit       [0-9]
    number      {digit}+
    letter      [a-zA-Z]
    identifier  {letter}+
    newline     \n
    whitespace  [ \t]
    %%

    "if"      { return IF; }
    "then"    { return THEN; }
    "else"    { return ELSE; }
    "end"     { return END; }
    "repeat"  { return REPEAT; }
    "until"   { return UNTIL; }
    "read"    { return READ; }
    "write"   { return WRITE; }
    ":="      { return ASSIGN; }
    "="       { return EQ; }
    "<"       { return LT; }
    "+"       { return PLUS; }
    "-"       { return MINUS; }
    "*"       { return TIMES; }
    "/"       { return OVER; }
    "("       { return LPAREN; }
    ")"       { return RPAREN; }
    ";"       { return SEMI; }

    {number}      { return NUM; }
    {identifier}  { return ID; }
    {newline}     { lineno++; }
    {whitespace}  { /* skip whitespace */ }
    "{"           { char c;
                    do
                    { c = input();
                      if (c == '\n') lineno++;
                    } while (c != '}');
                  }
    .             { return ERROR; }
    %%

    TokenType getToken(void)
    { static int firstTime = TRUE;
      TokenType currentToken;
      if (firstTime)
      { firstTime = FALSE;
        lineno++;
        yyin = source;
        yyout = listing;
      }
      currentToken = yylex();
      strncpy(tokenString, yytext, MAXTOKENLEN);
      if (TraceScan) {
        fprintf(listing, "\t%d: ", lineno);
        printToken(currentToken, tokenString);
      }
      return currentToken;
    }
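The comment rule in the TINY specification keeps calling input() until it sees a closing brace and contains no test for the end of the input, so an unterminated comment is not handled. The sketch below shows the same loop with an explicit end-of-input check. It is an assumption-laden illustration, not part of the TINY scanner: the function name `skip_comment` is hypothetical, and it walks a NUL-terminated string instead of calling Lex's input().

```c
#include <stddef.h>

/* s points just past the opening '{' of a comment.
   Counts newlines into *lineno and returns a pointer just past the
   closing '}', or a pointer to the terminating '\0' if the comment
   is never closed. */
const char *skip_comment(const char *s, int *lineno)
{
    while (*s != '\0') {
        char c = *s++;
        if (c == '\n')
            (*lineno)++;
        else if (c == '}')
            return s;          /* comment closed */
    }
    return s;                  /* unterminated: stopped at end of input */
}
```

A caller can tell the two outcomes apart by checking whether the character before the returned pointer is '}'.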

YACC
  Yet another compiler compiler
  LALR(1) parser generator
  syntax spec.  ->  Yacc  ->  parser

YACC Basics (1)
  Input/output
    Input: filename.y
    Output: y.tab.c, ytab.c, or filename.tab.c
  Specification file format
    {definitions}
    %%
    {rules}
    %%
    {auxiliary routines}

YACC Basics (2)
  Definitions
    Information about tokens, data types, grammar rules
    C code -> output file
  Rules
    Modified BNF format
    C code
  Auxiliary routines
    Procedure and function declarations
    main() -> yyparse() -> yylex()

    %{
    #include <stdio.h>
    #include <ctype.h>
    %}
    %token NUMBER
    %%
    command : exp { printf("%d\n", $1); }
            ;
    exp     : exp '+' term { $$ = $1 + $3; }
            | exp '-' term { $$ = $1 - $3; }
            | term { $$ = $1; }
            ;
    term    : term '*' factor { $$ = $1 * $3; }
            | factor { $$ = $1; }
            ;
    factor  : NUMBER { $$ = $1; }
            | '(' exp ')' { $$ = $2; }
            ;
    %%

    int main(void)
    { return yyparse(); }

    int yylex(void)
    { int c;
      while ((c = getchar()) == ' ');  /* skip blanks */
      if (isdigit(c)) {
        ungetc(c, stdin);
        scanf("%d", &yylval);
        return(NUMBER);
      }
      if (c == '\n') return 0;  /* stop parsing */
      return(c);
    }

    int yyerror(char *s)
    { fprintf(stderr, "%s\n", s);  /* print the error message */
      return 0;
    }
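To see what the semantic actions compute, the same calculator can be written by hand. Note the swapped-in technique: the sketch below is a recursive-descent evaluator, not the LALR(1) parser that Yacc generates, and the left-recursive rules are turned into loops. The names `evaluate`, `parse_exp`, `parse_term`, and `parse_factor` are hypothetical, and the input is a string instead of stdin.

```c
#include <ctype.h>

static const char *p;   /* current position in the input */
static int failed;      /* set to 1 when a syntax error is found */

static int parse_exp(void);

static void skip_blanks(void) { while (*p == ' ') p++; }

/* factor : NUMBER | '(' exp ')' */
static int parse_factor(void)
{
    int value = 0;
    skip_blanks();
    if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p))
            value = value * 10 + (*p++ - '0');
    } else if (*p == '(') {
        p++;
        value = parse_exp();              /* like $$ = $2 */
        skip_blanks();
        if (*p == ')') p++; else failed = 1;
    } else {
        failed = 1;
    }
    return value;
}

/* term : term '*' factor | factor   (left recursion becomes a loop) */
static int parse_term(void)
{
    int value = parse_factor();
    for (;;) {
        skip_blanks();
        if (*p != '*') return value;
        p++;
        value *= parse_factor();          /* like $$ = $1 * $3 */
    }
}

/* exp : exp '+' term | exp '-' term | term */
static int parse_exp(void)
{
    int value = parse_term();
    for (;;) {
        skip_blanks();
        if (*p == '+')      { p++; value += parse_term(); }
        else if (*p == '-') { p++; value -= parse_term(); }
        else return value;
    }
}

/* Evaluates one expression. Returns 1 and stores the value in *result
   on success, or returns 0 on a syntax error. */
int evaluate(const char *text, int *result)
{
    p = text;
    failed = 0;
    *result = parse_exp();
    skip_blanks();
    if (*p != '\0' && *p != '\n') failed = 1;
    return !failed;
}
```

The loops apply operators left to right, so "10 - 4 - 3" evaluates as (10 - 4) - 3, the same grouping that the left-recursive grammar rules give the Yacc version.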

YACC Options (1)
  -d option: header file generation
    yacc -d filename.y
    Output: y.tab.h, ytab.h, or filename.tab.h
  Other files
    #include "y.tab.h"
    Call yylex()

YACC Options (2)
  -v option: verbose option
    yacc -v filename.y
    Output: y.output

    state 0
        $accept : _command $end
        NUMBER shift 5
        ( shift 6
        . error
        command goto 1
        exp goto 2
        term goto 3
        factor goto 4
    state 1
        $accept : command_$end
        $end accept
        . error
    state 2
        command : exp_ (1)
        exp : exp_+ term
        exp : exp_- term
        + shift 7
        - shift 8
        . reduce 1
    state 3
        exp : term_ (4)
        term : term_* factor
        * shift 9
        . reduce 4
    state 4
        term : factor_ (6)
        . reduce 6

    state 5
        factor : NUMBER_ (7)
        . reduce 7
    state 6
        factor : (_exp )
        NUMBER shift 5
        ( shift 6
        . error
        exp goto 10
        term goto 3
        factor goto 4
    state 7
        exp : exp +_term
        NUMBER shift 5
        ( shift 6
        . error
        term goto 11
        factor goto 4
    state 8
        exp : exp -_term
        term goto 12

    state 9
        term : term *_factor
        NUMBER shift 5
        ( shift 6
        . error
        factor goto 13
    state 10
        exp : exp_+ term
        exp : exp_- term
        factor : ( exp_)
        + shift 7
        - shift 8
        ) shift 14
    state 11
        exp : exp + term_ (2)
        term : term_* factor
        * shift 9
        . reduce 2
    state 12
        exp : exp - term_ (3)
        . reduce 3
    state 13
        term : term * factor_ (5)
        . reduce 5

    state 14
        factor : ( exp )_ (8)
        . reduce 8

    8/127 terminals, 4/600 nonterminals
    9/300 grammar rules, 15/1000 states
    0 shift/reduce, 0 reduce/reduce conflicts reported
    9/601 working sets used
    memory: states, etc. 36/2000, parser 11/4000
    9/601 distinct lookahead sets
    6 extra closures
    18 shift entries, 1 exceptions
    8 goto entries
    4 entries saved by goto default
    Optimizer space used: input 50/2000, output 218/4000
    218 table entries, 202 zero
    maximum spread: 257, maximum offset: 43