Introduction to Lex Ying-Hung Jiang

Slides:



Advertisements
Similar presentations
Lexical Analysis Consider the program: #include main() { double value = 0.95; printf("value = %f\n", value); } How is this translated into meaningful machine.
Advertisements

Lex -- a Lexical Analyzer Generator (by M.E. Lesk and Eric. Schmidt) –Given tokens specified as regular expressions, Lex automatically generates a routine.
CS252: Systems Programming Ninghui Li Topic 4: Regular Expressions and Lexical Analysis.
 Lex helps to specify lexical analyzers by specifying regular expression  i/p notation for lex tool is lex language and the tool itself is refered to.
Winter 2007SEG2101 Chapter 81 Chapter 8 Lexical Analysis.
176 Formal Languages and Applications: We know that Pascal programming language is defined in terms of a CFG. All the other programming languages are context-free.
Tools for building compilers Clara Benac Earle. Tools to help building a compiler C –Lexical Analyzer generators: Lex, flex, –Syntax Analyzer generator:
Chapter 3 Chang Chi-Chung. The Structure of the Generated Analyzer lexeme Automaton simulator Transition Table Actions Lex compiler Lex Program lexemeBeginforward.
COS 320 Compilers David Walker. Outline Last Week –Introduction to ML Today: –Lexical Analysis –Reading: Chapter 2 of Appel.
Lecture 2: Lexical Analysis CS 540 George Mason University.
A brief [f]lex tutorial Saumya Debray The University of Arizona Tucson, AZ
Parser construction tools: YACC
CMSC 331, Some material © 1998 by Addison Wesley Longman, Inc. 1 Chapter 4 Chapter 4 Lexical analysis.
1 Flex. 2 Flex A Lexical Analyzer Generator  generates a scanner procedure directly, with regular expressions and user-written procedures Steps to using.
Scanning & Parsing with Lex and YACC
Compilers: lex/3 1 Compiler Structures Objectives – –describe lex – –give many examples of lex's use , Semester 1, Lex.
REGULAR EXPRESSIONS. Lexical Analysis Lexical analysers can be constructed by programs such as LEX These programs employ as input a description of the.
Lesson 10 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
Review: Regular expression: –How do we define it? Given an alphabet, Base case: – is a regular expression that denote { }, the set that contains the empty.
PL&C Lab, DongGuk University Compiler Lecture Note, MiscellaneousPage 1 Miscellaneous 컴파일러 입문.
Lecture 2: Lexical Analysis
COMP 3438 – Part II - Lecture 2: Lexical Analysis (I) Dr. Zili Shao Department of Computing The Hong Kong Polytechnic Univ. 1.
COP 4620 / 5625 Programming Language Translation / Compiler Writing Fall 2003 Lecture 3, 09/11/2003 Prof. Roy Levow.
Scanning & FLEX CPSC 388 Ellen Walker Hiram College.
FLEX Fast Lexical Analyzer EECS Introduction Flex is a lexical analysis (scanner) generator. Flex is provided with a user input file or Standard.
Flex: A fast Lexical Analyzer Generator CSE470: Spring 2000 Updated by Prasad.
Compiler Tools Lex/Yacc – Flex & Bison. Compiler Front End (from Engineering a Compiler) Scanner (Lexical Analyzer) Maps stream of characters into words.
JLex Lecture 4 Mon, Jan 24, JLex JLex is a lexical analyzer generator in Java. It is based on the well-known lex, which is a lexical analyzer generator.
Lexical Analysis – Part I EECS 483 – Lecture 2 University of Michigan Monday, September 11, 2006.
Introduction to Yacc Ying-Hung Jiang
1 Using Lex. 2 Introduction When you write a lex specification, you create a set of patterns which lex matches against the input. Each time one of the.
1 Using Lex. Flex – Lexical Analyzer Generator A language for specifying lexical analyzers Flex compilerlex.yy.clang.l C compiler -lfl a.outlex.yy.c a.outtokenssource.
COMPILERS AND INTERPRETERS Lesson 3 – TDDD16 TDDB44 Compiler Construction 2010 Kristian Stavåker Department.
Introduction to Lex Fan Wu
Introduction to Lexical Analysis and the Flex Tool. © Allan C. Milne Abertay University v
Lecture 3 RegExpr  NFA  DFA Topics Lex, Flex Thompson Construction Subset construction (maybe) Readings: , 3.7, 3.6 January 18, 2006 CSCE 531.
Flex Fast LEX analyzer CMPS 450. Lexical analysis terms + A token is a group of characters having collective meaning. + A lexeme is an actual character.
Practical 1-LEX Implementation
1 Lex & Yacc. 2 Compilation Process Lexical Analyzer Source Code Syntax Analyzer Symbol Table Intermed. Code Gen. Code Generator Machine Code.
1 Using Yacc. 2 Introduction Grammar –CFG –Recursive Rules Shift/Reduce Parsing –See Figure 3-2. –LALR(1) –What Yacc Cannot Parse It cannot deal with.
Lex & Yacc By Hathal Alwageed & Ahmad Almadhor. References *Tom Niemann. “A Compact Guide to Lex & Yacc ”. Portland, Oregon. 18 April 2010 *Levine, John.
ICS312 LEX Set 25. LEX Lex is a program that generates lexical analyzers Converting the source code into the symbols (tokens) is the work of the C program.
Applications of Context-Free Grammars (CFG) Parsers. The YACC Parser-Generator. by: Saleh Al-shomrani.
1 LEX & YACC Tutorial February 28, 2008 Tom St. John.
COMMONWEALTH OF AUSTRALIA Copyright Regulations 1969 WARNING This material has been reproduced and communicated to you by or on behalf of Monash University.
PL&C Lab, DongGuk University Compiler Lecture Note, MiscellaneousPage 1 Yet Another Compiler-Compiler Stephen C. Johnson July 31, 1978 YACC.
1 Steps to use Flex Ravi Chotrani New York University Reviewed By Prof. Mohamed Zahran.
Scanner Generation Using SLK and Flex++ Followed by a Demo Copyright © 2015 Curt Hill.
More yacc. What is yacc – Tool to produce a parser given a grammar – YACC (Yet Another Compiler Compiler) is a program designed to compile a LALR(1) grammar.
ICS611 Lex Set 3. Lex and Yacc Lex is a program that generates lexical analyzers Converting the source code into the symbols (tokens) is the work of the.
9-December-2002cse Tools © 2002 University of Washington1 Lexical and Parser Tools CSE 413, Autumn 2002 Programming Languages
CS 404Ahmed Ezzat 1 CS 404 Introduction to Compiler Design Lecture Ahmed Ezzat.
LEX & Yacc Sung-Dong Kim, Dept. of Computer Engineering, Hansung University.
Lexical Analysis.
NFAs, scanners, and flex.
Tutorial On Lex & Yacc.
RegExps & DFAs CS 536.
Regular Languages.
TDDD55- Compilers and Interpreters Lesson 2
JLex Lecture 4 Mon, Jan 26, 2004.
Review: Compiler Phases:
Subject Name:Sysytem Software Subject Code: 10SCS52
Compiler Structures 3. Lex Objectives , Semester 2,
Compiler Lecture Note, Miscellaneous
Compiler Design Yacc Example "Yet Another Compiler Compiler"
More on flex.
Regular Expressions and Lexical Analysis
Systems Programming & Operating Systems Unit – III
NFAs, scanners, and flex.
Compiler Design 3. Lexical Analyzer, Flex
Presentation transcript:

Introduction to Lex Ying-Hung Jiang

Outlines Review of scanner Introduction to flex Regular Expression Cooperate with Yacc Resources

Review Parser -> scanner -> source code See teacher ’ s slide

flex

flex - fast lexical analyzer generator Flex is a tool for generating scanners. Flex source is a table of regular expressions and corresponding program fragments. Generates lex.yy.c which defines a routine yylex()

Format of the Input File The flex input file consists of three sections, separated by a line with just % in it: definitions % rules % user code

Definitions Section The definitions section contains declarations of simple name definitions to simplify the scanner specification. Name definitions have the form: name definition Example: DIGIT [0-9] ID [a-z][a-z0-9]*

Rules Section The rules section of the flex input contains a series of rules of the form: pattern action Example: {ID} printf( "An identifier: %s\n", yytext ); The yytext and yylength variable. If action is empty, the matched token is discarded.

Action If the action contains a ‘{‘, the action spans till the balancing ‘}‘ is found, as in C. An action consisting only of a vertical bar ( '|' ) means "same as the action for the next rule. “ The return statement, as in C. In case no rule matches: simply copy the input to the standard output (A default rule).

Precedence Problem For example: a “ < “ can be matched by “ < “ and “ <= “. The one matching most text has higher precedence. If two or more have the same length, the rule listed first in the flex input has higher precedence.

User Code Section The user code section is simply copied to lex.yy.c verbatim. The presence of this section is optional; if it is missing, the second % in the input file may be skipped. In the definitions and rules sections, any indented text or text enclosed in %{ and %} is copied verbatim to the output (with the %{} 's removed).

A Simple Example %{ int num_lines = 0, num_chars = 0; %} % \n++num_lines; ++num_chars;.++num_chars; % main() { yylex(); printf( "# of lines = %d, # of chars = %d\n", num_lines, num_chars ); }

Any Question So Far?

Regular Expression

Regular Expression (1/3) x match the character 'x'. any character (byte) except newline [xyz] a "character class"; in this case, the pattern matches either an 'x', a 'y', or a 'z' [abj-oZ] a "character class" with a range in it; matches an 'a', a 'b', any letter from 'j' through 'o', or a 'Z' [^A-Z] a "negated character class", i.e., any character but those in the class. In this case, any character EXCEPT an uppercase letter. [^A-Z\n] any character EXCEPT an uppercase letter or a newline

Regular Expression (2/3) r* zero or more r's, where r is any regular expression r+ one or more r's r? zero or one r's (that is, "an optional r") r{2,5} anywhere from two to five r's r{2,} two or more r's r{4} exactly 4 r's {name} the expansion of the "name" definition (see above) "[xyz]\"foo “ the literal string: [xyz]"foo \X if X is an 'a', 'b', 'f', 'n', 'r', 't', or 'v', then the ANSI-C interpretation of \x. Otherwise, a literal 'X' (used to escape operators such as '*')

Regular Expression (3/3) \0 a NUL character (ASCII code 0) \123 the character with octal value 123 \x2a the character with hexadecimal value 2a (r) match an r; parentheses are used to override precedence (see below) rs the regular expression r followed by the regular expression s; called "concatenation" r|s either an r or an s ^r an r, but only at the beginning of a line (i.e., which just starting to scan, or right after a newline has been scanned). r$ an r, but only at the end of a line (i.e., just before a newline). Equivalent to "r/\n".

Cooperate with Yacc The y.tab.h file (generated by – d option). See pascal.l and pascal.y for example.

Resources Google directory of lexer and parser generators. Flex homepage: Lex/yacc Win32 port: yacc/lex-yacc.html yacc/lex-yacc.html Above links are available on course webpage. Flex(1)

Any Question?