Lexical Analysis - Scanner- Contd Computer Science Rensselaer Polytechnic 66.648 Compiler Design Lecture 4(01/26/98)

Lecture Outline
More on Lex
Examples and Applications
Administration

LEX Input to Lex consists of three parts, separated by lines consisting of %%:
first part
%%
pattern action
%%
third part
The first and third parts are optional.

LEX - Contd The first part contains the dimensions of certain tables internal to lex; it may also contain definitions of text replacements. It can also contain global C code, preceded by a line beginning with %{ and followed by a line beginning with %}. The third part contains C code, which is copied as is. It usually contains functions that the second part uses. The first separator (%%) is essential, whereas the second separator (%%) is not needed if the third part is empty.

Patterns Letters, digits and some special characters represent themselves. A period (.) represents any character other than line feed (\n). Brackets ([ and ]) enclose a sequence of characters, called a character class. The class represents any one of its members or, if the class starts with ^, any single character not in the class. Within the sequence, - between two characters denotes the inclusive range. If * follows one of the pattern parts, then the corresponding input may appear 0 or more times.

Patterns - Contd ^ at the beginning of a pattern represents the beginning of an input line. $ at the end of a pattern represents the end of the input line. \ is used as the escape character. Double quotes (" ") enclose a string that is matched literally.

Examples
"for"                                    reserved word for
"--"                                     decrement operator
[A-Za-z_][A-Za-z0-9_]*                   C identifiers
"/*".*"*/"                               single line comments
"//".*                                   C++ comments
[0-9][0-9]*                              integer constants
"/*"([^*/]|[^*]"/"|"*"[^/])*"*/"         C comments over many lines
\"([^"\n]|\\["\n])*\"                    strings

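Ambiguities Lex always chooses the pattern which represents the longest possible input string. If two patterns represent the same string, the first pattern in the list presented to lex is chosen. Example: with the rules int and [a-z]+ listed in that order, the input int matches both patterns with the same length, so the first rule (int) is chosen; the input integer is matched by [a-z]+, because that match is longer.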
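The two disambiguation rules can be sketched in plain C. This is only an illustration of the selection logic, not the code that lex generates: the names match_int, match_ident and next_token are hypothetical, the two rules int and [a-z]+ are hand-coded, and ASCII is assumed for the a-z range test.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum token { TOK_NONE, TOK_INT, TOK_IDENT };

/* Rule 1: the literal pattern int. Returns the matched length (0 if none). */
size_t match_int(const char *s)
{
    return strncmp(s, "int", 3) == 0 ? 3 : 0;
}

/* Rule 2: the pattern [a-z]+ (assumes ASCII, where a..z are contiguous). */
size_t match_ident(const char *s)
{
    size_t n = 0;
    while (s[n] >= 'a' && s[n] <= 'z')
        n++;
    return n;
}

/* Try every rule at the start of s and keep the longest match.
   The strict > comparison means that on a tie the earlier rule wins. */
enum token next_token(const char *s, size_t *len)
{
    size_t (*matchers[])(const char *) = { match_int, match_ident };
    enum token kinds[] = { TOK_INT, TOK_IDENT };
    enum token best = TOK_NONE;

    assert(s != NULL && len != NULL);
    *len = 0;
    for (size_t r = 0; r < 2; r++) {
        size_t n = matchers[r](s);
        if (n > *len) {
            *len = n;
            best = kinds[r];
        }
    }
    return best;
}
```

Calling next_token("int", &n) selects TOK_INT because both rules match three characters and int is listed first; next_token("integer", &n) selects TOK_IDENT because [a-z]+ matches seven characters.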

Sample Lex Programs
1)
%{
/* Remove uppercase letters. Commands to execute are
   lex test.l and gcc lex.yy.c -ll -o test */
%}
%%
[A-Z]+    ;
2)
%{
/* Line numbering */
%}
%%
^.*\n     printf("%d\t%s", yylineno-1, yytext);

Sample Lex Programs - Contd
%{
/* unix utility wc simulated. counts chars, words and lines */
#include <stdio.h>
int nchar, nword, nlines;
%}
%%
\n          nchar++; nlines++;
[^ \t\n]+   { nword++; nchar += yyleng; /* yyleng gives the length of the matched text */ }
.           nchar++;
%%
int main(void)
{
    yylex();
    printf("%d\t%d\t%d\n", nchar, nword, nlines);
    return 0;
}

Applications Pattern Matching Problem: Given a pattern string p and a subject string s, find out whether p appears in s as a substring. This is an important search problem. See Exercises 3.26 and The trick is to avoid O(|p|*|s|) algorithm.

Applications-contd Construct a DFA for the pattern. The back-transitions are constructed using failure functions. e.g., pattern string is: a b a b a a.
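A minimal C sketch of the failure-function idea follows. It uses the standard Knuth-Morris-Pratt formulation rather than an explicit DFA table, so it is one possible realization and not necessarily the construction used in the lecture; the function names are hypothetical. For the pattern a b a b a a the computed failure values are 0 0 1 2 3 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* fail[i] = length of the longest proper prefix of p[0..i]
   that is also a suffix of p[0..i]. */
void failure_function(const char *p, size_t m, size_t *fail)
{
    size_t k = 0;

    if (m == 0)
        return;
    fail[0] = 0;
    for (size_t i = 1; i < m; i++) {
        while (k > 0 && p[i] != p[k])
            k = fail[k - 1];          /* back-transition */
        if (p[i] == p[k])
            k++;
        fail[i] = k;
    }
}

/* Returns the index of the first occurrence of p in s, or -1 if p does
   not occur (also -1 if memory allocation fails). Runs in O(|p| + |s|). */
long kmp_search(const char *p, const char *s)
{
    assert(p != NULL && s != NULL);
    size_t m = strlen(p), n = strlen(s);
    if (m == 0)
        return 0;

    size_t *fail = malloc(m * sizeof *fail);
    if (fail == NULL)
        return -1;
    failure_function(p, m, fail);

    long found = -1;
    size_t k = 0;                     /* number of pattern chars matched */
    for (size_t i = 0; i < n; i++) {
        while (k > 0 && s[i] != p[k])
            k = fail[k - 1];
        if (s[i] == p[k])
            k++;
        if (k == m) {
            found = (long)(i + 1 - m);
            break;
        }
    }
    free(fail);
    return found;
}
```

The subject string is scanned once and never re-read; on a mismatch only the pattern position moves back, which is what avoids the O(|p|*|s|) behaviour of the naive algorithm.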

Applications - Contd Compute the edit distance between two given strings x and y. The edit operations that are allowed : insert, delete and update. (See exercise 3.35) e.g., if two strings are rational and nation, the edit distance will be 3.

Applications - Contd A dynamic programming algorithm can be used to compute the edit distance. Let D[i,j] be the edit distance between x_1,…,x_i and y_1,…,y_j. Then D[i,0] = i, D[0,j] = j, and D[i,j] = min{ D[i-1,j-1] + replace(x_i,y_j), D[i-1,j] + 1, D[i,j-1] + 1 }, where replace(x_i,y_j) is 0 if x_i = y_j and 1 otherwise.
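The recurrence can be sketched in C as follows. This is an illustrative implementation under stated assumptions: unit cost for insert, delete and update, boundary values D[i,0] = i and D[0,j] = j, and only two rows of the table kept in memory. The names edit_distance and min3 are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

static size_t min3(size_t a, size_t b, size_t c)
{
    size_t m = a < b ? a : b;
    return m < c ? m : c;
}

/* Edit distance between x and y with unit-cost insert, delete and update.
   Returns (size_t)-1 if memory allocation fails. */
size_t edit_distance(const char *x, const char *y)
{
    assert(x != NULL && y != NULL);
    size_t m = strlen(x), n = strlen(y);

    /* prev holds row D[i-1, *], curr holds row D[i, *]. */
    size_t *prev = malloc((n + 1) * sizeof *prev);
    size_t *curr = malloc((n + 1) * sizeof *curr);
    if (prev == NULL || curr == NULL) {
        free(prev);
        free(curr);
        return (size_t)-1;
    }

    for (size_t j = 0; j <= n; j++)
        prev[j] = j;                              /* D[0,j] = j */

    for (size_t i = 1; i <= m; i++) {
        curr[0] = i;                              /* D[i,0] = i */
        for (size_t j = 1; j <= n; j++) {
            size_t replace = (x[i - 1] == y[j - 1]) ? 0 : 1;
            curr[j] = min3(prev[j - 1] + replace, /* update or match */
                           prev[j] + 1,           /* delete x_i */
                           curr[j - 1] + 1);      /* insert y_j */
        }
        size_t *tmp = prev;                       /* swap the two rows */
        prev = curr;
        curr = tmp;
    }

    size_t d = prev[n];                           /* D[m,n] */
    free(prev);
    free(curr);
    return d;
}
```

For the strings rational and nation this returns 3, for example by updating r to n and deleting the trailing a and l.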

Administration We have finished Chapter 3 of Aho, Sethi and Ullman's book. Please read that chapter and Chapter 1, which we covered in Lectures 1 and 2. Work out the unstarred exercises of Chapter 3. Lex and Yacc manuals are handed out. Please read them.

The first project is on the web. It consists of three parts: 1) writing a lex program; 2) writing a YACC program; 3) writing five sample Java programs, which can be either applets or application programs.

Comments and Feedback Please let me know if you have not found a project partner. A sample Java compiler is on the class home page.