CSC 3315 (Spring 2009): Lexical and Syntax Analysis. Hamid Harroud, School of Science and Engineering, Akhawayn University



Constructing a Lexical Analyzer

    state = S                        // S is the start state
    repeat {
        k = next character from the input
        if k == EOF                  // the end of input
            if state is a final state then accept else reject
        state = T[state, k]
        if state == empty then reject    // got stuck
    }
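The table-driven loop above can be sketched in Java as follows. This is only an illustration: the automaton (one that accepts identifiers made of a letter followed by letters or digits), the class name DfaSketch, the character classes, and the STUCK marker are all assumptions chosen for the example, not part of the slides.

```java
/** Table-driven DFA simulation. The automaton and all names are illustrative assumptions. */
final class DfaSketch {
    static final int STUCK = -1;   // plays the role of the "empty" table entry
    static final int START = 0;    // the start state S

    // IS_FINAL[state] is true when the state is a final (accepting) state.
    static final boolean[] IS_FINAL = { false, true };

    // T[state][characterClass]; state 0 = start, state 1 = inside an identifier.
    // Character classes: 0 = letter, 1 = digit, 2 = anything else.
    static final int[][] T = {
        { 1, STUCK, STUCK },
        { 1, 1,     STUCK },
    };

    static int charClass(char c) {
        if (Character.isLetter(c)) return 0;
        if (Character.isDigit(c)) return 1;
        return 2;
    }

    /** Runs the DFA over the whole input; reaching the end of the string plays the role of EOF. */
    static boolean accepts(String input) {
        int state = START;
        for (int i = 0; i < input.length(); i++) {
            state = T[state][charClass(input.charAt(i))];
            if (state == STUCK) return false;   // got stuck: reject
        }
        return IS_FINAL[state];                 // end of input: accept only in a final state
    }

    public static void main(String[] args) {
        for (String s : new String[] { "x1", "count", "9lives", "" }) {
            System.out.println("\"" + s + "\" -> " + (accepts(s) ? "accept" : "reject"));
        }
    }
}
```

Keeping the automaton in a table means the loop never changes; recognizing a different token class only requires a different table and set of final states.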

Constructing a Lexical Analyzer

    int LexAnalyzer() {
        getChar();
        if (isLetter(nextChar)) {
            addChar();
            getChar();
            while (isLetter(nextChar) || isDigit(nextChar)) {
                addChar();
                getChar();
            }
            return lookup(lexeme);
        }
        ...

Constructing a Lexical Analyzer

    int LexAnalyzer() {
        getChar();
        if (isLetter(nextChar)) {
            ...
        } else if (isDigit(nextChar)) {
            addChar();
            getChar();
            while (isDigit(nextChar)) {
                addChar();
                getChar();
            }
            return INT_LIT;   // no break needed: return already leaves the function
        }
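The slides leave getChar(), addChar(), lookup() and the token codes undefined. A self-contained Java sketch of the same hand-written approach might look like the one below. The class name HandLexerSketch, the token codes, the two keywords, and the whitespace skipping are assumptions made for the example. Unlike the slide code, the sketch reads the first lookahead character once in the constructor, so the character that ends one token is still available when the next call starts.

```java
/** Hand-written scanner sketch. Token codes, keywords and helper names are illustrative assumptions. */
final class HandLexerSketch {
    static final int EOF = 0, IDENT = 1, INT_LIT = 2, KW_IF = 3, KW_WHILE = 4, UNKNOWN = 5;

    private final String input;
    private int pos = 0;
    private int nextChar;                                  // lookahead character, or -1 at end of input
    private final StringBuilder lexeme = new StringBuilder();

    HandLexerSketch(String input) {
        this.input = input;
        getChar();                                         // prime the lookahead once
    }

    private void getChar() { nextChar = pos < input.length() ? input.charAt(pos++) : -1; }
    private void addChar() { lexeme.append((char) nextChar); }
    private boolean atLetter() { return nextChar >= 0 && Character.isLetter(nextChar); }
    private boolean atDigit()  { return nextChar >= 0 && Character.isDigit(nextChar); }

    /** Text of the most recently returned token. */
    String lexeme() { return lexeme.toString(); }

    /** Distinguishes reserved words from ordinary identifiers. */
    static int lookup(String word) {
        switch (word) {
            case "if":    return KW_IF;
            case "while": return KW_WHILE;
            default:      return IDENT;
        }
    }

    int lexAnalyzer() {
        lexeme.setLength(0);
        while (nextChar == ' ' || nextChar == '\t' || nextChar == '\n') getChar();
        if (nextChar == -1) return EOF;
        if (atLetter()) {                                  // identifier or keyword
            addChar(); getChar();
            while (atLetter() || atDigit()) { addChar(); getChar(); }
            return lookup(lexeme.toString());
        }
        if (atDigit()) {                                   // integer literal
            addChar(); getChar();
            while (atDigit()) { addChar(); getChar(); }
            return INT_LIT;
        }
        addChar(); getChar();                              // any other single character
        return UNKNOWN;
    }

    public static void main(String[] args) {
        HandLexerSketch lexer = new HandLexerSketch("while x1 42");
        for (int tok = lexer.lexAnalyzer(); tok != EOF; tok = lexer.lexAnalyzer()) {
            System.out.println(tok + " \"" + lexer.lexeme() + "\"");
        }
    }
}
```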

Lexical Errors Consider the following two programs:

Lexical Errors

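Jlex: a scanner generator

[Figure: JLex workflow. JLex.Main (java) reads the jlex specification xxx.jlex and produces the generated scanner xxx.jlex.java; javac compiles xxx.jlex.java into Yylex.class; P.main (java) runs the scanner on the input program test.sim and produces the output of P.main.]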

Jlex: a scanner generator

    import java.io.FileReader;
    import java.io.IOException;

    // Yylex, Symbol, sym and IntLitTokenVal are defined elsewhere (not shown on the slide).
    public class P {
        public static void main(String[] args) throws IOException {
            FileReader inFile = new FileReader(args[0]);
            Yylex scanner = new Yylex(inFile);
            Symbol token = scanner.next_token();
            while (token.sym != sym.EOF) {
                switch (token.sym) {
                case sym.INTLITERAL:
                    System.out.println("INTLITERAL ("
                        + ((IntLitTokenVal) token.value).intVal + ")");
                    break;
                …
                }
                token = scanner.next_token();
            }
        }
    }

Regular expression rules

    regular-expression { action }

The regular expression is the pattern to be matched; the action is the code to be executed when the pattern is matched.

When the next_token() method is called, it repeats the following steps until a return in an action is executed:
1. Find the longest sequence of characters in the input (starting with the current character) that matches a pattern.
2. Perform the associated action.

Matching rules

If several patterns match at the current position, the pattern that matches the longest sequence of characters is considered to be matched.
If several patterns match the same (longest) sequence of characters, the first such pattern is considered to be matched, so the order of the patterns can be important!
If an input character is not matched by any pattern, the scanner throws an exception.
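The longest-match and first-pattern rules can be illustrated with a small Java sketch built on java.util.regex. It is not how JLex is implemented (JLex generates a DFA); it only imitates the observable behavior. The class name MaximalMunchSketch and the rule set are assumptions, and the IF keyword rule is added purely to show a tie between two patterns. The patterns are simple enough that the greedy regex match is also the longest match, which would not hold for arbitrary alternations.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Imitates longest-match scanning with first-rule tie-breaking. Rule set is an illustrative assumption. */
final class MaximalMunchSketch {
    // Rules listed in priority order, as they would appear in a scanner specification.
    static final String[] NAMES = { "IF", "ID", "INT", "ASSIGN", "EQUALS", "WS" };
    static final Pattern[] RULES = {
        Pattern.compile("if"),
        Pattern.compile("[a-zA-Z][a-zA-Z0-9]*"),
        Pattern.compile("[0-9]+"),
        Pattern.compile("="),
        Pattern.compile("=="),
        Pattern.compile("[ \\t\\n]+"),
    };

    /** Returns the token names found in the input, separated by spaces; whitespace is dropped. */
    static String scan(String input) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        while (pos < input.length()) {
            int bestRule = -1;
            int bestLen = 0;
            for (int r = 0; r < RULES.length; r++) {
                Matcher m = RULES[r].matcher(input);
                m.region(pos, input.length());
                // Strict '>' keeps the earlier rule when two rules match the same length.
                if (m.lookingAt() && m.end() - pos > bestLen) {
                    bestRule = r;
                    bestLen = m.end() - pos;
                }
            }
            if (bestRule < 0) {
                throw new IllegalStateException("no pattern matches at offset " + pos);
            }
            if (!NAMES[bestRule].equals("WS")) {
                if (out.length() > 0) out.append(' ');
                out.append(NAMES[bestRule]);
            }
            pos += bestLen;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints: IF ID ASSIGN INT EQUALS INT
        System.out.println(scan("if iffy = 1 == 2"));
    }
}
```

In the sample input, "if" is matched by both IF and ID with the same length, so the earlier rule wins; "iffy" goes to ID because ID matches more characters; and "==" goes to EQUALS rather than two ASSIGN tokens for the same reason.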

An Example

    %%

    DIGIT=      [0-9]
    LETTER=     [a-zA-Z]
    WHITESPACE= [ \t\n]    // space, tab, newline

    %%

    {LETTER}({LETTER}|{DIGIT})*  {System.out.println(yyline+1 + ": ID " + yytext());}
    {DIGIT}+                     {System.out.println(yyline+1 + ": INT");}
    "="                          {System.out.println(yyline+1 + ": ASSIGN");}
    "=="                         {System.out.println(yyline+1 + ": EQUALS");}
    {WHITESPACE}*                { }
    .                            {System.out.println(yyline+1 + ": bad char");}