CS30003: Compilers Lexical Analysis Lecture Date: 05/08/13 Submission By: DHANJIT DAS, 11CS10012.



What are Lexemes?
Before looking at “lexical analysis”, let's briefly define a lexeme.
■ A lexeme is a sequence of characters in the source that is grouped together because it matches a specific pattern.
■ A pattern describes the form that a group of lexemes can take.
■ Example: if var < tmp*6
What are the lexemes here?

Find lexemes: if var < tmp*6
if ← keyword
var ← identifier
< ← operator (relational)
tmp ← identifier
* ← operator (arithmetic)
6 ← constant
● Note: Space is discarded. In most compilers, spaces are stripped out.
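The breakdown above can be reproduced with a few regular expressions. This is only a minimal sketch: the pattern table `TOKEN_SPEC`, the keyword set and the function name `find_lexemes` are assumptions made for this one example, not part of the lecture.

```python
import re

# Hypothetical pattern table covering only what the example needs.
KEYWORDS = {"if"}
TOKEN_SPEC = [
    ("constant",   r"\d+"),
    ("identifier", r"[A-Za-z][A-Za-z0-9]*"),
    ("operator",   r"[<>*+\-/=]"),
    ("space",      r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def find_lexemes(source):
    """Return (lexeme, category) pairs; whitespace is discarded."""
    result = []
    # Note: finditer silently skips characters that match no pattern.
    for match in MASTER.finditer(source):
        category, lexeme = match.lastgroup, match.group()
        if category == "space":
            continue                      # spaces are stripped out
        if category == "identifier" and lexeme in KEYWORDS:
            category = "keyword"          # reserved words look like identifiers
        result.append((lexeme, category))
    return result

print(find_lexemes("if var < tmp*6"))
```

Keywords are recognised here by first matching the identifier pattern and then checking a keyword set, which is one common design choice among several.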

Tokens, Patterns... and Lexemes
● Generally, there is a set of strings in the input for which the same token is produced as output.
● A pattern is a rule that matches each string in this set.
● A lexeme is a sequence of characters in the source program that is matched by the pattern for a token.
● So, 'if' ← lexeme ; 'keyword' ← token ; the characters 'i', 'f' in sequence ← pattern

Token        Sample lexemes          Pattern (informal description)
enum
for
identifier   count, flag, var        letter followed by letters and digits
num          3.1416, 2, 0            a numeric constant
literal      “segmentation fault”    any characters between two quotation marks

Source code is a collection of lexemes. The patterns these lexemes follow are defined by the programming language.

Token Tuple
● From lexemes we construct tokens.
● A token is a tuple of two elements, although the second may be omitted: {token_name, attribute}
  token_name: symbolic representation of a specific lexeme
  attribute: optional
● Example: when 'if' is identified, 'token_name' is set to 'if'; keywords carry no attribute.

● When the lexical analyser encounters a lexeme, it generates the token_name and fills in the attribute with the name, type, etc. from the symbol table.
● The attribute points to the entry in the symbol table, or to memory.
● Numeric constants: the token can be represented in three ways → ■ ■ ← where “ptr” is a pointer to the number stored in memory
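The token tuple and its link to the symbol table can be sketched as follows. The names `Token`, `SymbolTable`, `lookup_or_insert` and `make_token`, the keyword set, and the choice of an integer index as the attribute are all assumptions for illustration. The sketch handles only keywords and identifiers.

```python
from typing import NamedTuple, Optional

class Token(NamedTuple):
    token_name: str
    attribute: Optional[int] = None       # assumed: index of a symbol-table entry

KEYWORDS = {"if", "then", "else"}         # assumed keyword set

class SymbolTable:
    def __init__(self):
        self.entries = []                 # one record per identifier
        self.index = {}                   # identifier name -> position in entries

    def lookup_or_insert(self, name):
        """Return the entry index for name, creating the entry if needed."""
        if name not in self.index:
            self.index[name] = len(self.entries)
            self.entries.append({"name": name, "type": None})
        return self.index[name]

def make_token(lexeme, table):
    """Build a token for a lexeme that is either a keyword or an identifier."""
    if lexeme in KEYWORDS:
        return Token(lexeme)              # keyword: token_name only, no attribute
    return Token("id", table.lookup_or_insert(lexeme))

table = SymbolTable()
for lexeme in ["if", "var", "tmp", "var"]:
    print(make_token(lexeme, table))
```

Both occurrences of `var` yield the same attribute, because the second lookup finds the existing symbol-table entry.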

Lexical Analyser – Parser relationship
● The lexical analyser does not read the entire source code in one go.
● Produced tokens are held in a buffer until they are consumed by the parser.
● The lexical analyser cannot proceed when the buffer is full, and the parser cannot proceed when the buffer is empty.
[Diagram: Source Code → Lexical Analyser → Parser]
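The full/empty buffer behaviour is the classic bounded-buffer arrangement. As a rough sketch, assuming two threads and Python's standard `queue.Queue` (the capacity, the `END` sentinel and the function names are illustrative choices, not from the lecture):

```python
import queue
import threading

BUFFER_SIZE = 2                           # assumed capacity, kept small on purpose
END = None                                # sentinel marking the end of the input

def lexical_analyser(lexemes, buffer):
    for lexeme in lexemes:
        buffer.put(lexeme)                # blocks while the buffer is full
    buffer.put(END)

def parser(buffer, consumed):
    while True:
        token = buffer.get()              # blocks while the buffer is empty
        if token is END:
            break
        consumed.append(token)

def run(lexemes):
    """Run producer and consumer to completion and return what the parser saw."""
    buffer = queue.Queue(maxsize=BUFFER_SIZE)
    consumed = []
    producer = threading.Thread(target=lexical_analyser, args=(lexemes, buffer))
    consumer = threading.Thread(target=parser, args=(buffer, consumed))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return consumed

print(run(["if", "var", "<", "tmp", "*", "6"]))
```

The tokens arrive in the order they were produced, however the two threads happen to interleave.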

[Diagram: Lexical Analyser, Symbol Table, Parser; the parser issues “get next token” and the lexical analyser returns a token]
● The schematic diagram is commonly implemented by making the lexical analyser a subroutine of the parser.
● Upon receiving a “get next token” command from the parser, the lexical analyser reads input characters until it can identify the next token.
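A minimal sketch of the subroutine arrangement: the parser drives the loop and calls the lexical analyser each time it needs a token. The class and method names are assumptions. To keep the example short, a token here is just the lexeme string, and any character that is not a letter or digit is treated as a one-character token.

```python
class LexicalAnalyser:
    def __init__(self, source):
        self.source = source
        self.pos = 0                      # position of the next unread character

    def get_next_token(self):
        """Read characters until the next token is identified; None at end of input."""
        src = self.source
        while self.pos < len(src) and src[self.pos].isspace():
            self.pos += 1                 # discard blanks
        if self.pos == len(src):
            return None
        start = self.pos
        if src[start].isalnum():
            while self.pos < len(src) and src[self.pos].isalnum():
                self.pos += 1             # extend the word or number
        else:
            self.pos += 1                 # one-character token such as '<' or '*'
        return src[start:self.pos]

def parse(source):
    """Stand-in for a parser: it only collects the tokens it requests."""
    lexer = LexicalAnalyser(source)
    tokens = []
    while (token := lexer.get_next_token()) is not None:
        tokens.append(token)
    return tokens

print(parse("if var < tmp*6"))
```

The lexical analyser does no work until asked, so it never reads further ahead than the parser requires.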

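if var < tmp*6
The lexical analyser will first read “if”: match keyword, generate token.
● NOTE: Read the next character also. Example: ifex = 5 ← ifex is not a keyword, so emitting the keyword 'if' after reading only its first two characters would be an error. The character following a candidate lexeme must therefore be scanned as well.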
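The point about scanning one character further can be illustrated with a small function that decides keyword versus identifier only after the whole run of letters and digits has been read. The function name `classify_word` and the keyword set are assumptions, and the sketch assumes the scan starts on a letter.

```python
KEYWORDS = {"if", "then", "else"}         # assumed keyword set

def classify_word(source, start=0):
    """Scan the longest letter/digit run from start, then classify it."""
    end = start
    while end < len(source) and source[end].isalnum():
        end += 1                          # the next character may still extend the lexeme
    lexeme = source[start:end]
    kind = "keyword" if lexeme in KEYWORDS else "identifier"
    return lexeme, kind, end              # end is the first character after the lexeme

print(classify_word("if var < tmp*6"))
print(classify_word("ifex = 5"))
```

In the first call the scan stops at the space after `if`, so the keyword is recognised. In the second it continues through `ifex`, which is then classified as an identifier.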

Lexical Analyser reads one data block
● In one go, the lexical analyser reads one data block from the source code.
● What is a data block? A block is a sequence of bytes or bits with a nominal length (the block size). Data structured this way are said to be blocked.
● Blocking makes the data stream easier to handle for the program receiving the data, in this case the lexical analyser.
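Block-wise reading can be sketched with a generator that hands out the input one block at a time. The block size of 8 is an assumption chosen so that the example is visible; real block sizes are much larger. `io.StringIO` stands in for a source file.

```python
import io

BLOCK_SIZE = 8                            # assumed; tiny on purpose for the demo

def read_blocks(stream, block_size=BLOCK_SIZE):
    """Yield the input one block at a time instead of one character at a time."""
    while True:
        block = stream.read(block_size)
        if not block:                     # empty string means end of input
            break
        yield block

blocks = list(read_blocks(io.StringIO("if var < tmp*6")))
print(blocks)
```

The last block may be shorter than the block size, and a lexeme can straddle a block boundary, which the scanning logic has to allow for.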

Forward and Begin Pointer
● Two pointers into the input buffer are maintained.
● The string of characters between the two pointers is the current lexeme.
● Forward pointer: scans ahead until a match for a pattern is found. Once a lexeme is found, the forward pointer is set to the character immediately to its right.
● Begin pointer: marks the beginning of the current lexeme being matched.

[Diagram: input buffer holding the characters w, h, i, l, e, with the begin pointer and the forward pointer marked]
“while” is the string between the begin and forward pointers. Once “while” is matched against the symbol table, its token can be generated. The next character also needs to be scanned.
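The two-pointer scheme can be sketched over an in-memory string, with `begin` and `forward` as plain indices. The function name `scan_lexemes` and the simplified rules (a lexeme is a run of letters and digits, or any other single character) are assumptions for illustration.

```python
def scan_lexemes(buffer):
    """Return the lexemes in buffer using a begin pointer and a forward pointer."""
    lexemes = []
    begin = 0
    while begin < len(buffer):
        if buffer[begin].isspace():
            begin += 1                    # skip blanks between lexemes
            continue
        forward = begin + 1
        if buffer[begin].isalnum():
            # advance until the character under forward no longer extends the lexeme
            while forward < len(buffer) and buffer[forward].isalnum():
                forward += 1
        lexemes.append(buffer[begin:forward])   # text between the two pointers
        begin = forward                   # the next lexeme starts where forward stopped
    return lexemes

print(scan_lexemes("while x<10"))
```

When `forward` stops, it sits on the character immediately to the right of the lexeme, which is where `begin` is moved for the next lexeme.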

END OF THIS LECTURE Date: 05/08/13