
Overview of Previous Lesson(s)

Overview
• A token is a pair consisting of a token name and an optional attribute value.
• A pattern is a description of the form that the lexemes of a token may take. In the case of a keyword as a token, the pattern is just the sequence of characters that form the keyword.
• A lexeme is a sequence of characters in the source program that matches the pattern for a token and is identified by the lexical analyzer as an instance of that token.
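
The three terms above can be illustrated with a small sketch. The token names and patterns below are hypothetical, chosen only for this example, and the attribute value is simply taken to be the lexeme itself; a real lexical analyzer would likely store something richer (for instance a symbol-table entry).

```python
import re

# Hypothetical token names and patterns (assumptions for illustration only).
TOKEN_PATTERNS = [
    ("IF", r"if\b"),              # keyword: the pattern is the keyword itself
    ("ID", r"[A-Za-z_]\w*"),      # identifier
    ("NUMBER", r"\d+"),           # unsigned integer
    ("RELOP", r"<=|>=|<>|<|>|="), # relational operators
    ("WS", r"\s+"),               # whitespace, skipped below
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_PATTERNS))

def tokenize(source):
    """Return (token name, lexeme) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise ValueError(f"unexpected character {source[pos]!r} at {pos}")
        if m.lastgroup != "WS":
            # m.lastgroup is the token name; m.group() is the matched lexeme.
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

For example, `tokenize("if count <= 10")` yields the pairs `("IF", "if")`, `("ID", "count")`, `("RELOP", "<=")`, `("NUMBER", "10")`.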

Overview (cont.)
• A regular expression is a sequence of characters that forms a search pattern, mainly for use in pattern matching with strings.
• Roughly, the regular expressions over an alphabet consist of the symbols of the alphabet together with the expressions built from them using union, concatenation, and * (Kleene closure); a precise definition takes more words.
• Each regular expression r denotes a language L(r), which is defined recursively from the languages denoted by r's subexpressions.
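
The recursive definition of L(r) mentioned above is usually written along the following lines. This is the standard textbook formulation rather than something taken from the slides; r and s stand for arbitrary regular expressions and a for a symbol of the alphabet.

```latex
\begin{align*}
L(\varepsilon) &= \{\varepsilon\} \\
L(a)           &= \{a\} \quad \text{for each symbol } a \text{ of the alphabet} \\
L(r \mid s)    &= L(r) \cup L(s) \\
L(rs)          &= L(r)\,L(s) \\
L(r^{*})       &= \bigl(L(r)\bigr)^{*}
\end{align*}
```

Under these rules, L((a|b)*abb) is the set of all strings of a's and b's that end in abb.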

Overview (cont.)
• As an intermediate step in the construction of a lexical analyzer, we first convert patterns into stylized flowcharts, called "transition diagrams".
• Transition diagrams have a collection of nodes or circles, called states.
• Each state represents a condition that could occur during the process of scanning the input looking for a lexeme that matches one of several patterns.
• Edges are directed from one state of the transition diagram to another.
• Each edge is labeled by a symbol or set of symbols.

Overview (cont.)
• Figure: a transition diagram that recognizes the lexemes matching the token relop.

Overview (cont.)
• Figure: the transition diagram for the token number.

Overview (cont.)
• Finite automata are like the graphs in transition diagrams, but they simply decide whether a sentence (input string) is in the language (generated by our regular expression).
• Finite automata are recognizers: they simply say "yes" or "no" about each possible input string.
• Deterministic finite automata (DFA) have, for each state and for each symbol of the input alphabet, exactly one edge with that symbol leaving that state.
• So if you know the next symbol and the current state, the next state is determined. That is, the execution is deterministic, hence the name.

Overview (cont.)
• Nondeterministic finite automata (NFA) have no restrictions on the labels of their edges. A symbol can label several edges out of the same state, and ε, the empty string, is a possible label.
• Both deterministic and nondeterministic finite automata are capable of recognizing the same languages.

Overview (cont.)
• Figure: transition graph for an NFA recognizing the language of the regular expression (a|b)*abb.
• Figure: transition table for (a|b)*abb.

Contents
• Acceptance of Input Strings by Automata
• Deterministic Finite Automata
• Simulating a DFA
• Regular Expressions to Automata
• Conversion of an NFA to a DFA

Acceptance of Input Strings
• An NFA accepts a string if the symbols of the string specify a path from the start state to an accepting state.
• These symbols may specify several paths, some of which lead to accepting states and some of which don't.
• In such a case the NFA does accept the string: one successful path is enough.
• If an edge is labeled ε, then it can be taken for free, that is, without consuming an input symbol.

Acceptance of Input Strings (cont.)
• Ex. Reconsider the following transition graph (TG).
• Now we will see how the string aabb is accepted by the NFA.

Acceptance of Input Strings (cont.)
• One more path is labeled aabb.
• This path leads to state 0, which is not accepting.
• An NFA accepts a string as long as some path labeled by that string leads from the start state to an accepting state.
• The existence of other paths leading to a non-accepting state is irrelevant.
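
The acceptance rule can be sketched by enumerating every path labeled by the input. The transcript does not include the transition graph, so the four-state NFA below (start state 0, accepting state 3) is an assumed reconstruction of a common NFA for (a|b)*abb. It has no ε-edges, so the sketch does not handle them.

```python
# Assumed NFA for (a|b)*abb: (state, symbol) -> set of next states.
# The figure is missing from the transcript; this table is illustrative.
NFA = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "b"): {3},
}
START, ACCEPTING = 0, {3}

def paths(state, text):
    """Yield every path (list of states) from `state` labeled by `text`."""
    if not text:
        yield [state]
        return
    for nxt in sorted(NFA.get((state, text[0]), ())):
        for rest in paths(nxt, text[1:]):
            yield [state] + rest

def accepts(text):
    """Accept if at least one path ends in an accepting state."""
    return any(p[-1] in ACCEPTING for p in paths(START, text))
```

With this table, `list(paths(START, "aabb"))` gives two paths, `[0, 0, 0, 0, 0]` and `[0, 0, 1, 2, 3]`. The first ends in the non-accepting state 0, but the second ends in state 3, so `accepts("aabb")` is `True`.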

Deterministic Finite Automata
• A deterministic finite automaton (DFA) is a special case of an NFA where:
  • There are no moves on input ε, and
  • For each state s and input symbol a, there is exactly one edge out of s labeled a.
• If we are using a transition table to represent a DFA, then each entry is a single state. We may therefore represent this state without the curly braces that we use to form sets.

Simulating a DFA
• An NFA is an abstract representation of an algorithm to recognize the strings of a certain language, whereas a DFA is a simple, concrete algorithm for recognizing strings.
• It is fortunate indeed that every regular expression and every NFA can be converted to a DFA accepting the same language.
• Now we will see an algorithm that shows how to apply a DFA to a string.

Simulating a DFA (cont.)
• Figure: the DFA simulation algorithm, applied to the input string x.

Simulating a DFA (cont.)
• The function move(s, c) gives the state to which there is an edge from state s on input c.
• The function nextChar returns the next character of the input string x.
• Ex. Transition graph of a DFA accepting the language (a|b)*abb, the same language as that accepted by the NFA seen previously.
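
Since the algorithm and the DFA figure are not reproduced in the transcript, here is a minimal sketch of the usual simulation loop built from move and nextChar. The transition table is an assumed four-state DFA for (a|b)*abb (start state 0, accepting state 3), and `None` stands in for the end-of-input marker.

```python
# Assumed DFA for (a|b)*abb; every entry is a single state, not a set.
DTRAN = {
    0: {"a": 1, "b": 0},
    1: {"a": 1, "b": 2},
    2: {"a": 1, "b": 3},
    3: {"a": 1, "b": 0},
}
START, ACCEPTING = 0, {3}

def move(s, c):
    """State reached from state s on input character c."""
    return DTRAN[s][c]

def simulate_dfa(x):
    """Answer "yes" if the DFA accepts x, otherwise "no"."""
    chars = iter(x)

    def next_char():
        # Returns the next character of x, or None at end of input.
        return next(chars, None)

    s = START
    c = next_char()
    while c is not None:
        s = move(s, c)
        c = next_char()
    return "yes" if s in ACCEPTING else "no"
```

For instance, `simulate_dfa("aabb")` visits states 0, 1, 1, 2, 3 and returns `"yes"`, while `simulate_dfa("abab")` ends in state 2 and returns `"no"`. A character outside {a, b} raises `KeyError` in this sketch.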

RE to Automata
• The regular expression is the notation of choice for describing lexical analyzers and other pattern-processing software.
• Implementation of that software requires the simulation of a DFA, or perhaps the simulation of an NFA.
• An NFA often has a choice of moves on an input symbol or on ε, or even a choice between making a transition on ε and making one on a real input symbol.
• Its simulation is therefore less straightforward than that of a DFA.
• So it is important to be able to convert an NFA to a DFA that accepts the same language.

Conversion of NFA to DFA
• The general idea behind the subset construction is that each state of the constructed DFA corresponds to a set of NFA states.
• PROCEDURE:
  INPUT: An NFA N.
  OUTPUT: A DFA D accepting the same language as N.
  METHOD: Construct a transition table Dtran for D. Each state of D is a set of NFA states, and we construct Dtran so that D will simulate "in parallel" all possible moves N can make on a given input string.

Conversion of NFA to DFA (cont.)
• The first issue is to deal with the ε-transitions of N properly.
• Table: definitions of the functions the algorithm needs, each describing a basic computation on the states of N: ε-closure(s), ε-closure(T), and move(T, a).
• Here s is a single state of N, while T is a set of states of N.

Conversion of NFA to DFA (cont.)
• As a basis, before reading the first input symbol, N can be in any of the states of ε-closure(s0), where s0 is its start state.
• For the induction, suppose that N can be in the set of states T after reading input string x.
• If it next reads input a, then N can immediately go to any of the states in move(T, a).
• After reading a, it may also make several ε-transitions; thus N could be in any state of ε-closure(move(T, a)) after reading input xa.
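
The two set-level operations used in the basis and induction steps can be sketched as follows. The small NFA, the use of the empty string as the ε label, and the helper name `step` are assumptions made for this example only.

```python
EPS = ""  # label used here for ε-edges (an assumption of this sketch)

# Tiny hypothetical NFA: (state, label) -> set of next states.
EDGES = {
    (0, EPS): {1, 2},
    (1, "a"): {3},
    (2, "b"): {3},
    (3, EPS): {0},
}

def eps_closure(T):
    """States reachable from some state in T using ε-edges alone."""
    closure = set(T)
    stack = list(T)
    while stack:
        t = stack.pop()
        for u in EDGES.get((t, EPS), ()):
            if u not in closure:
                closure.add(u)
                stack.append(u)
    return closure

def move(T, a):
    """States reachable from some state in T by one edge labeled a."""
    return {u for t in T for u in EDGES.get((t, a), ())}

def step(T, a):
    """States N can be in after reading a, having been in set T before."""
    return eps_closure(move(T, a))
```

Here `eps_closure({0})` is `{0, 1, 2}` (the basis), and `step({0, 1, 2}, "a")` is `{0, 1, 2, 3}`: reading a leads to state 3, and the ε-edges from 3 then reach 0, 1, and 2 (the induction step).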

Conversion of NFA to DFA (cont.)
• Ex. Let us consider the following transition graph, which is an NFA that accepts strings satisfying the regular expression (a|b)*abb. The alphabet is {a, b}.

Conversion of NFA to DFA (cont.)
• The start state of D is the set of N-states that can result when N processes the empty string ε.
• This is called the ε-closure of the start state s0 of N, and consists of s0 itself together with those N-states that can be reached from s0 by following edges labeled with ε.
• Figure: calculation of ε-closure(0), that is, D0.

Conversion of NFA to DFA (cont.)
• ε-closure(0) = D0 = {0,1,2,4,7}
• We call this state D0 and enter it in the transition table:

  NFA States          DFA State   a    b
  {0,1,2,4,7}         D0

Conversion of NFA to DFA (cont.)
• Next we want the a-successor of D0, i.e., the D-state that occurs when we start at D0 and move along an edge labeled a.
• We call this successor D1.
• Since D0 consists of the N-states corresponding to ε, D1 is the N-states corresponding to εa = a.
• We compute the a-successor of all the N-states in D0 and then form the ε-closure: ε-closure(move(D0, a)) = D1 = ?

Conversion of NFA to DFA (cont.)
• Calculation of D1: ε-closure(move(D0, a)) = ε-closure(move({0,1,2,4,7}, a))

Conversion of NFA to DFA (cont.)
• Result for the a-successor of D0: ε-closure(move(D0, a)) = D1 = {1,2,3,4,6,7,8}

Conversion of NFA to DFA (cont.)
• The transition table is now:

  NFA States          DFA State   a    b
  {0,1,2,4,7}         D0          D1
  {1,2,3,4,6,7,8}     D1

• Next we compute the b-successor of D0 the same way and call it D2.

Conversion of NFA to DFA (cont.)
• Calculation of D2: ε-closure(move(D0, b)) = ε-closure(move({0,1,2,4,7}, b))

Conversion of NFA to DFA (cont.)
• The transition table is now:

  NFA States          DFA State   a    b
  {0,1,2,4,7}         D0          D1   D2
  {1,2,3,4,6,7,8}     D1
  {1,2,4,5,6,7}       D2

Conversion of NFA to DFA (cont.)
• We continue forming a- and b-successors of all the D-states until no new D-states result.
• So the final transition table is:

  NFA States          DFA State   a    b
  {0,1,2,4,7}         D0          D1   D2
  {1,2,3,4,6,7,8}     D1          D1   D3
  {1,2,4,5,6,7}       D2          D1   D2
  {1,2,4,5,6,7,9}     D3          D1   D4
  {1,2,4,5,6,7,10}    D4          D1   D2
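
The whole procedure can be sketched as a worklist loop. The NFA figure is not part of the transcript, so the edge table below is an assumption: an 11-state NFA for (a|b)*abb (start state 0, accepting state 10) chosen because it is consistent with the state sets listed in the table. The empty string again stands for ε.

```python
from collections import deque

EPS = ""  # label used here for ε-edges (an assumption of this sketch)

# Assumed NFA for (a|b)*abb; the transcript omits the figure.
EDGES = {
    (0, EPS): {1, 7}, (1, EPS): {2, 4}, (2, "a"): {3}, (4, "b"): {5},
    (3, EPS): {6}, (5, EPS): {6}, (6, EPS): {1, 7},
    (7, "a"): {8}, (8, "b"): {9}, (9, "b"): {10},
}
START, ACCEPTING, ALPHABET = 0, {10}, "ab"

def eps_closure(T):
    """States reachable from some state in T using ε-edges alone."""
    closure, stack = set(T), list(T)
    while stack:
        t = stack.pop()
        for u in EDGES.get((t, EPS), ()):
            if u not in closure:
                closure.add(u)
                stack.append(u)
    return closure

def move(T, a):
    """States reachable from some state in T by one edge labeled a."""
    return {u for t in T for u in EDGES.get((t, a), ())}

def subset_construction():
    """Return (names, dtran): NFA-state sets -> DFA names, and Dtran."""
    start = frozenset(eps_closure({START}))
    names = {start: "D0"}
    dtran = {}
    work = deque([start])          # D-states whose successors are pending
    while work:
        T = work.popleft()
        for a in ALPHABET:
            U = frozenset(eps_closure(move(T, a)))
            if U not in names:     # a new D-state was discovered
                names[U] = f"D{len(names)}"
                work.append(U)
            dtran[(names[T], a)] = names[U]
    return names, dtran
```

Under these assumptions, `subset_construction()` discovers the five sets in the order D0 to D4 and reproduces the final table, for example `dtran[("D3", "b")] == "D4"`. The only D-state containing the NFA's accepting state 10 is D4, so D4 is the accepting state of the DFA.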

Conversion of NFA to DFA (cont.)
• Figure: the DFA obtained by applying this procedure to the NFA.

Thank You