COMP3190: Principle of Programming Languages DFA and its equivalent, scanner.



Outline
- DFA & NFA
  - DFA
  - NFA
  - NFA → DFA
  - Minimize DFA
- Regular expression
- Regular languages
- Scanner

Example of a DFA
[Figure: state diagram with states q1 and q2, and its transition table δ over inputs 0 and 1]

Deterministic Finite Automata (DFA)
- 5-tuple (Q, Σ, δ, q0, F):
  - Q: finite set of states
  - Σ: finite set of "letters" (alphabet)
  - δ: Q × Σ → Q (transition function)
  - q0: start state (in Q)
  - F: set of accept states (subset of Q)
- Acceptance: an input string is accepted if, once the whole string has been consumed, the automaton is in an accept state.
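The 5-tuple maps almost directly onto code. The sketch below is a minimal, hypothetical example (the class name `EvenZerosDfa` and the choice of language are assumptions, not taken from the slides): a two-state DFA over {0, 1} that accepts strings containing an even number of 0s, with δ stored as a table indexed by state and input symbol.

```java
class EvenZerosDfa {
    // Q = {0, 1}: state 0 = even number of 0s seen, state 1 = odd number.
    static final int START = 0;                    // q0
    static final boolean[] ACCEPT = {true, false}; // F = {0}
    // DELTA[state][symbol]: a 0 flips the parity, a 1 leaves it unchanged.
    static final int[][] DELTA = {
        {1, 0},
        {0, 1},
    };

    static boolean accepts(String input) {
        int state = START;
        for (char c : input.toCharArray()) {
            if (c != '0' && c != '1') {
                throw new IllegalArgumentException("symbol not in alphabet: " + c);
            }
            state = DELTA[state][c - '0'];         // exactly one move per symbol
        }
        return ACCEPT[state];                      // accept iff we end in F
    }

    public static void main(String[] args) {
        System.out.println(accepts("1001"));
        System.out.println(accepts("000"));
    }
}
```

Because δ is a total function, the driver loop never has to choose between moves; that is the property the NFA definition relaxes.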

Another Example of a DFA
[Figure: state diagram with states S, q1, q2, r1, r2 over the alphabet {a, b}]


Non-deterministic Finite Automata (NFA)
The transition function is different:
- δ: Q × Σ_ε → P(Q)
- P(Q) is the powerset of Q (the set of all subsets)
- Σ_ε is the union of Σ and the special symbol ε (denoting the empty string)
A string is accepted if at least one path consumes the whole input and ends in an accept state.

Example of an NFA
[Figure: q1 loops on 0, 1; q1 → q2 on 1; q2 → q3 on 0, ε; q3 → q4 on 1; q4 loops on 0, 1]

δ     0      1          ε
q1    {q1}   {q1, q2}   ∅
q2    {q3}   ∅          {q3}
q3    ∅      {q4}       ∅
q4    {q4}   {q4}       ∅

What strings does this NFA accept?
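One way to explore the question is to simulate the NFA by tracking the set of states it could be in. The sketch below assumes q1 is the start state and q4 the only accept state (the transcript does not say so explicitly), and numbers q1..q4 as 0..3; the class and method names are illustrative only.

```java
import java.util.*;

class NfaDemo {
    static final int EPS = 2, START = 0, ACCEPT = 3;
    // DELTA[state][symbol] lists target states; symbol index 2 is ε.
    static final int[][][] DELTA = {
        { {0}, {0, 1}, {}  },  // q1
        { {2}, {},     {2} },  // q2
        { {},  {3},    {}  },  // q3
        { {3}, {3},    {}  },  // q4
    };

    // All states reachable from `states` using only ε-moves.
    static Set<Integer> closure(Set<Integer> states) {
        Deque<Integer> work = new ArrayDeque<>(states);
        Set<Integer> seen = new TreeSet<>(states);
        while (!work.isEmpty()) {
            for (int t : DELTA[work.pop()][EPS]) {
                if (seen.add(t)) work.push(t);
            }
        }
        return seen;
    }

    static boolean accepts(String input) {
        Set<Integer> current = closure(Set.of(START));
        for (char c : input.toCharArray()) {
            if (c != '0' && c != '1') {
                throw new IllegalArgumentException("symbol not in alphabet: " + c);
            }
            Set<Integer> next = new TreeSet<>();
            for (int s : current) {
                for (int t : DELTA[s][c - '0']) next.add(t);
            }
            current = closure(next);
        }
        // Accepted if at least one path ended in the accept state.
        return current.contains(ACCEPT);
    }
}
```

Under these assumptions, trying a few inputs suggests the pattern: the strings accepted are those that contain 11 or 101 as a substring.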


Converting an NFA to a DFA
- For a set of states S, ε-closure(S) is the set of states that can be reached from S without consuming any input.
- For a set of states S, DFAedge(S, c) is the set of states that can be reached from S by consuming input symbol c.
- Each set of NFA states corresponds to one DFA state (hence at most 2^n DFA states for an NFA with n states).
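The two definitions above are enough to write the subset construction as a worklist loop. The sketch below is illustrative only: the NFA encoded in `NFA` is a hypothetical four-state automaton over {0, 1} (state 0 is the start), and the names `closure`, `dfaEdge` and `toDfa` are chosen here for the example.

```java
import java.util.*;

class SubsetConstruction {
    static final int EPS = 2;
    // NFA[state][symbol] lists target states; symbol index 2 is ε.
    static final int[][][] NFA = {
        { {0}, {0, 1}, {}  },
        { {2}, {},     {2} },
        { {},  {3},    {}  },
        { {3}, {3},    {}  },
    };

    // ε-closure(S): states reachable from S without consuming input.
    static Set<Integer> closure(Set<Integer> states) {
        Deque<Integer> work = new ArrayDeque<>(states);
        Set<Integer> seen = new TreeSet<>(states);
        while (!work.isEmpty()) {
            for (int t : NFA[work.pop()][EPS]) {
                if (seen.add(t)) work.push(t);
            }
        }
        return seen;
    }

    // DFAedge(S, c): states reachable from S by consuming symbol c.
    static Set<Integer> dfaEdge(Set<Integer> states, int c) {
        Set<Integer> moved = new TreeSet<>();
        for (int s : states) {
            for (int t : NFA[s][c]) moved.add(t);
        }
        return closure(moved);
    }

    // Maps each reachable DFA state (a set of NFA states) to its row [on 0, on 1].
    static Map<Set<Integer>, List<Set<Integer>>> toDfa(int start) {
        Map<Set<Integer>, List<Set<Integer>>> table = new LinkedHashMap<>();
        Deque<Set<Integer>> work = new ArrayDeque<>();
        work.add(closure(Set.of(start)));
        while (!work.isEmpty()) {
            Set<Integer> current = work.remove();
            if (table.containsKey(current)) continue;  // already expanded
            List<Set<Integer>> row = List.of(dfaEdge(current, 0), dfaEdge(current, 1));
            table.put(current, row);
            work.addAll(row);
        }
        return table;
    }
}
```

Only subsets reachable from the start set are generated, which is why the resulting DFA is usually far smaller than the 2^n worst case.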

ε-closure({1}) = {1, 2} = I
J = {5, 4, 3}
ε-closure(J) = ε-closure({5, 4, 3}) = {5, 4, 3, 6, 2, 7, 8}
[Figure: NFA fragment with a-labelled and ε-labelled edges]

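I                 Ia                Ib
{X,5,1}           {5,3,1}           {5,4,1}
{5,3,1}           {5,2,3,1,6,Y}     {5,4,1}
{5,4,1}           {5,3,1}           {5,2,4,1,6,Y}
{5,2,3,1,6,Y}     {5,2,3,1,6,Y}     {5,4,6,1,Y}
{5,4,6,1,Y}       {5,3,6,1,Y}       {5,2,4,1,6,Y}
{5,2,4,1,6,Y}     {5,3,6,1,Y}       {5,2,4,1,6,Y}
{5,3,6,1,Y}       {5,2,3,1,6,Y}     {5,4,6,1,Y}

[Figure: the NFA, with start state X, accept state Y, and edges labelled a, b and ε]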

[Figure: the resulting DFA, with transitions labelled a and b]

Convert NFA to DFA
[Figure: an NFA and its equivalent DFA]

NFA to DFA Exercises
- Convert the following NFAs to DFAs.
[Figures: two NFAs]
The second one has 15 possible states (it might be easier to represent in table format).


Equivalent States. Example
Consider the accept states c and g. They are both sinks, meaning that any string which ever reaches them is guaranteed to be accepted later.
Q: Do we need both states?
[Figure: DFA with states a, b, c, e, f, g]

Equivalent States. Example
A: No, they can be unified as illustrated below.
Q: Can any other states be unified because any subsequent string suffixes produce identical results?
[Figure: the same DFA with c and g merged into cg]

Equivalent States. Example
A: Yes, b and f. Notice that if you're in b or f then:
1. if the string ends, reject in both cases
2. if the next character is 0, forever accept in both cases
3. if the next character is 1, forever reject in both cases
So unify b with f.
[Figure: the DFA with states a, b, e, cg, f]

Equivalent States. Example
Intuitively, two states are equivalent if all subsequent behavior from those states is the same.
Q: Come up with a formal characterization of state equivalence.
[Figure: the DFA with states a, d, e, cg, bf]

Obtaining the minimal equivalent DFA
- Initially two equivalence classes: accept and non-accept states.
- Search for an equivalence class C and an input letter a such that, with a as input, the states in C make transitions to states in k > 1 different equivalence classes.
- Partition C into k classes accordingly.
- Repeat until unable to find a class to partition.
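The refinement loop above can be sketched in a few lines. This version is a simplification, not necessarily the procedure used in the lecture: instead of splitting one class at a time, each round re-labels every state by the tuple (its own class, the classes of its successors), which splits all classes at once and stops when a round creates no new class. The four-state DFA in `main` is hypothetical.

```java
import java.util.*;

class DfaMinimizer {
    /** Returns classOf[s], the index of the equivalence class of state s. */
    static int[] minimize(int[][] delta, boolean[] accept) {
        int n = delta.length;
        int[] classOf = new int[n];
        // Initial partition: accept vs. non-accept states.
        for (int s = 0; s < n; s++) classOf[s] = accept[s] ? 1 : 0;
        int previousCount = 0;
        while (true) {
            Map<List<Integer>, Integer> ids = new HashMap<>();
            int[] next = new int[n];
            for (int s = 0; s < n; s++) {
                // Signature: own class followed by the class reached on each input.
                List<Integer> signature = new ArrayList<>();
                signature.add(classOf[s]);
                for (int target : delta[s]) signature.add(classOf[target]);
                Integer id = ids.get(signature);
                if (id == null) {
                    id = ids.size();
                    ids.put(signature, id);
                }
                next[s] = id;
            }
            // Refinement only ever splits classes, so an unchanged count means stable.
            if (ids.size() == previousCount) return next;
            previousCount = ids.size();
            classOf = next;
        }
    }

    public static void main(String[] args) {
        // Hypothetical DFA over {0, 1}; delta[state] = {target on 0, target on 1}.
        int[][] delta = { {2, 1}, {2, 3}, {2, 1}, {3, 3} };
        boolean[] accept = { false, false, false, true };
        System.out.println(Arrays.toString(minimize(delta, accept)));
    }
}
```

In the example, input 1 separates state 1 from states 0 and 2 in the first round, and no further split is possible, so states 0 and 2 end up in the same class.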

Minimization Example
Split into two teams: ACCEPT vs. NONACCEPT.

Minimization Example
The 0-label doesn't split up any teams.

Minimization Example
The 1-label splits up the NONACCEPTs.

Minimization Example
No further splits. HALT! The start team contains the original start state.

Minimization Example. End Result
States of the minimal automaton are the remaining teams. Edges are consolidated across each team. Accept states are break-offs from the original ACCEPT team.

Minimization Example. Compare
[Figure: the original DFA next to the minimized DFA]

Class Exercise
[Figure: DFA with states a, b, c, d, e, f, g, h]


Regular Expressions
R is a regular expression if R is
- a, for some a in Σ
- ε (the empty string)
- ∅ (the empty language)
- R1 ∪ R2, the union of two regular expressions
- R1 R2, the concatenation of two regular expressions
- R1* (Kleene closure: zero or more repetitions of R1)

Examples of Regular Expressions
- {0, 1}* 0: all strings that end in 0
- {0, 1} 0*: strings that start with 1 or 0, followed by zero or more 0s
- {0, 1}*: all strings
- {0^n 1^n, n ≥ 0}: not a regular language, so no regular expression describes it

Regular Expressions in Java
- Example: pattern matching.
- Is the text in the set described by the pattern?

public class RE {
    public static void main(String[] args) {
        String pattern = args[0];  // the regular expression
        String text = args[1];     // the string to test
        // matches() is true only if the whole text matches the pattern
        System.out.println(text.matches(pattern));
    }
}

% java RE "..oo.oo." bloodroot
true
% java RE '[$_A-Za-z][$_A-Za-z0-9]*' ident123
true
% java RE true

Regular Expression Notation in Java
- a: an ordinary letter
- ε: the empty string
- M | N: choosing from M or N
- MN: concatenation of M and N
- M*: zero or more times (Kleene star)
- M+: one or more times
- M?: zero or one occurrence
- [a-zA-Z]: character set alternation (choice)
- . (period): any single character except newline
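To make the notation concrete, here is a small sketch that exercises each operator through `String.matches`. The pattern constants and their names are made up for illustration; note that `matches` tests the whole string, not a substring.

```java
class RegexNotationDemo {
    static final String ENDS_IN_ZERO = "[01]*0";        // character set, Kleene star, concatenation
    static final String SYMBOL_THEN_ZEROS = "[01]0*";   // one of 0 or 1, then zero or more 0s
    static final String SIGNED_INTEGER = "[+-]?[0-9]+"; // M? followed by M+
    static final String CAT_OR_DOG = "cat|dog";         // M | N
    static final String ANY_MIDDLE = "a.c";             // period: any single character

    public static void main(String[] args) {
        System.out.println("1010".matches(ENDS_IN_ZERO));
        System.out.println("101".matches(SYMBOL_THEN_ZEROS));
        System.out.println("-42".matches(SIGNED_INTEGER));
        System.out.println("dog".matches(CAT_OR_DOG));
        System.out.println("abc".matches(ANY_MIDDLE));
    }
}
```

The first two constants are the set-notation examples {0, 1}* 0 and {0, 1} 0* rewritten with Java character classes.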

Converting a regular expression to an NFA
- Empty string
- Single character
- Union operator
- Concatenation
- Kleene closure

Regular expression → NFA
Language: strings of 0s and 1s in which the number of 0s is even
Regular expression: (1*01*0)*1*
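One common way to carry out this conversion is a Thompson-style construction, where each operator glues smaller NFA fragments together with ε-edges. The sketch below is an assumption about how the conversion might be coded, not the slides' own construction; it builds the NFA for (1*01*0)*1* by hand from combinators (`sym`, `concat`, `union`, `star` are names invented here) and then simulates it.

```java
import java.util.*;

class Thompson {
    static class State {
        final Map<Character, List<State>> on = new HashMap<>(); // labelled edges
        final List<State> eps = new ArrayList<>();               // ε-edges
    }
    // NFA fragment with one start state and one accept state.
    static class Frag {
        final State start, accept;
        Frag(State start, State accept) { this.start = start; this.accept = accept; }
    }

    static Frag sym(char c) {                       // single character
        State s = new State(), a = new State();
        s.on.computeIfAbsent(c, k -> new ArrayList<>()).add(a);
        return new Frag(s, a);
    }
    static Frag concat(Frag x, Frag y) {            // concatenation
        x.accept.eps.add(y.start);
        return new Frag(x.start, y.accept);
    }
    static Frag union(Frag x, Frag y) {             // union operator
        State s = new State(), a = new State();
        s.eps.add(x.start); s.eps.add(y.start);
        x.accept.eps.add(a); y.accept.eps.add(a);
        return new Frag(s, a);
    }
    static Frag star(Frag x) {                      // Kleene closure
        State s = new State(), a = new State();
        s.eps.add(x.start); s.eps.add(a);
        x.accept.eps.add(x.start); x.accept.eps.add(a);
        return new Frag(s, a);
    }

    static Set<State> closure(Set<State> states) {
        Deque<State> work = new ArrayDeque<>(states);
        Set<State> seen = new HashSet<>(states);
        while (!work.isEmpty()) {
            for (State t : work.pop().eps) {
                if (seen.add(t)) work.push(t);
            }
        }
        return seen;
    }

    static boolean matches(Frag nfa, String input) {
        Set<State> current = closure(Set.of(nfa.start));
        for (char c : input.toCharArray()) {
            Set<State> next = new HashSet<>();
            for (State s : current) next.addAll(s.on.getOrDefault(c, List.of()));
            current = closure(next);
        }
        return current.contains(nfa.accept);
    }

    // (1*01*0)*1*  -- strings with an even number of 0s
    static Frag evenZeros() {
        Frag body = concat(concat(concat(star(sym('1')), sym('0')), star(sym('1'))), sym('0'));
        return concat(star(body), star(sym('1')));
    }
}
```

Each fragment is used exactly once when it is combined, so sharing states between fragments is not a concern in this sketch; `union` is included for completeness even though this particular expression does not need it.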

NFA → DFA
Initial classes: {A, B, E}, {C, D}
No class requires partitioning! Hence a two-state DFA is obtained.

Minimize DFA


Regular language
- A formal language
  - a set of finite sequences of symbols from a finite alphabet
- It can be generated by a regular grammar

Regular Grammar
- Later definitions build on earlier ones
- Nothing is defined in terms of itself (no recursion)
Regular grammar for numeric literals in Pascal:
  digit → 0 | 1 | 2 | ... | 8 | 9
  unsigned_integer → digit digit*
  unsigned_number → unsigned_integer ((. unsigned_integer) | ε) ((e (+ | - | ε) unsigned_integer) | ε)
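Because no rule refers to itself, the grammar can be flattened into a single pattern by substituting each definition into the next. A rough Java transcription follows; the class and constant names are illustrative, and only lowercase `e` is handled, exactly as the grammar is written.

```java
import java.util.regex.Pattern;

class PascalNumber {
    // digit → 0|1|...|9
    static final String DIGIT = "[0-9]";
    // unsigned_integer → digit digit*
    static final String UNSIGNED_INTEGER = DIGIT + DIGIT + "*";
    // unsigned_number → unsigned_integer ((. unsigned_integer) | ε) ((e (+|-|ε) unsigned_integer) | ε)
    static final String UNSIGNED_NUMBER =
        UNSIGNED_INTEGER
        + "((\\." + UNSIGNED_INTEGER + ")|)"          // optional fraction; empty branch plays the role of ε
        + "((e(\\+|-|)" + UNSIGNED_INTEGER + ")|)";   // optional exponent with optional sign

    static final Pattern PATTERN = Pattern.compile(UNSIGNED_NUMBER);

    static boolean isUnsignedNumber(String s) {
        return PATTERN.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isUnsignedNumber("6.02e+23"));
        System.out.println(isUnsignedNumber(".5"));
    }
}
```

Building the pattern from named constants mirrors the rule that later definitions build on earlier ones; a recursive rule could not be expanded this way.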

Important Theorems
- A language is regular if and only if some regular expression describes it.
- A language is regular if and only if some finite automaton recognizes it.
- DFAs and NFAs are equally powerful.


Scanning
- Accept the longest possible token in each invocation of the scanner.
- Implementation: capture the finite automaton as
  - case (switch) statements, or
  - a table and a driver.
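A minimal sketch of the case-statement approach with longest match: the automaton runs until it gets stuck, and the scanner falls back to the last position at which it was in an accepting state. The token set (identifiers, integers, `:`, `:=`, single-character operators) and all names here are assumptions for illustration, not the Pascal scanner from the lecture.

```java
import java.util.*;

class TinyScanner {
    // States: 0 start, 1 identifier, 2 integer, 3 ':', 4 ':=', 5 operator, -1 dead.
    static int step(int state, char c) {
        switch (state) {
            case 0:
                if (Character.isLetter(c)) return 1;
                if (Character.isDigit(c)) return 2;
                if (c == ':') return 3;
                if ("+-*/()".indexOf(c) >= 0) return 5;
                return -1;
            case 1: return Character.isLetterOrDigit(c) ? 1 : -1;
            case 2: return Character.isDigit(c) ? 2 : -1;
            case 3: return c == '=' ? 4 : -1;
            default: return -1;
        }
    }

    // Token kind for accepting states, null for non-accepting ones.
    static String kind(int state) {
        switch (state) {
            case 1: return "id";
            case 2: return "int";
            case 3: return "colon";
            case 4: return "assign";
            case 5: return "op";
            default: return null;
        }
    }

    static List<String> scan(String src) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < src.length()) {
            if (Character.isWhitespace(src.charAt(pos))) { pos++; continue; }
            int state = 0, lastEnd = -1;
            String lastKind = null;
            // Run the automaton as far as possible, remembering the last accept.
            for (int i = pos; i < src.length() && state != -1; i++) {
                state = step(state, src.charAt(i));
                if (kind(state) != null) { lastEnd = i + 1; lastKind = kind(state); }
            }
            if (lastEnd < 0) {
                throw new IllegalArgumentException("unexpected character at " + pos);
            }
            tokens.add(lastKind + "(" + src.substring(pos, lastEnd) + ")");
            pos = lastEnd;
        }
        return tokens;
    }
}
```

With this design `:=` is returned as one `assign` token rather than a `colon` followed by an error, because the scanner keeps going after the first accepting state. A table-and-driver version would replace `step` with a two-dimensional array lookup and keep the same driver loop.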

Scanner for Pascal

Scanner for Pascal (case statements)

Scanner Generators
- Start with a regular expression.
- Construct an NFA from it.
- Use a set of subsets construction to obtain an equivalent DFA.
- Construct the minimal equivalent DFA.