Regular Expressions 15-211 Fundamental Data Structures and Algorithms Peter Lee March 13, 2003.

Presentation transcript:

Regular Expressions Fundamental Data Structures and Algorithms Peter Lee March 13, 2003

Announcements  Homework #4 is due on Monday!  Monday, March 17, 11:59pm  Reading:  Handout (from last time)

Recap: FSMs

Finite State Machines (FSMs)  Input string → M → {Yes, No}.  M = (Σ, S, q0, F, δ), where Σ is the input alphabet, S the state set, q0 the initial state, F the set of final states, and δ the transition function.

Transition functions  A deterministic finite automaton (DFA).  Can extend δ: S × Σ → S to δ': S × Σ* → S.  Inductively: δ'(q, ε) = q and δ'(q, aw) = δ'(δ(q, a), w).

DFA example  Which strings of a's and b's are accepted?  Transition function δ: { (q0,a) → q1, (q0,b) → q0, (q1,a) → q2, (q1,b) → q1, (q2,a) → q2, (q2,b) → q2 }  [State diagram of the DFA, with edges labeled a and b.]
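The extended transition function δ' can be sketched in a few lines of Python. This is only a sketch of the example DFA: the slide text does not say which states are final, so `FINAL = {"q2"}` is an assumption made for illustration, and the names `DELTA`, `delta_star`, and `accepts` are hypothetical.

```python
# Transition function of the example DFA, keyed by (state, symbol).
DELTA = {
    ("q0", "a"): "q1", ("q0", "b"): "q0",
    ("q1", "a"): "q2", ("q1", "b"): "q1",
    ("q2", "a"): "q2", ("q2", "b"): "q2",
}
START = "q0"
FINAL = {"q2"}  # ASSUMPTION: the final states are not given in the slide text


def delta_star(state, word):
    """Extended transition function: apply DELTA once per input symbol."""
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state


def accepts(word):
    """A word is accepted if the machine stops in a final state."""
    return delta_star(START, word) in FINAL
```

Under the assumed final state, `accepts("abba")` holds and `accepts("bbb")` does not. The loop does one table lookup per input symbol.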

Nondeterministic FSMs (NFAs)  NFAs can transition to more than one state on any input: δ: S × Σ → P(S).  As before, can extend: δ': S × Σ* → P(S).  Inductively: δ'(q, ε) = {q} and δ'(q, aw) = ∪ over p ∈ δ(q, a) of δ'(p, w).

NFA example  [State diagram of the NFA with states 0 and 1.]  Transition function δ: { (q0,a) → {q0,q1}, (q0,b) → {q1}, (q1,a) → ∅, (q1,b) → {q0,q1} }
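The inductive definition of δ' for NFAs translates almost literally into a recursive function. The sketch below uses the example transition function; the names `NFA_DELTA` and `nfa_delta_star` are hypothetical.

```python
# NFA transition function: (state, symbol) -> set of possible next states.
NFA_DELTA = {
    ("q0", "a"): {"q0", "q1"}, ("q0", "b"): {"q1"},
    ("q1", "a"): set(),        ("q1", "b"): {"q0", "q1"},
}


def nfa_delta_star(state, word):
    """delta'(q, eps) = {q}; delta'(q, aw) = union of delta'(p, w) for p in delta(q, a)."""
    if word == "":
        return {state}
    result = set()
    for p in NFA_DELTA[(state, word[0])]:
        result |= nfa_delta_star(p, word[1:])
    return result
```

This direct recursion can revisit the same (state, suffix) pair many times. Tracking the current set of states instead avoids that, which is the idea behind the NFA-to-DFA construction.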

Questions  1. Are there languages L that can be accepted by NFAs but not DFAs? No! Today: the proof.  2. What practical uses are there for FSMs? After the proof…

The Idea  An NFA can be in more than one state at a time  Define a DFA whose states correspond with all combinations of the NFA states

Another handy extension  Extend δ: S × Σ → P(S) to δ': S × Σ* → P(S) to δ'': P(S) × Σ* → P(S), where δ''({q1,…,qn}, w) = ∪ over 1 ≤ i ≤ n of δ'(qi, w).

NFA into a DFA example  NFA: [state diagram of the NFA with states 0 and 1.]  In the DFA, construct these states: S = {[], [q0], [q1], [q0,q1]}.  Each state in the DFA represents a set of states in the NFA.

NFA into a DFA example  DFA: S = {[], [q0], [q1], [q0,q1]}.  What is δ for the DFA?  δ([],a) = [] and δ([],b) = [];  δ([q0],a) = [q0,q1];  δ([q0],b) = [q1];  δ([q1],a) = [];  δ([q1],b) = [q0,q1];  δ([q0,q1],a) = [q0,q1];  δ([q0,q1],b) = [q0,q1].  [State diagram of the resulting DFA.]
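The table for the DFA can be computed mechanically: the target of a set of NFA states on a symbol is the union of the targets of its members. The sketch below rebuilds the table for the example NFA; `subset_construction` and the other names are hypothetical, and `frozenset` stands in for the bracket notation [q0,q1].

```python
from itertools import combinations

NFA_STATES = ["q0", "q1"]
ALPHABET = ["a", "b"]
NFA_DELTA = {
    ("q0", "a"): {"q0", "q1"}, ("q0", "b"): {"q1"},
    ("q1", "a"): set(),        ("q1", "b"): {"q0", "q1"},
}


def subset_construction(states, alphabet, delta):
    """Build the DFA transition table over all subsets of the NFA states."""
    subsets = [frozenset(c)
               for r in range(len(states) + 1)
               for c in combinations(states, r)]
    dfa_delta = {}
    for subset in subsets:
        for symbol in alphabet:
            target = set()
            for q in subset:  # union of the NFA moves of every member state
                target |= delta[(q, symbol)]
            dfa_delta[(subset, symbol)] = frozenset(target)
    return subsets, dfa_delta
```

Enumerating every subset mirrors the slide but is exponential in the number of NFA states. Generating only the subsets reachable from the start state is a common refinement.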

The theorem  Thm: Let L be a language accepted by an NFA. Then there exists a DFA that also accepts L.  Proof:  Let’s use the construction shown on the previous slides. We must prove that the DFA accepts the same language as the NFA.

The proof  More formally:  Let M = (Σ, S, q0, F, δ) be the NFA, and M' = (Σ, S', q0', F', δ') be the DFA.  We want to prove that, given any input string w, δ'(q0',w) = [qi,qj,…,qk] iff δ(q0,w) = {qi,qj,…,qk}.

By induction (of course!)  Base case:  Trivial for the empty input string.  Induction hypothesis:  Assume true for all input strings of length n or less.

By induction…  Let wa be a string of length n+1. Then δ'(q0',wa) = δ'(δ'(q0',w),a).  By the IH, δ'(q0',w) = [qi,qj,…,qk] iff δ(q0,w) = {qi,qj,…,qk}.  And by definition of δ', δ'([qi,qj,…,qk],a) = [qa,qb,…,qc] iff δ({qi,qj,…,qk},a) = {qa,qb,…,qc}.  Thus, δ'(q0',wa) = [qa,qb,…,qc] iff δ(q0,wa) = {qa,qb,…,qc}. ∎

Regular Languages

Regular languages  The language accepted by M: L(M) = {w | δ'(q0,w) ∈ F}.  Can also say:  The language recognized by M.  The language decided by M.  When M is an FSM, we say that the language is regular.

Another question  Is the complement of a regular language also regular? L' = Σ* − L.  Hint 1: Is there a way to construct a complement machine?  Hint 2: Consider the final states…
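One way to follow the hints is to keep a DFA's transitions and swap which states count as final. The sketch below tries this on a hypothetical two-state DFA (not from the slides) that accepts the strings over {a, b} ending in a. It relies on the machine being deterministic with a transition for every state and symbol.

```python
# Hypothetical DFA: state "t" means "the last symbol read was a".
STATES = {"s", "t"}
DELTA = {
    ("s", "a"): "t", ("s", "b"): "s",
    ("t", "a"): "t", ("t", "b"): "s",
}
START = "s"
FINALS = {"t"}

# Complement machine: same states and transitions, final states flipped.
COMPLEMENT_FINALS = STATES - FINALS


def accepts(word, finals):
    """Run the DFA on word and test membership of the end state in finals."""
    state = START
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state in finals
```

Every word ends in exactly one state, so it is accepted by exactly one of the two machines. The same swap applied to an NFA does not in general give the complement.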

Closure properties  What about union?  Intersection?  Product?
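For union and intersection, one standard approach is to run two DFAs side by side on pairs of states. The sketch below is an illustration under assumptions: both example machines are hypothetical, and `pair_construction` is a name chosen here. Passing `operator.or_` gives union and `operator.and_` gives intersection.

```python
import operator

ALPHABET = "ab"

# Each machine is (states, delta, start, finals). Both are hypothetical examples.
ENDS_IN_A = ({"s", "t"},
             {("s", "a"): "t", ("s", "b"): "s", ("t", "a"): "t", ("t", "b"): "s"},
             "s", {"t"})
EVEN_BS = ({"e", "o"},
           {("e", "a"): "e", ("e", "b"): "o", ("o", "a"): "o", ("o", "b"): "e"},
           "e", {"e"})


def pair_construction(m1, m2, alphabet, combine):
    """Build a DFA whose states are pairs (p, q) tracking both machines at once."""
    states1, delta1, start1, finals1 = m1
    states2, delta2, start2, finals2 = m2
    states = {(p, q) for p in states1 for q in states2}
    delta = {((p, q), c): (delta1[(p, c)], delta2[(q, c)])
             for (p, q) in states for c in alphabet}
    finals = {(p, q) for (p, q) in states
              if combine(p in finals1, q in finals2)}
    return states, delta, (start1, start2), finals


def accepts(machine, word):
    """Run a DFA given as (states, delta, start, finals)."""
    _states, delta, state, finals = machine
    for c in word:
        state = delta[(state, c)]
    return state in finals


UNION = pair_construction(ENDS_IN_A, EVEN_BS, ALPHABET, operator.or_)
INTERSECTION = pair_construction(ENDS_IN_A, EVEN_BS, ALPHABET, operator.and_)
```

The pair machine has |S1| × |S2| states, so the result is still a finite machine.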

A Digression

Cheating vs Collaboration

A scenario  Alice and Bob are excellent students.  There is virtually no doubt that they can easily do “A” work in  But even so, is a lot of work.  And the time required might be better spent in another course, which is harder, and possibly more important.

A scenario, cont’d  So, to save time, Alice and Bob decide to work together on the homeworks.  They work together and hand in essentially the same programs.  Alice writes a comment into her version of the code, explaining that she has collaborated with Bob.  Bob does not do this.

A scenario, cont’d  Did Alice cheat?  What about Bob?

A second scenario  Bob works very hard on his assignment  He gets everything working and hands it in 3 days early  He then discusses his solution with Alice  After discussing with Alice, Bob realizes that his solution is O(n^2), whereas the best solutions are O(n log n)

Second scenario, cont’d  Bob uses this new knowledge and rewrites his assignment so that it runs in O(n log n) time, and hands it in  Later, after further discussion with Alice, he realizes that his code, while acceptably fast, is still written poorly

Second scenario, cont’d  Bob has learned a lot already, but is concerned that his grade will not reflect his state of knowledge  Bob thus copies Alice’s code, makes some minor modifications, and hands it in  What has happened here?

Regular Expressions

 A regular language can always be described using a regular expression.  Examples  (01)*  00    (a|b)*ab  this|that|theother  0*1*2*  01*|0 = 01*  00*11*22* =  (1|0)*00(0|1)*
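Several of these examples can be tried directly in Python, whose `re` module uses the same `|` and `*` notation. This is a sketch for experimentation only; `matches` and `same_language` are hypothetical helper names, and the equality check is a brute-force test over short strings rather than a proof.

```python
import re
from itertools import product


def matches(pattern, text):
    """True if the whole of text is in the language of pattern."""
    return re.fullmatch(pattern, text) is not None


def same_language(p1, p2, alphabet, max_len):
    """Check that two patterns agree on every string up to max_len symbols."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            word = "".join(chars)
            if matches(p1, word) != matches(p2, word):
                return False
    return True
```

For instance, `matches("(a|b)*ab", "bbab")` holds, and `same_language("01*|0", "01*", "01", 6)` agrees with the identity 01*|0 = 01* on all strings of length at most 6.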

More examples  [.?!][\]\"')]*($|\t|  )[ \t\n]*  [.?!][]"')]*($| |)[ ]*  Emacs regexp: any of . ? ! followed by zero or more of ] " ' ) followed by any of end-of-line, tab, two spaces, followed by zero or more of space, tab, newline.  [Demo of emacs, sed, grep…]
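The sentence-end pattern described here can be approximated in Python for experimentation. This is a translation into Python's `re` syntax, not the Emacs original: `re.MULTILINE` is an assumption used so that `$` matches at the end of each line, and `SENTENCE_END` and `sentence_ends` are hypothetical names.

```python
import re

# Any of . ? !  then zero or more of ] " ' )  then end-of-line, tab, or two
# spaces, then zero or more of space, tab, newline.
SENTENCE_END = re.compile(r"""[.?!][\]"')]*($|\t|  )[ \t\n]*""", re.MULTILINE)


def sentence_ends(text):
    """Return the index just past each sentence end, including trailing whitespace."""
    return [m.end() for m in SENTENCE_END.finditer(text)]
```

A period followed by a single space, as in "e.g. this", is not treated as a sentence end, which is the purpose of the two-space alternative.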

Regular Expressions  Inductive definition. Let Σ = {a,b}.  ∅ is a regular expression, with L_∅ = {}.  ε is a regular expression, with L_ε = {ε}.  a is a regular expression, with L_a = {a}.  R+S is a regular expression if R and S are, with L_{R+S} = L_R ∪ L_S.  RS is a regular expression if R and S are, with L_{RS} = {uv | u ∈ L_R and v ∈ L_S}.  R* is a regular expression if R is, with L_{R*} = ∪ over i ≥ 0 of (L_R)^i.  Each case comes with a machine. Invariant: every machine must have exactly one final state; add a new final state with ε-transitions from the old final states if necessary.  [Diagrams: the machine for a; machines R and S combined with ε-transitions for R+S, for RS, and for R*.]
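The inductive definition suggests building a machine for each case, with ε-transitions gluing sub-machines together and exactly one final state per machine. The sketch below follows the standard Thompson-style construction, which may differ in detail from the diagrams on the slides; all names (`Machine`, `symbol`, `union`, `concat`, `star`, `accepts`) are hypothetical.

```python
from itertools import count

_fresh = count()   # source of fresh state numbers
EPS = None         # label used for epsilon-transitions


class Machine:
    """NFA with one start state and exactly one final state."""
    def __init__(self, start, final, edges):
        self.start, self.final, self.edges = start, final, edges  # edges: (src, label, dst)


def empty():                      # language {}
    return Machine(next(_fresh), next(_fresh), [])

def epsilon():                    # language {eps}
    s, f = next(_fresh), next(_fresh)
    return Machine(s, f, [(s, EPS, f)])

def symbol(a):                    # language {a}
    s, f = next(_fresh), next(_fresh)
    return Machine(s, f, [(s, a, f)])

def union(r, s):                  # L_R union L_S: new start and new single final state
    start, final = next(_fresh), next(_fresh)
    glue = [(start, EPS, r.start), (start, EPS, s.start),
            (r.final, EPS, final), (s.final, EPS, final)]
    return Machine(start, final, r.edges + s.edges + glue)

def concat(r, s):                 # L_R followed by L_S
    return Machine(r.start, s.final, r.edges + s.edges + [(r.final, EPS, s.start)])

def star(r):                      # zero or more repetitions of L_R
    start, final = next(_fresh), next(_fresh)
    glue = [(start, EPS, r.start), (r.final, EPS, final),
            (start, EPS, final), (r.final, EPS, r.start)]
    return Machine(start, final, r.edges + glue)


def _closure(m, states):
    """All states reachable from states using epsilon-transitions only."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for src, label, dst in m.edges:
            if src == q and label is EPS and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def accepts(m, word):
    """Simulate the NFA by tracking the set of states it could be in."""
    current = _closure(m, {m.start})
    for c in word:
        moved = {dst for src, label, dst in m.edges if src in current and label == c}
        current = _closure(m, moved)
    return m.final in current
```

For example, (a+b)*ab can be built as `concat(star(union(symbol("a"), symbol("b"))), concat(symbol("a"), symbol("b")))`. Each sub-machine must be built fresh, since reusing one `Machine` object twice in an expression would share its states.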

Regular Expressions  The language described by a regular expression can be accepted by an FSM: RE → NFA, NFA → DFA.  A regular language can always be described using a regular expression: DFA → RE.

Regular Expressions  Membership in a regular language can be tested in time linear in the size of the input string.

Building FSMs  An FSM is a directed graph  How large is the input alphabet?  How many states?  How fast must it run?  How to get the lowest constant factor?  How to minimize space?  Representations:  Matrix  Array of lists  Hashtable  Overlapping hashtable  Switch statement
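Two of the listed representations, the matrix and the hashtable, can be compared on the three-state DFA from the earlier example. This is only a sketch: states are renumbered 0 to 2, and `SYMBOLS`, `MATRIX`, `TABLE`, `run_matrix`, and `run_table` are hypothetical names.

```python
SYMBOLS = {"a": 0, "b": 1}   # column index for each input symbol

# Matrix: MATRIX[state][column] is the next state. One row per state.
MATRIX = [
    [1, 0],   # from state 0: a -> 1, b -> 0
    [2, 1],   # from state 1: a -> 2, b -> 1
    [2, 2],   # from state 2: a -> 2, b -> 2
]

# Hashtable: one entry per (state, symbol) pair that has a transition.
TABLE = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 2, (1, "b"): 1,
    (2, "a"): 2, (2, "b"): 2,
}


def run_matrix(word, state=0):
    """Follow the matrix, one indexed lookup per input symbol."""
    for c in word:
        state = MATRIX[state][SYMBOLS[c]]
    return state


def run_table(word, state=0):
    """Follow the hashtable, one hashed lookup per input symbol."""
    for c in word:
        state = TABLE[(state, c)]
    return state
```

The matrix needs |S| × |Σ| cells whether or not they are used, while the hashtable stores only the transitions that exist. That difference matters for a large alphabet with sparse transitions.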

Manipulating FSMs  Eliminate unreachable states  Transform NFA into DFA  Transform NFA with ε-transitions into NFA without  Minimize DFA  Create FSM from regular expression  Create regular expression from FSM  Test equivalence of FSMs  Test emptiness of FSM language
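Two of these operations, eliminating unreachable states and testing emptiness, reduce to a graph search from the start state. The sketch below assumes a DFA stored as a dict from (state, symbol) to state; the function names and the example machine are hypothetical.

```python
from collections import deque


def reachable(delta, start):
    """Breadth-first search for the states reachable from start."""
    seen, queue = {start}, deque([start])
    while queue:
        q = queue.popleft()
        for (src, _symbol), dst in delta.items():
            if src == q and dst not in seen:
                seen.add(dst)
                queue.append(dst)
    return seen


def is_empty(delta, start, finals):
    """The language is empty exactly when no final state is reachable."""
    return reachable(delta, start).isdisjoint(finals)


def trim(delta, start):
    """Drop the transitions that leave unreachable states."""
    keep = reachable(delta, start)
    return {(q, c): t for (q, c), t in delta.items() if q in keep}


# Hypothetical DFA in which state "u" can never be reached from "s".
EXAMPLE = {
    ("s", "a"): "t", ("s", "b"): "s",
    ("t", "a"): "t", ("t", "b"): "s",
    ("u", "a"): "t", ("u", "b"): "u",
}
```

Scanning every transition for each visited state keeps the sketch short. An adjacency-list representation would make the search linear in the size of the graph.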