CSC 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong Regular.

Slides:



Advertisements
Similar presentations
CSC 361NFA vs. DFA1. CSC 361NFA vs. DFA2 NFAs vs. DFAs NFAs can be constructed from DFAs using transitions: Called NFA- Suppose M 1 accepts L 1, M 2 accepts.
Advertisements

FORMAL LANGUAGES, AUTOMATA, AND COMPUTABILITY
1 Introduction to Computability Theory Lecture3: Regular Expressions Prof. Amos Israeli.
1 Introduction to Computability Theory Lecture4: Regular Expressions Prof. Amos Israeli.
1 Introduction to Computability Theory Lecture3: Regular Expressions Prof. Amos Israeli.
1 Regular Expressions Highlights: –A regular expression is used to specify a language, and it does so precisely. –Regular expressions are very intuitive.
1 FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY (For next time: Read Chapter 1.3 of the book)
Fall 2006Costas Busch - RPI1 Regular Expressions.
Tutorial CSC3130 : Formal Languages and Automata Theory Tu Shikui ( ) SHB 905, Office hour: Thursday 2:30pm-3:30pm
1 Regular Expressions. 2 Regular expressions describe regular languages Example: describes the language.
Lecture 3: Closure Properties & Regular Expressions Jim Hook Tim Sheard Portland State University.
1 Single Final State for NFAs and DFAs. 2 Observation Any Finite Automaton (NFA or DFA) can be converted to an equivalent NFA with a single final state.
Lecture 7 Sept 22, 2011 Goals: closure properties regular expressions.
CSC 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong Nondeterminism.
CS5371 Theory of Computation Lecture 6: Automata Theory IV (Regular Expression = NFA = DFA)
Lecture 3 Goals: Formal definition of NFA, acceptance of a string by an NFA, computation tree associated with a string. Algorithm to convert an NFA to.
Fall 2006Costas Busch - RPI1 Non-Deterministic Finite Automata.
CS5371 Theory of Computation Lecture 4: Automata Theory II (DFA = NFA, Regular Language)
1 Regular Languages Finite Automata eg. Supermarket automatic door: exit or entrance.
Fall 2004COMP 3351 Another NFA Example. Fall 2004COMP 3352 Language accepted (redundant state)
Costas Busch - LSU1 Non-Deterministic Finite Automata.
FORMAL LANGUAGES, AUTOMATA AND COMPUTABILITY
1 A Single Final State for Finite Accepters. 2 Observation Any Finite Accepter (NFA or DFA) can be converted to an equivalent NFA with a single final.
Fall 2004COMP 3351 Regular Expressions. Fall 2004COMP 3352 Regular Expressions Regular expressions describe regular languages Example: describes the language.
Formal Language Finite set of alphabets Σ: e.g., {0, 1}, {a, b, c}, { ‘{‘, ‘}’ } Language L is a subset of strings on Σ, e.g., {00, 110, 01} a finite language,
CSC 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong DFA to regular.
1 Unit 1: Automata Theory and Formal Languages Readings 1, 2.2, 2.3.
Theory of Computation, Feodor F. Dragan, Kent State University 1 Regular expressions: definition An algebraic equivalent to finite automata. We can build.
Lecture # 3 Chapter #3: Lexical Analysis. Role of Lexical Analyzer It is the first phase of compiler Its main task is to read the input characters and.
CSC 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong Closure.
CSC 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong NFA to DFA.
Regular Expressions Hopcroft, Motawi, Ullman, Chap 3.
Prof. Busch - LSU1 NFAs accept the Regular Languages.
Regular Expressions and Languages A regular expression is a notation to represent languages, i.e. a set of strings, where the set is either finite or contains.
CHAPTER 1 Regular Languages
Lecture # 12. Nondeterministic Finite Automaton (NFA) Definition: An NFA is a TG with a unique start state and a property of having single letter as label.
CMSC 330: Organization of Programming Languages Theory of Regular Expressions Finite Automata.
INHERENT LIMITATIONS OF COMPUTER PROGAMS CSci 4011.
CSCI 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong NFA to.
UNIT - I Formal Language and Regular Expressions: Languages Definition regular expressions Regular sets identity rules. Finite Automata: DFA NFA NFA with.
Recap: Transformation NFA  DFA  s s1s1... snsn p1p1 p2p2... pmpm >...  p1p1  p2p2  pipi s e s1s1 e s2s2 e sisi >
CSC 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong Pushdown.
CSC 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong Pushdown.
CSCI 4325 / 6339 Theory of Computation Zhixiang Chen.
Conversions Regular Expression to FA FA to Regular Expression.
Regular Expressions CS 130: Theory of Computation HMU textbook, Chapter 3.
1 Chapter 3 Regular Languages.  2 3.1: Regular Expressions (1)   Regular Expression (RE):   E is a regular expression over  if E is one of:
 2004 SDU Lecture4 Regular Expressions.  2004 SDU 2 Regular expressions A third way to view regular languages. Say that R is a regular expression if.
1 Introduction to the Theory of Computation Regular Expressions.
CSCI 2670 Introduction to Theory of Computing September 11, 2007.
Regular Languages Chapter 1 Giorgi Japaridze Theory of Computability.
Complexity and Computability Theory I Lecture #5 Rina Zviel-Girshin Leah Epstein Winter
CS412/413 Introduction to Compilers Radu Rugina Lecture 3: Finite Automata 25 Jan 02.
1/29/02CSE460 - MSU1 Nondeterminism-NFA Section 4.1 of Martin Textbook CSE460 – Computability & Formal Language Theory Comp. Science & Engineering Michigan.
Chapter 2 Regular Languages & Finite Automata. Regular Expressions A finitary denotation of a regular language over . ØL Ø = Ø aL a = {a} where a ∈ 
Nondeterminism The Chinese University of Hong Kong Fall 2011
Languages.
Non Deterministic Automata
Complexity and Computability Theory I
Lecture 9 Theory of AUTOMATA
REGULAR LANGUAGES AND REGULAR GRAMMARS
Non-deterministic Finite Automata (NFA)
More on DFA minimization and DFA equivalence
Non-Deterministic Finite Automata
NFAs, DFAs, and regular expressions
NFA to DFA conversion and regular expressions
Chapter 1 Regular Language
CSCI 2670 Introduction to Theory of Computing
CSC312 Automata Theory Kleene’s Theorem Lecture # 12
Nondeterminism The Chinese University of Hong Kong Fall 2010
Presentation transcript:

CSC 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong Regular expressions Fall 2008

Operations on strings Given two strings s = a 1 …a n and t = b 1 …b m, we define their concatenation st = a 1 …a n b 1 …b m We define s n as the concatenation ss…s n times s = abb, t = cbast = abbcba s = 011s 3 =

Operations on languages The concatenation of languages L 1 and L 2 is Similarly, we write L n for LL…L ( n times) The union of languages L 1  L 2 is the set of all strings that are in L 1 or in L 2 Example: L 1 = {01, 0}, L 2 = { , 1, 11, 111, …}. What is L 1 L 2 and L 1  L 2 ? L 1 L 2 = {st: s  L 1, t  L 2 }

Operations on languages The star (Kleene closure) of L are all strings made up of zero or more chunks from L : –This is always infinite, and always contains  Example: L 1 = {01, 0}, L 2 = { , 1, 11, 111, …}. What is L 1 * and L 2 * ? L * = L 0  L 1  L 2  …

Constructing languages with operations Let’s fix an alphabet, say  = {0, 1} We can construct languages by starting with simple ones, like {0}, {1} and combining them {0}({0}  {1})* all strings that start with 0 ({0}{1}*)  ({1}{0}*) 0(0+1)* 01*+10*

Regular expressions A regular expression over  is an expression formed using the following rules: –The symbol  is a regular expression –The symbol  is a regular expression –For every a  , the symbol a is a regular expression –If R and S are regular expressions, so are RS, R+S and R*. Definition of regular language A language is regular if it is represented by a regular expression

Examples 1.01* = {0, 01, 011, 0111, …..} 2.(01*)(01) = {001, 0101, 01101, , …..} 3.(0+1)* 4.(0+1)*01(0+1)* 5.((0+1)(0+1)+(0+1)(0+1)(0+1))* 6.((0+1)(0+1))*+((0+1)(0+1)(0+1))* 7.( )*(  +0+00)

Examples Construct a RE over  = {0,1} that represents –All strings that have two consecutive 0 s. –All strings except those with two consecutive 0 s. –All strings with an even number of 0 s. (0+1)*00(0+1)* (1*01)*1* + (1*01)*1*0 (1*01*01*)*

Main theorem for regular languages Theorem A language is regular if and only if it is the language of some DFA DFA NFA regular expression regular languages

Proof plan For every regular expression, we have to give a DFA for the same language For every DFA, we give a regular expression for the same language  NFA regular expression NFADFA

What is an  NFA? An  NFA is an extension of NFA where some transitions can be labeled by  –Formally, the transition function of an  NFA is a function  : Q × (  {  }) → subsets of Q The automaton is allowed to follow  -transitions without consuming an input symbol

Example of  NFA q0q0 q1q1 q2q2 ,b a a   = {a, b} Which of the following is accepted by this  NFA: –aab, bab, ab, bb, a, 

M2M2 Examples: regular expression →  NFA R 1 = 0 R 2 = R 3 = (0 + 1)* q0q0 q1q1 0 q0q0 q1q1    q2q2 q3q3 0 q4q4 q5q5 1 q’ 0 q’ 1  M2M2   

General method regular expr  NFA  q0q0  q0q0 symbol a q0q0 q1q1 a RS q0q0 q1q1  MRMR MSMS 

Convention When we draw a box around an  NFA: –The arrow going in points to the start state –The arrow going out represents all transitions going out of accepting states –None of the states inside the box is accepting –The labels of the states inside the box are distinct from all other states in the diagram

General method continued regular expr  NFA R + S q0q0 q1q1  MRMR MSMS   R*R* q0q0 q1q1  MRMR   

Road map  NFA regular expression NFA DFA

Example of  NFA to NFA conversion q0q0 q1q1 q2q2 ,b a a   NFA: Transition table of corresponding NFA: states inputs a b q0q0 q1q1 q2q2 {q 1, q 2 } {q 0, q 1, q 2 }    Accepting states of NFA: {q 0, q 1, q 2 }

Example of  NFA to NFA conversion q0q0 q1q1 q2q2 ,b a a   NFA: NFA: q0q0 q1q1 q2q2 a, b a a a a

General method To convert an  NFA to an NFA: –States stay the same –Start state stays the same –The NFA has a transition from q i to q j labeled a iff the  NFA has a path from q i to q j that contains one transition labeled a and all other transitions labeled  –The accepting states of the NFA are all states that can reach some accepting state of  NFA using only  -transitions

Why the conversion works In the original  -NFA, when given input a 1 a 2 …a n the automaton goes through a sequence of states: q 0  q 1  q 2  …  q m Some  -transitions may be in the sequence: q 0 ...  q i 1 ...  q i 2  …  q i n In the new NFA, each sequence of states of the form: q i k ...  q i k+1 will be represented by a single transition q i k  q i k+1 because of the way we construct the NFA.  a1a1 a2a2  a k+1

Proof that the conversion works More formally, we have the following invariant for any k ≥ 1 : We prove this by induction on k When k = 0, the  NFA can be in more states, while the NFA must be in q 0 After reading k input symbols, the set of states that the  NFA and NFA can be in are exactly the same

Proof that the conversion works When k ≥ 1 (input is not the empty string) –If  NFA is in an accepting state, so is NFA –Conversely, if NFA is an accepting state q i, then some accepting state of  NFA is reachable from q i, so  NFA accepts also When k = 0 (input is the empty string) –The  NFA accepts iff one of its accepting states is reachable from q 0 –This is true iff q 0 is an accepting state of the NFA

From DFA to regular expressions  NFA regular expression NFA DFA

Example Construct a regular expression for this DFA: q1q1 q2q2 (0 + 1)*0 + 

General method We have a DFA M with states q 1, q 2,… q n We will inductively define regular expressions R ij k R ij k will be the set of all strings that take M from q i to q j with intermediate states going through q 1, q 2,… or q k only.

Example q1q1 q2q2 R 11 0 = { , 0} =  + 0 R 12 0 = {1} = 1 R 22 0 = { , 1} =  + 1 R 11 1 = { , 0, 00, 000,...}= 0* R 12 1 = {1, 01, 001, 0001,...}= 0*1

General construction We inductively define R ij k as: R ii 0 = a i 1 + a i 2 + … + a i t +  (all loops around q i and  ) (all q i → q j ) R ij k = R ij k-1 + R ik k-1 (R kk k-1 )*R kj k-1 a path in M qiqi qkqk qjqj R ij 0 = a i 1 + a i 2 + … + a i t if i ≠ j a i 1,a i 2,…,a i t qiqi qiqi qjqj (for k > 0 )

Informal proof of correctness Each execution of the DFA using states q 1, q 2,… q k will look like this: q i → … → q k → … → q k → … → q k → … → q j intermediate parts use only states q 1, q 2,… q k-1 R ik k-1 (R kk k-1 )* R kj k-1 R ij k-1 + state q k is never visited or

Final step Suppose the DFA start state is q 1, and the accepting states are F = {q j 1  q j 2 …  q j t } Then the regular expression for this DFA is R 1j 1 n + R 1j 2 n + ….. + R 1j t n

All models are equivalent  NFA regular expression NFA DFA A language is regular iff it is accepted by a DFA, NFA,  NFA, or regular expression

Example Give a RE for the following DFA using this method: q0q0 q1q1