CS 203: Introduction to Formal Languages and Automata


CS 203: Introduction to Formal Languages and Automata Chapter 4 Properties of Regular Languages These class notes are based on material from our textbook, An Introduction to Formal Languages and Automata, by Peter Linz, published by Jones and Bartlett Publishers, Inc., Sudbury, MA.

Properties of regular languages What happens when we perform operations on regular languages? E.g., if we concatenate two regular languages, is the resulting language also regular? Can we decide whether a given language has a certain property or not? E.g., Can we tell if a certain language is finite or not? Can we tell whether a given language is regular or not?

Closure properties of regular languages Definition: A regular language is any language that is accepted by a finite automaton. Theorem 4.1: The class of regular languages is closed under the following operations (that is, performing these operations on regular languages produces other regular languages): union, concatenation, Kleene star, complement, and intersection.

Closure for union, concatenation, and Kleene star If L1 and L2 are regular languages, then there exist regular expressions r1 and r2 such that L1 = L(r1) and L2 = L(r2). By definition 3.1.2 in our text, r1 + r2, r1r2, and r1* are regular expressions, and: L1 ∪ L2 = L(r1 + r2), L1L2 = L(r1r2), and L1* = L(r1*).

Closure for union, concatenation, and Kleene star Since languages represented by regular expressions are by definition regular, performing the operations of union, concatenation, and star-closure on regular languages produces regular languages. We say that the class of regular languages is closed under union, concatenation, and Kleene star (star-closure).

So: The null language ∅ is regular. The language consisting of the empty string, {λ}, is regular. For each a in Σ, {a} is regular. If L1 and L2 are regular: L1 ∪ L2 is regular, L1L2 is regular, and L1* is regular.

Unions, Intersections, and Complements: Theorem 4.1, p. 100 Suppose that M1 = (Q1, Σ, δ1, q1, F1) accepts language L1, and M2 = (Q2, Σ, δ2, q2, F2) accepts language L2. Let M be an FA defined by M = (Q, Σ, δ, q0, F), where Q = Q1 × Q2, q0 = (q1, q2), and the transition function δ is defined by: δ((p, q), a) = (δ1(p, a), δ2(q, a)), for any p ∈ Q1, q ∈ Q2, and a ∈ Σ.

Unions, Intersections, and Difference: Theorem 4.1, p. 100 Then: If F = {(p, q) | p ∈ F1 or q ∈ F2}, M accepts the language L1 ∪ L2. If F = {(p, q) | p ∈ F1 and q ∈ F2}, M accepts the language L1 ∩ L2. If F = {(p, q) | p ∈ F1 and q ∉ F2}, M accepts the language L1 − L2.
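The product construction can be sketched in a few lines of Python. This is only an illustration: the dictionary layout for a DFA, the helper names product_dfa and accepts, and the two example machines EVEN_A and ENDS_B are assumptions made here, not part of the textbook. Both input DFAs are assumed to be complete (a transition for every state and symbol) and to share one alphabet.

```python
from itertools import product

def product_dfa(d1, d2, mode):
    """Build the product DFA M from complete DFAs d1 and d2.

    mode selects the final states: "union", "intersection", or "difference".
    """
    states = set(product(d1["states"], d2["states"]))      # Q = Q1 x Q2
    delta = {
        ((p, q), a): (d1["delta"][(p, a)], d2["delta"][(q, a)])
        for (p, q) in states
        for a in d1["alphabet"]
    }
    is_final = {
        "union": lambda p, q: p in d1["finals"] or q in d2["finals"],
        "intersection": lambda p, q: p in d1["finals"] and q in d2["finals"],
        "difference": lambda p, q: p in d1["finals"] and q not in d2["finals"],
    }[mode]
    finals = {(p, q) for (p, q) in states if is_final(p, q)}
    return {"states": states, "alphabet": d1["alphabet"], "delta": delta,
            "start": (d1["start"], d2["start"]), "finals": finals}

def accepts(dfa, w):
    """Run the DFA on w and report whether it ends in a final state."""
    state = dfa["start"]
    for a in w:
        state = dfa["delta"][(state, a)]
    return state in dfa["finals"]

# Hypothetical example machines over {a, b}:
# EVEN_A accepts strings with an even number of a's,
# ENDS_B accepts strings whose last symbol is b.
EVEN_A = {"states": {0, 1}, "alphabet": {"a", "b"}, "start": 0, "finals": {0},
          "delta": {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1}}
ENDS_B = {"states": {"n", "y"}, "alphabet": {"a", "b"}, "start": "n", "finals": {"y"},
          "delta": {("n", "a"): "n", ("n", "b"): "y",
                    ("y", "a"): "n", ("y", "b"): "y"}}
```

Only the choice of F changes between the three cases; the state set and transition function are identical, which mirrors the statement of the theorem.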

Theorem 4.1, p. 100 Proof: For any x ∈ Σ* and any (p, q) ∈ Q: δ*((p, q), x) = (δ1*(p, x), δ2*(q, x)). A string x is accepted by M iff δ*((q1, q2), x) ∈ F. By our formula, this is true if and only if (δ1*(q1, x), δ2*(q2, x)) ∈ F.

Theorem 4.1, p. 100 Proof (continued): For Case 1, this is equivalent to saying that δ1*(q1, x) ∈ F1 or δ2*(q2, x) ∈ F2, which is equivalent to x ∈ L1 ∪ L2. Cases 2 and 3 are similar.

Complement Consider the special case in which L1 is all of Σ*. Here, L1 − L2 is actually L2’ (the complement of L2).

Reversal Theorem 4.2: The family of regular languages is closed under reversal. Proof: If L is a regular language, construct an NFA with a single final state that accepts it. Now change the initial vertex into a final vertex, the final vertex into the initial vertex, and reverse the direction on all the edges. The modified NFA accepts w^R if and only if the original NFA accepts w.
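A small Python sketch of this edge-reversal step follows. The NFA encoding (a dictionary mapping (state, symbol) to a set of states), the function names, and the example machine A_PLUS_B are all assumptions for illustration. The sketch assumes exactly one final state and no λ-transitions.

```python
def reverse_nfa(nfa):
    """Swap the start and final states and reverse every edge."""
    (final,) = nfa["finals"]            # assumes exactly one final state
    rdelta = {}
    for (p, a), targets in nfa["delta"].items():
        for q in targets:
            rdelta.setdefault((q, a), set()).add(p)   # edge p -a-> q becomes q -a-> p
    return {"delta": rdelta, "start": final, "finals": {nfa["start"]}}

def nfa_accepts(nfa, w):
    """Simulate the NFA by tracking the set of states it could be in."""
    current = {nfa["start"]}
    for a in w:
        current = set().union(*(nfa["delta"].get((s, a), set()) for s in current))
    return bool(current & nfa["finals"])

# Hypothetical example: strings of one or more a's followed by a single b.
A_PLUS_B = {"start": 0, "finals": {2},
            "delta": {(0, "a"): {1}, (1, "a"): {1}, (1, "b"): {2}}}
REVERSED = reverse_nfa(A_PLUS_B)
```

Here REVERSED accepts a single b followed by one or more a's, which is the reversal of the original language.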

Homomorphism Definition 4.1: A homomorphism is a substitution in which a single letter is replaced with a string. Formally, if Σ and Γ are alphabets, then a function h : Σ → Γ* is a homomorphism. It extends to strings letter by letter: h(a1a2…an) = h(a1)h(a2)…h(an). If L is a language on Σ, then its homomorphic image is: h(L) = {h(w) : w ∈ L}

Homomorphism Theorem 4.3: If L is a regular language, then its homomorphic image h(L) is also regular. Thus the family of regular languages is closed under homomorphism.
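To make the definition concrete, here is a minimal Python sketch that applies a homomorphism to strings and to a finite language. The names apply_hom and image and the particular mapping H are illustrative assumptions; the theorem itself covers infinite regular languages, which this finite-set sketch does not.

```python
def apply_hom(h, w):
    """Apply h letter by letter: h(a1 a2 ... an) = h(a1) h(a2) ... h(an)."""
    return "".join(h[a] for a in w)

def image(h, language):
    """Homomorphic image h(L) of a finite language given as a set of strings."""
    return {apply_hom(h, w) for w in language}

# Hypothetical homomorphism from {a, b} into strings over {a, b, c}.
H = {"a": "ab", "b": "bbc"}
```

With this H, the string aba maps to ab, then bbc, then ab, concatenated in order.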

Right quotient To form the right quotient of L1 with L2, L1/L2, take all strings in L1 that have a suffix belonging to L2 and remove the suffix. Example: L1 = {ab, aab, aaab, aaaab} L2 = {b} L1/L2 = {a, aa, aaa, aaaa}
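For finite languages the right quotient can be computed directly from this description. The sketch below is an assumption-laden illustration (the function name right_quotient is made up here) and works only on finite sets of strings; for infinite regular languages the construction works on automata instead.

```python
def right_quotient(l1, l2):
    """Strings x such that x followed by some suffix s in l2 lies in l1."""
    return {w[:len(w) - len(s)] for w in l1 for s in l2 if w.endswith(s)}

L1 = {"ab", "aab", "aaab", "aaaab"}
L2 = {"b"}
```

Slicing with len(w) - len(s) rather than a negative index keeps the empty suffix case correct, since w[:-0] would return the empty string.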

Right quotient Theorem 4.4: If L1 and L2 are regular languages, then L1/L2 is also regular. Thus the family of regular languages is closed under right quotient with another regular language. Proof: By construction – see textbook, pp. 106-107.

The membership question Given a language L and a string w, is w ∈ L? A method for answering the membership question is called a membership algorithm. Is there a membership algorithm for regular languages?

The membership question Theorem 4.5: Given a standard representation (i.e., a finite automaton, a regular expression, or a regular grammar) of any regular language L on Σ and w ∈ Σ*, there exists an algorithm for determining whether w is in L. Proof: Here is the algorithm: If the standard representation of L is in the form of a regular expression, or a regular grammar, construct an equivalent FA. Test w to see if it is accepted by the FA.
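The second step of the algorithm, testing w against the FA, is a simple loop. The Python sketch below assumes the FA is a DFA stored as a dictionary from (state, symbol) to state; a missing entry is treated as a move to a non-accepting trap state. The names accepts and DELTA and the example machine (accepting zero or more b's followed by one a) are assumptions for illustration.

```python
def accepts(delta, start, finals, w):
    """Simulate a DFA on w; a missing transition means rejection."""
    state = start
    for symbol in w:
        if (state, symbol) not in delta:
            return False               # implicit trap state
        state = delta[(state, symbol)]
    return state in finals

# Hypothetical DFA for b*a with states q0 (start) and q1 (final).
DELTA = {("q0", "b"): "q0", ("q0", "a"): "q1"}
```

The loop makes one table lookup per input symbol, so the running time is linear in the length of w.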

The finiteness question Theorem 4.6: Given a standard representation (i.e., a finite automaton, a regular expression, or a regular grammar) of any regular language L on Σ, there exists an algorithm for determining whether L is empty, finite, or infinite.

The finiteness question Proof: Here is the algorithm: If the standard representation of L is in the form of a regular expression, or a regular grammar, construct an equivalent FA. If there is a simple path from the initial vertex to any final vertex, then the language is not empty. Find all the vertices that are the base of a cycle. If any of these vertices is on a path from the initial to a final vertex, the language is infinite; otherwise, it is finite.
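The graph search in this algorithm can be sketched as follows. This is one possible reading, not the textbook's code: a state counts as "useful" if it is reachable from the initial state and can reach a final state, and the language is infinite exactly when some useful state lies on a cycle. The function name classify and the dictionary encoding of the DFA are assumptions.

```python
def classify(delta, start, finals):
    """Return "empty", "finite", or "infinite" for the language of a DFA.

    delta maps (state, symbol) -> state and may be partial.
    """
    succ, pred = {}, {}
    for (p, _symbol), q in delta.items():
        succ.setdefault(p, set()).add(q)
        pred.setdefault(q, set()).add(p)

    def reach(sources, edges):
        """All states reachable from sources along edges (sources included)."""
        seen, stack = set(sources), list(sources)
        while stack:
            for nxt in edges.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    # States on some path from the initial state to a final state.
    useful = reach({start}, succ) & reach(set(finals), pred)
    if not useful:
        return "empty"
    # A useful state that can get back to itself is the base of a cycle.
    for s in useful:
        if s in reach(succ.get(s, set()), succ):
            return "infinite"
    return "finite"
```

For example, the two-state machine with a b-loop on q0 and an a-edge from q0 to final state q1 is classified as infinite, because q0 is useful and lies on a cycle.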

The “does L1 = L2” question Theorem 4.7: Given standard representations of two regular languages L1 and L2, there exists an algorithm for determining whether or not L1 = L2.

The “does L1 = L2” question Proof: Here is the algorithm: Define a new language L3 = (L1 ∩ ~L2) ∪ (~L1 ∩ L2). L3 is regular (see previous closure proofs). Therefore, we can find a DFA that accepts L3. Use theorem 4.6 to decide if L3 is empty. L3 = ∅ iff L1 = L2 (exercise 8 in section 1.1 in the Linz textbook). So L1 = L2 if L3 = ∅; otherwise, L1 ≠ L2.
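A compact way to run this test is to explore the product of the two DFAs and look for a reachable pair of states where exactly one component is final; such a pair is a final state of a DFA for L3. The Python sketch below assumes complete DFAs given as (delta, start, finals) tuples over a shared alphabet. The function name equivalent and the three example machines are hypothetical.

```python
def equivalent(d1, d2, alphabet):
    """Decide whether two complete DFAs accept the same language."""
    (delta1, s1, f1), (delta2, s2, f2) = d1, d2
    seen, stack = {(s1, s2)}, [(s1, s2)]
    while stack:
        p, q = stack.pop()
        if (p in f1) != (q in f2):     # (p, q) would be final in a DFA for L3
            return False               # so L3 is not empty
        for a in alphabet:
            nxt = (delta1[(p, a)], delta2[(q, a)])
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return True                        # no final state of L3 is reachable

# Hypothetical machines over {a}: EVEN2 and EVEN4 both accept strings of
# even length; MULT3 accepts strings whose length is a multiple of 3.
EVEN2 = ({(0, "a"): 1, (1, "a"): 0}, 0, {0})
EVEN4 = ({(i, "a"): (i + 1) % 4 for i in range(4)}, 0, {0, 2})
MULT3 = ({(i, "a"): (i + 1) % 3 for i in range(3)}, 0, {0})
```

The search visits at most |Q1| × |Q2| pairs, so it always terminates.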

The pigeonhole principle The “pigeonhole principle” states that if n + 1 items are placed into n pigeonholes, then at least 1 pigeonhole must end up with more than 1 item in it. In set notation: if f : A → B, |A| = n + 1, and |B| = n, then f cannot be one-to-one.

Not all formal languages are regular An automaton that accepts the language L = {a^k b^k | k ≥ 0} must count the number of a’s in each string to make sure there is an identical number of b’s. There is no limit on how high the automaton might need to count to accept a string in this language. But an automaton with finite memory can only count as high as the size of its memory. This is an intuitive argument why this language is not regular. It is not a proof, however. To prove that a language is not regular, we use a mathematical result called the “pumping lemma for regular languages.”

4.3: The Pumping Lemma The Pumping Lemma is used to prove that a language is not regular. How do we prove that a language L is regular? Write a regular expression for it, draw a finite automaton for it, or construct a regular grammar for it.

Pumping Lemma Theorem 4.8: Let L be a regular language. There exists a positive integer m such that for any string w ∈ L with |w| ≥ m, w may be written as w = xyz, for some x, y, and z satisfying the following: |xy| ≤ m, |y| ≥ 1, and xy^iz ∈ L for every i ≥ 0.

Pumping Lemma In other words, every sufficiently long string in L can be broken down into three parts in such a way that an arbitrary number of repetitions of the middle part yields another string in L. We say that the middle string is “pumped,” hence the term pumping lemma.

Based on the idea of loops Given: M = (Q, Σ, δ, q0, A), where |Q| = n, and any string x where |x| ≥ n, then x must pass through a sequence of at least n + 1 states. Suppose x = a1a2a3…an y. Then the sequence of n + 1 states q0 = δ*(q0, λ), q1 = δ*(q0, a1), q2 = δ*(q0, a1a2), …, qn = δ*(q0, a1…an) must contain some state at least twice, by the pigeonhole principle.

Example x = a, so |x| = 1. Sequence of states = q0 q1. n = number of different states passed through = 2. [Figure: transition diagram with states q0 and q1 and edges labeled b and a]

Example x = bba, so |x| = 3. Sequence of states = q0 q0 q0 q1. n = 2. Any string where |x| ≥ n must have repeated a state.

Pumping If a state is repeated one or more times, it means that there must be a loop in the transition diagram. If there is a loop, then it can be “pumped” to produce additional strings that belong to the language

Example If ba is in the language, and there are only 2 states in the automaton, then a, bba, bbba, bbbba, etc. are also in the language. [Figure: transition diagram with states q0 and q1 and edges labeled b and a]

Example of a nonregular language L = {0^i 1^i | i ≥ 0} Is this regular? No. Why not? Intuitively: We can’t build a finite automaton to recognize it.

Example of a nonregular language L = {0^i 1^i | i ≥ 0} Because the FA has no memory for past events except its states. Each state can tell you how you got to that state from the immediately previous state (i.e., the last character you processed), but, if there is a loop, it can’t remember the number of characters you processed up to that point.

Limits of an FA Being in state q1 and having just read a 1 doesn’t tell you anything about how many 1’s have already been processed. The FA simply doesn’t have the memory needed to retain this information. [Figure: transition diagram with states q0 and q1 and edges labeled 1 and λ]

Limits of an FA Moreover, if you have a loop like this in an FA, the FA must accept any number of 1’s in the loop. There is no way to specify “exactly as many 1’s as 0’s”: this FA can accept 000111, but must also accept 0111, 00001, etc. [Figure: transition diagram with states q0 and q1 and edges labeled 1 and λ]

Limits of an FA Consequently, we can’t build an FA that can tell whether the number of 0’s that it saw at the beginning of the string exactly matches the number of 1’s at the end of the string. But this is not a formal proof.

Proof idea If a DFA has n states, then any path of length n must visit n + 1 states, and so contains a cycle. (This is an application of the “pigeonhole principle.”) [Figure: path through the DFA with segments labeled x, y, and z; y labels the cycle] This part of the string can be “pumped” to produce other strings in the language.

Proof idea again If an infinite language is regular, it is accepted by a DFA. The DFA has some finite number of states, say, m. Because the language is infinite, some strings must have length > m. For a string of length > m accepted by the DFA, a “walk” through the DFA must contain a cycle. Repeating the cycle an arbitrary number of times must yield another string accepted by the DFA.

Proof Suppose that q_i = q_{i+p}, where 0 ≤ i < i + p ≤ n. Write x = uvw, where u = a_1a_2…a_i, v = a_{i+1}a_{i+2}…a_{i+p}, and w = a_{i+p+1}a_{i+p+2}…a_n y. Here y is the part of the string beyond its first n symbols. Remember that q_i = q_{i+p}.

Proof Assume a DFA with states labeled q0, q1, …, qn. Now take a string w in L with |w| ≥ m = n + 1. To process w, the machine goes through a sequence of states, say q0, qi, qj, …, qf. Since this sequence has exactly |w| + 1 entries, at least one state must be repeated, and this repetition starts no later than the nth move. So the sequence of states must look like q0, qi, qj, …, qr, …, qr, …, qf, indicating that there must be substrings x, y, z of w such that δ*(q0, x) = qr, δ*(qr, y) = qr, and δ*(qr, z) = qf, with |xy| ≤ n + 1 = m and |y| ≥ 1.

Proof (cont.) From this it immediately follows that δ*(q0, xz) = qf, as well as δ*(q0, xy^2z) = qf, δ*(q0, xy^3z) = qf, and so on, completing the proof of the theorem.
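The proof is constructive, and the split w = xyz can be read off a DFA's run by stopping at the first repeated state. The Python sketch below illustrates this on the two-state machine from the earlier examples (a b-loop on q0 and an a-edge from q0 to the final state q1). The function names pump_split and accepts and the dictionary encoding are assumptions; w is assumed to be accepted and at least as long as the number of states.

```python
def pump_split(delta, start, w):
    """Split w = xyz so that y labels the first cycle on the DFA's run."""
    first_seen = {start: 0}            # state -> number of symbols read so far
    state = start
    for pos, symbol in enumerate(w, start=1):
        state = delta[(state, symbol)]
        if state in first_seen:        # pigeonhole: this state was visited before
            i = first_seen[state]
            return w[:i], w[i:pos], w[pos:]
        first_seen[state] = pos
    raise ValueError("no state repeats; w is shorter than the number of states")

def accepts(delta, start, finals, w):
    """Simulate the DFA; a missing transition means rejection."""
    state = start
    for symbol in w:
        if (state, symbol) not in delta:
            return False
        state = delta[(state, symbol)]
    return state in finals

DELTA = {("q0", "b"): "q0", ("q0", "a"): "q1"}
x, y, z = pump_split(DELTA, "q0", "ba")
pumped = [x + y * i + z for i in range(4)]
```

For w = ba the cycle is the b-loop on q0, so x is empty, y = b, and z = a, and the pumped strings are a, ba, bba, bbba.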

How to use the pumping lemma The Pumping Lemma describes a property that is possessed by every regular language. If we show that a language does not possess this property, we know that it is not regular. The strategy is proof by contradiction. We assume the language is regular, so that it must have the property described by the pumping lemma, and then we show that this leads to a contradiction. It follows that the language is not regular.

Example Example 4.7: The language L = {a^n b^n | n ≥ 0} is not regular. The proof is by contradiction: If L is regular, it must be accepted by some DFA. Let m be the number of states of the DFA and consider some w ∈ L such that |w| ≥ m. By the pumping lemma, we can split w into three pieces, w = xyz, with |xy| ≤ m and |y| ≥ 1, such that for any i ≥ 0, the string xy^iz is in L. So let w = a^m b^m. Because |xy| ≤ m, y must consist entirely of a’s. But then xy^2z contains more a’s than b’s, so it is not in L. This is a contradiction.
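The key step of this argument can be checked by brute force for small m. The Python sketch below enumerates every split of w = a^m b^m allowed by the lemma and confirms that none of them survives pumping. It is a finite sanity check under assumed names (in_L, splits, some_split_pumps), not a substitute for the proof, which covers every m.

```python
def in_L(s):
    """Membership in L = {a^n b^n | n >= 0}."""
    n = len(s) // 2
    return s == "a" * n + "b" * n

def splits(w, m):
    """All (x, y, z) with w = xyz, |xy| <= m, and |y| >= 1."""
    for end in range(1, min(m, len(w)) + 1):
        for begin in range(end):
            yield w[:begin], w[begin:end], w[end:]

def some_split_pumps(w, m, max_i=3):
    """True if some allowed split keeps x y^i z in L for i = 0..max_i."""
    return any(
        all(in_L(x + y * i + z) for i in range(max_i + 1))
        for x, y, z in splits(w, m)
    )

# For each small m, no split of a^m b^m can be pumped.
results = [some_split_pumps("a" * m + "b" * m, m) for m in range(1, 7)]
```

Every entry of results is False, matching the argument that y consists only of a’s and so pumping changes the number of a’s without changing the number of b’s.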

Homework Use the pumping lemma to show that the language of palindromes L = {w | w = w^R, w ∈ {a,b}*} is not regular.

Homework Use the pumping lemma (plus some closure properties of regular languages) to show that the language L = {w ∈ {a,b}* | w contains an equal number of a’s and b’s} is not regular.

Homework Use the pumping lemma to show that the language L = {ww | w ∈ {a,b}*} is not regular.