Chapter 2 Regular Languages & Finite Automata


1 Chapter 2 Regular Languages & Finite Automata

2 Regular Expressions
A finitary denotation of a regular language over Σ.
Ø: L(Ø) = Ø
a: L(a) = {a}, where a ∈ Σ
r + s: L(r + s) = L(r) ∪ L(s), where r and s are regular
rs: L(rs) = L(r) · L(s), where r and s are regular
r*: L(r*) = L(r)*, where r is regular
Examples:
L((a + b)*a) = {xa : x ∈ {a, b}*} = {w : w ends in a}
L(0* + 0*(1 + 11)[00*(1 + 11)]*0*) = {w : 111 ⊈ w}, i.e. w does not contain 111 as a substring
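The last example can be sanity-checked by brute force. This is only a sketch: it assumes Python's `re` syntax as a stand-in for the slide notation (`|` for +, `(?:…)` for the brackets), and the names `NO_111`, `no_111` and `all_strings` are illustrative, not from the slides.

```python
import re
from itertools import product

# 0* + 0*(1 + 11)[00*(1 + 11)]*0* transcribed into Python regex syntax.
NO_111 = re.compile(r"0*|0*(?:1|11)(?:00*(?:1|11))*0*")

def no_111(w: str) -> bool:
    """True when the whole string w is denoted by the expression."""
    return NO_111.fullmatch(w) is not None

def all_strings(alphabet, max_len):
    """Every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for t in product(alphabet, repeat=n):
            yield "".join(t)

# Strings on which the expression and the substring test disagree.
mismatches = [w for w in all_strings("01", 10)
              if no_111(w) != ("111" not in w)]
print(mismatches)
```

An empty list of mismatches supports the claim for all strings up to the chosen length; it is a check, not a proof.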

3 Example: r = (0 + 10)*(1 + 10)*
Claim: L(r) = {w : every pair of adjacent zeros appears before any pair of adjacent ones}
Justification: w ∈ L(r) implies w = w₁w₂ with w₁ ∈ L((0 + 10)*) and w₂ ∈ L((1 + 10)*). Since (0 + 10)* cannot have double ones and (1 + 10)* cannot have double zeros, and no 00 or 11 can straddle the boundary (w₁ is ε or ends in 0, w₂ is ε or begins with 1), we can't have a 11 before a 00. So every double zero appears before any double one.

4 Continued: r = (0 + 10)*(1 + 10)*
Conversely, any string w with the property (w ≠ …11…00…, i.e. w = …00…00…11…11…) can be written w = xyz, where x is the shortest prefix containing all the double zeros (i.e. ε or ending in 00) and z is the shortest suffix containing all the double ones (i.e. ε or beginning with 11). This is possible precisely because w satisfies the requirement of all double zeros before any double ones.
Now see that y contains neither 00 nor 11, so when x and z are both nonempty y must be of the form (10)*. Since x ∈ L((0 + 10)*) and z ∈ L((1 + 10)*), w = xyz ∈ L((0 + 10)*(10)*(1 + 10)*) = L(r), since (10)* is subsumed in its neighbors.
(If x or z is ε, y is only guaranteed to alternate; splitting y = y₁y₂ after its last 0 still gives xy₁ ∈ L((0 + 10)*) and y₂z ∈ L((1 + 10)*).)
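The claim from the two slides above can also be checked exhaustively on short strings. A minimal sketch, assuming Python's `re` syntax for the expression; the helper `zeros_before_ones` is a hypothetical name for the property in the claim.

```python
import re
from itertools import product

# r = (0 + 10)*(1 + 10)* in Python regex syntax.
R = re.compile(r"(?:0|10)*(?:1|10)*")

def zeros_before_ones(w: str) -> bool:
    """Every occurrence of 00 starts before every occurrence of 11."""
    last_00 = w.rfind("00")
    first_11 = w.find("11")
    return last_00 == -1 or first_11 == -1 or last_00 < first_11

# Strings of length <= 10 on which L(r) and the property disagree.
bad = []
for n in range(11):
    for t in product("01", repeat=n):
        w = "".join(t)
        if (R.fullmatch(w) is not None) != zeros_before_ones(w):
            bad.append(w)
print(bad)
```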

5 Regular Expression Identities
Facts:
Ø is the additive identity: Ø + r = r = r + Ø
Ø is the multiplicative zero: Ø·r = Ø = r·Ø
Ø* is the multiplicative identity: Ø*·r = r·Ø* = r
r + r = r
r*r* = r*
(r*)* = r*
r(st) = (rs)t
r + s = s + r
(r + s) + t = r + (s + t)
r(s + t) = rs + rt
(r + s)t = rt + st
(r*s*)* = (r + s)* [HW problem]
Regular operations are monotone (e.g. ‘*’): r ⊆ s ⇒ f(r) ⊆ f(s)

6 Disjunctive Normal Form (DNF)
Theorem: Every regular expression can be written as r₁ + … + rₙ where each rᵢ does not contain the ‘+’ symbol.
Proof: By structural induction. The bases are trivial, as is r + s.
r · s = (r₁ + … + rₙ)·(s₁ + … + sₘ) by the induction hypothesis. Now use the distributive law to finish.
r* = (r₁ + … + rₙ)* by IH = (r₁*·…·rₙ*)*, which follows from the identity (a + b)* = (a*b*)* by induction.

7 Deterministic Finite Automata: DFA
Formal Syntax: M = ⟨Q, Σ, δ, s, F⟩
Q: finite set of states
Σ: alphabet
δ : Q × Σ → Q: transition function (current state and input symbol to next state)
s ∈ Q: start state
F ⊆ Q: final states
[Figure: input tape holding a string over Σ, read head, finite control]

8 DFA Example: parity
Q = {q₀, q₁}, Σ = {0, 1}, s = q₀, F = {q₀}
L(M) = {w ∈ {0, 1}* : w has an even number of ones}
q     σ    δ(q, σ)
q₀    0    q₀
q₀    1    q₁
q₁    0    q₁
q₁    1    q₀
E.g. let w = 0110 ∈ L(M): (q₀, 0110) ⊢_M (q₀, 110) ⊢_M (q₁, 10) ⊢_M (q₀, 0) ⊢_M (q₀, ε)
[State diagram: q₀ and q₁, each with a loop on 0, and edges on 1 in both directions]

9 Algorithm for DFA
q := s {M begins in state s}
h := 1 {with head leftmost}
while σ(h) <> blank {as long as head is reading a symbol}
    q := δ(q, σ(h)) {change state}
    h := h + 1 {move head right one symbol}
accept := q in F {accept if we end in a final state}
Formal definitions for semantics:
A configuration of M is an element (q, w) of Q × Σ*, where w hasn’t been read yet.
(q, σw) ⊢_M (δ(q, σ), w) is the yields function ⊢_M : Q × Σ⁺ → Q × Σ*.
M accepts w ⇔ (s, w) ⊢_M* (f, ε) for some f ∈ F.
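The loop above translates almost line for line into Python. The sketch below is one possible rendering, not the slides' own code: it assumes δ is stored as a dict keyed by (state, symbol), and `run_dfa` and `PARITY` are illustrative names. It also records the configurations (q, unread input) so the yields relation can be seen on the parity example.

```python
def run_dfa(delta, s, F, w):
    """Run a DFA on w; return (accepted, list of configurations)."""
    q = s                              # M begins in state s
    configs = [(q, w)]
    for h, sigma in enumerate(w):      # head moves left to right
        q = delta[q, sigma]            # change state
        configs.append((q, w[h + 1:])) # remaining unread input
    return q in F, configs             # accept if we end in a final state

# Transition table of the parity DFA from the earlier slide.
PARITY = {("q0", "0"): "q0", ("q0", "1"): "q1",
          ("q1", "0"): "q1", ("q1", "1"): "q0"}

accepted, configs = run_dfa(PARITY, "q0", {"q0"}, "0110")
print(accepted, configs)
```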

10 DFA Example #2
L(M) = {w : w is a sequence of pairs ab or ba}
[State diagram: states q₀, q₁, q₂, q₃ with transitions labeled a and b]

11 More DFA Examples
Example of using minus: L(M) = 1*0(0 + 1)* = Σ* − 1*
[State diagram for M with edge labels 1, 0 and 0, 1]
A simplified finite automaton recognizing (0 + 1)*10:
State    Input 0    Input 1
A        A          B
B        C          B
C        A          B
[State diagram on A, B, C]

12 DFA closure properties
Let M₁ = (Q₁, Σ, s₁, A₁, δ₁) and M₂ = (Q₂, Σ, s₂, A₂, δ₂) accept the languages L₁ and L₂ respectively.
Let M = (Q₁ × Q₂, Σ, (s₁, s₂), A, δ) with δ((p₁, p₂), a) = (δ₁(p₁, a), δ₂(p₂, a)) for any p₁ ∈ Q₁, p₂ ∈ Q₂, and a ∈ Σ.
Claim:
1. If A = {(q₁, q₂) : q₁ ∈ A₁ or q₂ ∈ A₂}, M accepts L₁ ∪ L₂.
2. If A = {(q₁, q₂) : q₁ ∈ A₁ & q₂ ∈ A₂}, M accepts L₁ ∩ L₂.
3. If A = {(q₁, q₂) : q₁ ∈ A₁ & q₂ ∉ A₂}, M accepts L₁ − L₂.
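The product construction can be sketched directly from the tuple above. In this sketch a DFA is a Python tuple (Q, Σ, s, A, δ) with δ a dict keyed by (state, symbol); the two example machines (`EVEN_ONES`, `ENDS_IN_0`) are made up for illustration and do not come from the slides.

```python
from itertools import product

def product_dfa(M1, M2, accept):
    """Pair up two DFAs; accept(in_A1, in_A2) picks the accepting pairs."""
    Q1, sigma, s1, A1, d1 = M1
    Q2, _, s2, A2, d2 = M2
    Q = set(product(Q1, Q2))
    delta = {((p1, p2), a): (d1[p1, a], d2[p2, a])
             for (p1, p2) in Q for a in sigma}
    A = {(q1, q2) for (q1, q2) in Q if accept(q1 in A1, q2 in A2)}
    return Q, sigma, (s1, s2), A, delta

def accepts(M, w):
    """Run DFA M on w."""
    _, _, q, A, delta = M
    for a in w:
        q = delta[q, a]
    return q in A

# Hypothetical example machines over {0, 1}.
EVEN_ONES = ({"e", "o"}, "01", "e", {"e"},
             {("e", "0"): "e", ("e", "1"): "o", ("o", "0"): "o", ("o", "1"): "e"})
ENDS_IN_0 = ({"n", "y"}, "01", "n", {"y"},
             {("n", "0"): "y", ("n", "1"): "n", ("y", "0"): "y", ("y", "1"): "n"})

UNION = product_dfa(EVEN_ONES, ENDS_IN_0, lambda x, y: x or y)
INTER = product_dfa(EVEN_ONES, ENDS_IN_0, lambda x, y: x and y)
DIFF = product_dfa(EVEN_ONES, ENDS_IN_0, lambda x, y: x and not y)
```

Passing the acceptance rule as a function keeps the three claims as one construction that differs only in the choice of A.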

13 Nondeterministic Finite Automata: NFA
Same as DFA except: Δ : Q × Σ → 2^Q, i.e. there is an edge p --σ--> q iff q ∈ Δ(p, σ).
Yields relation is no longer a function: (q, σx) ⊢_M (q′, x) if q′ ∈ Δ(q, σ).
⊢_M* is the same as before (meaning: all valid paths).
Acceptance: M accepts w ⇔ ∃ f ∈ F such that (s, w) ⊢_M* (f, ε).

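14 NFA Example
Search for x = σ₁…σₙ.
[State diagram: start state s = q₀ with a loop on Σ, a chain q₀ --σ₁--> q₁ --…--> qₙ ending with the edge labeled σₙ, and final state f = qₙ with a loop on Σ]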

15 NFA to DFA Theorem
Definitions:
δ(p, ε) ≡ p
δ(P, σ) ≡ {δ(p, σ) : p ∈ P}
δ(P, wσ) = δ(δ(P, w), σ)
Δ(p, ε) ≡ {p}
Δ(P, σ) ≡ ⋃ {Δ(p, σ) : p ∈ P}
Δ(P, wσ) = Δ(Δ(P, w), σ)
Theorem: For every NFA M = ⟨Q, Σ, s, F, Δ⟩, there is an equivalent DFA.
Proof: Define the DFA M′ = ⟨2^Q, Σ, {s}, {P ⊆ Q : P ∩ F ≠ Ø}, δ⟩ with δ(P, σ) = Δ(P, σ) for P ⊆ Q.
Idea: a single state in M′ is a set of states in M. Show M can reach f ∈ F iff M′ reaches a state containing f.

16 NFA to DFA Example
M = ⟨{q₀, q₁}, {0, 1}, Δ, q₀, {q₁}⟩
L(M) = {w ∈ {0, 1}* : w doesn’t begin with 10}
Δ      0           1
q₀     {q₀, q₁}    {q₁}
q₁     Ø           …
[State diagram of M on q₀ and q₁; try 110]
M′ = ⟨{Ø, {q₀}, {q₁}, {q₀, q₁}}, {0, 1}, δ, {q₀}, {{q₀}, {q₁}, {q₀, q₁}}⟩
δ           0           1
Ø           Ø           Ø
{q₀}        {q₀, q₁}    {q₁}
{q₁}        Ø           …
{q₀, q₁}    …           …
[State diagram of M′ on Ø, {q₀}, {q₁}, {q₀, q₁}]

17 NFA to DFA Proof
Do by showing that Δ(s, w) = δ({s}, w) by induction on |w|.
Basis: |w| = 0 ⇒ w = ε ⇒ Δ(s, ε) = {s} = δ({s}, ε) by definition.
Induction: Take wσ. δ({s}, wσ) ≡ δ(δ({s}, w), σ), and Δ(s, wσ) ≡ Δ(Δ(s, w), σ).
By IH, δ({s}, w) = Δ(s, w); call this P.
But δ(P, σ) = Δ(P, σ) by definition!
[Diagram: the NFA path s --w--> p ∈ P --σ--> r ∈ R matches the DFA path {s} --w--> P --σ--> R; the first step holds by IH, the second by construction]

18 Another NFA to DFA Example
L = {w ∈ {a, b}* : bb ⊆ w}
NFA: [state diagram: s loops on a, b; s --b--> p --b--> q; q loops on a, b]
Δ    a      b
s    {s}    {s, p}
p    Ø      {q}
q    {q}    {q}
DFA: [state diagram on {s}, {s, p}, {s, p, q}, {s, q}]
δ            a         b
{s}          {s}       {s, p}
{s, p}       {s}       {s, p, q}
{s, p, q}    {s, q}    {s, p, q}
{s, q}       {s, q}    {s, p, q}
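The subset construction from the theorem can be run on this NFA. The sketch below differs from the proof in one respect, stated plainly: it builds only the subsets reachable from {s} rather than all of 2^Q. It assumes Δ is a dict from (state, symbol) to a set of states, and takes q as the only final state, which is read off the language rather than stated on the slide.

```python
from itertools import product

def subset_construction(alphabet, Delta, s, F):
    """Return (states, start, finals, delta) of the reachable part of M'."""
    start = frozenset({s})
    delta, seen, todo = {}, {start}, [start]
    while todo:
        P = todo.pop()
        for a in alphabet:
            # delta(P, a) = union of Delta(p, a) over p in P
            R = frozenset(q for p in P for q in Delta.get((p, a), ()))
            delta[P, a] = R
            if R not in seen:
                seen.add(R)
                todo.append(R)
    finals = {P for P in seen if P & F}
    return seen, start, finals, delta

# NFA for "contains bb", transcribed from the Delta table above.
DELTA = {("s", "a"): {"s"}, ("s", "b"): {"s", "p"},
         ("p", "b"): {"q"},
         ("q", "a"): {"q"}, ("q", "b"): {"q"}}
states, start, finals, delta = subset_construction("ab", DELTA, "s", {"q"})

def dfa_accepts(w):
    P = start
    for a in w:
        P = delta[P, a]
    return P in finals
```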

19 NFA with ε-moves
Extend Δ to a relation Δ ⊆ Q × (Σ ∪ {ε}) × Q so that the machine can change state without consuming any input: p --ε--> q.
Example: [state diagram: q₀ --ε--> q₁ --ε--> q₂, with loops on 0 at q₀, on 1 at q₁ and on 2 at q₂]
Δ     0       1       2       ε
q₀    {q₀}    Ø       Ø       {q₁}
q₁    Ø       {q₁}    Ø       {q₂}
q₂    Ø       Ø       {q₂}    Ø

20 Removing ε-moves
Let Δε* = Δ* ∩ [Q × {ε} × Q], the transitive reflexive closure of the ε-edges. So Δε*(p) = {q : (p, q) ∈ Δε*}.
Extend Δ to Δ′(p, σ) = Δ(Δε*(p), σ).
Extend F to F′ = {p : Δε*(p) ∩ F ≠ Ø}.
Remove all ε-edges and claim that the new machine is the same as the old.
Idea: break paths in the old machine into runs of ε-edges, each followed by one symbol edge σ₁, …, σᵢ, …, σₙ, with a final run of ε-edges into f.
[Diagrams: a path p --ε--> · --σ--> q in the old machine becomes the single edge p --σ--> q in the new machine]

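21 NFA with ε-moves, example continued
Δ′    0       1       2
q₀    {q₀}    {q₁}    {q₂}
q₁    Ø       {q₁}    {q₂}
q₂    Ø       Ø       {q₂}
[State diagrams: the original machine with ε-edges q₀ --ε--> q₁ --ε--> q₂ and loops 0, 1, 2, and the new machine with the added edges q₀ --1--> q₁, q₀ --2--> q₂, q₁ --2--> q₂]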
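The two definitions Δ′ and F′ can be computed mechanically. A minimal sketch on the running example, under two assumptions: ε-edges are kept in a separate dict `EPS` (an encoding choice made here), and the final state is taken to be q₂, which the transcript does not state explicitly.

```python
def eps_closure(p, eps):
    """All states reachable from p by zero or more ε-edges."""
    seen, todo = {p}, [p]
    while todo:
        q = todo.pop()
        for r in eps.get(q, ()):
            if r not in seen:
                seen.add(r)
                todo.append(r)
    return seen

def remove_eps(Q, alphabet, Delta, eps, F):
    """Delta'(p, a) = Delta(closure(p), a); F' = {p : closure(p) meets F}."""
    new_Delta = {(p, a): {r for q in eps_closure(p, eps)
                          for r in Delta.get((q, a), ())}
                 for p in Q for a in alphabet}
    new_F = {p for p in Q if eps_closure(p, eps) & F}
    return new_Delta, new_F

Q = ["q0", "q1", "q2"]
DELTA = {("q0", "0"): {"q0"}, ("q1", "1"): {"q1"}, ("q2", "2"): {"q2"}}
EPS = {"q0": {"q1"}, "q1": {"q2"}}
new_Delta, new_F = remove_eps(Q, "012", DELTA, EPS, {"q2"})  # F = {q2} is assumed
```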

22 Regular Language to NFA
Theorem: Let r be a regular expression. Then L(r) = L(M) for some NFA M.
Proof:
Basis: r = Ø: L(M) = Ø. r = a: L(M) = {a}. [diagrams]
Induction: L(r) = L(M_r) and L(s) = L(M_s) by IH.
r + s: [diagram: ε-edges into M_r and M_s] L(M) = {ε}L(M_r) ∪ {ε}L(M_s) = L(r) ∪ L(s) = L(r + s)
r ∙ s: [diagram: M_r --ε--> M_s] L(M) = L(M_r) ∙ {ε} ∙ L(M_s) = L(r) ∙ L(s) = L(r ∙ s)
r*: [diagram: M_r with ε-edges] L(M) = {ε} ∪ L(M_r)⁺ = L(r)* = L(r*)

23 Regular Expression for Parity: (10*1 + 0)*
[Diagrams: the NFAs M_0, M_0*, M_1 and M_10*1 built by the construction. M_0* is simplified by eliminating ε-transitions and identifying equivalent states; M_10*1 is shown before and after simplification.]

24 Example continued
[Diagram: M_(10*1+0). Use ε-closure, eliminate unreachable states, and combine final states together.]
[Diagram: M_(10*1+0)*. Simplify using ε-closure to a diagram with edge labels 0, 0, 1, 1, since no transitions enter the start state, and since the start state and final state are equivalent.]

25 DFA → Regular Language (classical method)
Theorem: Let M be a DFA. Then L(M) = L(r) for some regular expression r.
Proof: Number the states Q = {s = q₁, …, qₙ} (no q₀). Let R_ij^k be the set of strings from Σ* which take M from state q_i to state q_j without passing through any state numbered higher than k.
R_ij^0 = {a : δ(q_i, a) = q_j} if i ≠ j
R_ij^0 = {a : δ(q_i, a) = q_j} ∪ {ε} if i = j
R_ij^k = R_ij^(k−1) + R_ik^(k−1) ∙ (R_kk^(k−1))* ∙ R_kj^(k−1); each R^(k−1) is regular by IH
L(M) = ∪ {R_1j^n : q_j ∈ F} is a finite union of regular sets.
[Diagrams: q_i --a--> q_j; a loop q_i --a--> q_i; a path from q_i to q_j through q_k whose intermediate states are all numbered ≤ k − 1]

26 DFA → Regular Language Proof
Claim: Each R_ij^k is regular. Proof by induction on k.
Basis: R_ij^0 = L(r_ij^0) where r_ij^0 = a₁ + … + a_m + Ø*, with each a_i ∈ R_ij^0 and the term Ø* present if ε ∈ R_ij^0.
Induction Step: R_ij^k = L(r_ij^k) where r_ij^k = r_ij^(k−1) + r_ik^(k−1) (r_kk^(k−1))* r_kj^(k−1).
L(M) is a finite union of regular sets, hence regular □

27 DFA for Parity Using r_ij^k Method
[State diagram: q₁ and q₂, each with a loop on 0, and edges on 1 in both directions]
L(M) = r_11^2 = r_12^1 (r_22^1)* r_21^1 + r_11^1 = 0*1(10*1 + 0)*10* + 0*
          k = 0    k = 1
r_11^k    0 + ε    (0 + ε)(0 + ε)*(0 + ε) + (0 + ε) = 0*
r_12^k    1        (0 + ε)0*1 + 1 = 0*1
r_21^k    1        10*(0 + ε) + 1 = 10*
r_22^k    0 + ε    10*1 + 0 + ε
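The r_ij^k recurrence is easy to mechanize. The sketch below is one possible rendering under stated assumptions: states are numbered 0..n−1 with start state 0 (the slides use 1..n), output is in Python `re` syntax, `None` stands for Ø and the empty string for ε. It performs no simplification, so the result is longer than the hand-derived expression.

```python
import re
from itertools import product

def dfa_to_regex(n, delta, finals):
    """Build a regex for L(M) with the r_ij^k recurrence (start state 0)."""
    def union(a, b):
        if a is None:
            return b
        if b is None:
            return a
        if a == "" and b == "":
            return ""
        if a == "":
            return f"(?:{b})?"
        if b == "":
            return f"(?:{a})?"
        return f"(?:{a}|{b})"

    def concat(*parts):
        return None if None in parts else "".join(parts)

    def star(a):
        return "" if not a else f"(?:{a})*"   # Ø* = ε* = ε

    # Basis: symbols taking q_i to q_j, plus ε when i = j.
    R = [["" if i == j else None for j in range(n)] for i in range(n)]
    for (i, a), j in delta.items():
        R[i][j] = union(R[i][j], re.escape(a))
    # Induction: additionally allow state k as an intermediate state.
    for k in range(n):
        R = [[union(R[i][j], concat(R[i][k], star(R[k][k]), R[k][j]))
              for j in range(n)] for i in range(n)]
    result = None
    for f in finals:
        result = union(result, R[0][f])
    return result

# Parity DFA with q1, q2 of the slide renumbered to 0, 1.
PARITY = {(0, "0"): 0, (0, "1"): 1, (1, "0"): 1, (1, "1"): 0}
parity_regex = dfa_to_regex(2, PARITY, {0})
```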

28 FA → Regular Language
Start: Number the states s = q₀, …, qₙ.
Idea: find a solution to the problem Aᵢ = {w ∈ Σ* : Δ(qᵢ, w) ∩ F ≠ Ø} when i = 0.
Solve: the mutually recursive equations Aᵢ = ∑ {σAⱼ : qⱼ ∈ Δ(qᵢ, σ), σ ∈ Σ} + {ε : if qᵢ ∈ F}.
Show: they can be solved by a regular expression.

29 Arden’s Lemma
Lemma: The recursive equation X = AX + B, where A and B are languages, ε ∉ A, has a unique solution X = A*B.
Proof: Obviously A(A*B) + B = (A⁺ + ε)B = A*B is a solution.
Clearly B ⊆ X ⇒ AB ⊆ X ⇒ … ⇒ A*B ⊆ X, which means it is minimal.
If a larger solution L existed, then C = L \ A*B ≠ Ø. Then A*B + C = A(A*B + C) + B = A⁺B + AC + B = A*B + AC.
Now, C is disjoint from A*B, so (A*B + C) ∩ C = (A*B + AC) ∩ C ⇒ C = AC ∩ C ⇒ C ⊆ AC.
Let x ∈ AC be of minimal length. Then x = yz, y ∈ A, z ∈ C. But ε ∉ A by hypothesis ⇒ z ∈ AC with |z| < |x|, contradiction.
Note: ε ∉ A is not a restriction because in any FA, an epsilon loop from any state to itself can be removed.

30 Arden’s Lemma Example
[State diagram: q₀ and q₁, each with a loop on 0, and edges on 1 in both directions]
A₁ = 1A₀ + 0A₁ = 0*1A₀
A₀ = ε + 0A₀ + 1A₁
A₀ = ε + 0A₀ + 10*1A₀ = ε + (0 + 10*1)A₀ = (0 + 10*1)*ε = (0 + 10*1)*
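The closed form can be compared with the equations themselves. The sketch below iterates the two equations for A₀ and A₁ as a least fixed point on languages truncated to length N, then compares A₀ with the expression (0 + 10*1)* written in Python `re` syntax. The truncation bound `N` and the helper `step` are choices made here for illustration.

```python
import re
from itertools import product

N = 6  # only strings of length <= N are tracked

def step(A0, A1):
    """Apply A0 = ε + 0A0 + 1A1 and A1 = 1A0 + 0A1 once, truncated to N."""
    new0 = {""} | {"0" + w for w in A0} | {"1" + w for w in A1}
    new1 = {"1" + w for w in A0} | {"0" + w for w in A1}
    return ({w for w in new0 if len(w) <= N},
            {w for w in new1 if len(w) <= N})

# Iterate from the empty languages until nothing changes.
A0, A1 = set(), set()
while True:
    B0, B1 = step(A0, A1)
    if (B0, B1) == (A0, A1):
        break
    A0, A1 = B0, B1

closed_form = re.compile(r"(?:0|10*1)*")
expected = {"".join(t) for n in range(N + 1)
            for t in product("01", repeat=n)
            if closed_form.fullmatch("".join(t))}
print(A0 == expected)
```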

31 Picture for Arden’s Lemma
[Figure: the solution A*B drawn as the union of B, AB, A²B, … generated by the map x ↦ Ax + B, shown with and without an extra set C and its image AC]
Solving Recursive Equations
Use Arden’s lemma to eliminate Aₙ: the equation for Aₙ in terms of A₀, …, Aₙ becomes one in terms of A₀, …, Aₙ₋₁.
Substitute for Aₙ: the equation for Aₙ₋₁ in terms of A₀, …, Aₙ becomes one in terms of A₀, …, Aₙ₋₁.
In general, repeated substitution for Aᵢ₊₁, …, Aₙ turns the equation for Aᵢ in terms of A₀, …, Aₙ into one in terms of A₀, …, Aᵢ, and Arden’s lemma applies again.
Note that after each phase Aᵢ = … A_(j<i) …. In particular, A₀ is solved.

32 Pumping Lemma
Theorem: Let L be an infinite regular language. Then there is an n such that for all w ∈ L with |w| ≥ n, w can be written as w = uvx with |v| ≥ 1 and |uv| ≤ n such that for all i ≥ 0, uvⁱx ∈ L.
Proof: Let L = L(M) for some DFA M with n states. Running M on w ∈ L with |w| ≥ n means it visits ≥ n + 1 states, so some state appears twice (PHP), in which case uvⁱx ∈ L(M) for all i ≥ 0.
Temporally: a state appears twice on the path from start to final.
Spatially: we must pass through a loop on the diagram.

33 Irregularity Method
Take an infinite language L, assume it is regular toward a contradiction via the pumping lemma. I.e. show: ∀ n, ∃ w ∈ L, |w| ≥ n, ∀ uvx = w, |uv| ≤ n, v ≠ ε, ∃ i ≥ 0: uvⁱx ∉ L.
Example: Suppose L = {aᵐbᵐ : m ≥ 0} is regular. Given any n, take w = aⁿbⁿ. Since w = uvx with |uv| ≤ n and |v| ≥ 1, v ∈ a⁺. Choose i = 0, to get uv⁰x = a^(n−|v|)bⁿ ∉ L. Contradiction.
Example: Suppose L = {0^(i²) : i ≥ 1} is regular. Take w = 0^(n²) = uvx, with 1 ≤ |v| ≤ n. So uv²x = 0^(n²+|v|), but n² < n² + |v| < (n + 1)² = n² + 2n + 1. So uv²x ∉ L. Contradiction to PL.
Using closure properties (intersection) to show irregularity:
Example: For L = {wwᴿ : w ∈ {a, b}*}, let L′ = L ∩ a*bba* = {aⁿb²aⁿ : n ≥ 0}, which is easy to show (by PL) irregular.
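The quantifier pattern of the method can be explored by brute force for a fixed n and w. In the sketch below, `survives_pumping` (a hypothetical helper) tries every split w = uvx with |uv| ≤ n and |v| ≥ 1 and reports whether some split keeps uvⁱx in L for all i up to a bound. A result of False is a genuine refutation for that n and w; True is only evidence, since just finitely many i are tested.

```python
def survives_pumping(w, n, in_L, max_i=4):
    """True if some split w = uvx (|uv| <= n, |v| >= 1) stays in L
    for every pumping exponent i in 0..max_i."""
    for end in range(1, min(n, len(w)) + 1):    # end = |uv|
        for start in range(end):                # start = |u|
            u, v, x = w[:start], w[start:end], w[end:]
            if all(in_L(u + v * i + x) for i in range(max_i + 1)):
                return True
    return False

def anbn(w):
    """Membership in {a^m b^m : m >= 0}."""
    m = len(w) // 2
    return w == "a" * m + "b" * m

def ends_in_a(w):
    """Membership in the regular language {w : w ends in a}."""
    return w.endswith("a")

# For each n, the witness w = a^n b^n has no surviving split.
refuted = all(not survives_pumping("a" * n + "b" * n, n, anbn)
              for n in range(1, 8))
print(refuted, survives_pumping("ba", 2, ends_in_a))
```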

34 Pumping Lemma (explanation)
Idea: If L is infinite and regular, it must satisfy:
∃ n, ∀ w ∈ L with |w| ≥ n, ∃ uvx = w with |v| ≥ 1 and |uv| ≤ n, ∀ i ≥ 0: uvⁱx ∈ L
i.e. if L is infinite and doesn’t satisfy the property, then it can’t be regular:
∀ n, ∃ w ∈ L with |w| ≥ n, ∀ uvx = w with |v| ≥ 1 and |uv| ≤ n, ∃ i ≥ 0: uvⁱx ∉ L

35 Decision Algorithms for Regular Sets
Suppose L is given by a FA M (no ε transitions) with start state s. Let → be the directed graph of M (ignore transition labels). Then
L(M) ≠ Ø iff s →* f for some final state f.
|L(M)| = ∞ iff s →* q →⁺ q →* f for some state q and final state f.
Equivalence: L₁ = L₂ iff (L₁ ⊆ L₂ and L₂ ⊆ L₁) ⇔ (L₁ ∪ L₂) ∖ (L₁ ∩ L₂) = Ø.
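Both graph conditions reduce to reachability. A minimal sketch, assuming the unlabeled graph of M is given as a dict `succ` from each state to its set of successors; the two example graphs are made up for illustration.

```python
def reachable(succ, start):
    """All states q with start ->* q."""
    seen, todo = {start}, [start]
    while todo:
        q = todo.pop()
        for r in succ.get(q, ()):
            if r not in seen:
                seen.add(r)
                todo.append(r)
    return seen

def is_nonempty(succ, s, F):
    """s ->* f for some final state f."""
    return bool(reachable(succ, s) & F)

def is_infinite(succ, s, F):
    """s ->* q ->+ q ->* f for some state q and final state f."""
    for q in reachable(succ, s):
        plus = set()                      # states reachable in >= 1 step
        for t in succ.get(q, ()):
            plus |= reachable(succ, t)
        if q in plus and reachable(succ, q) & F:
            return True
    return False

# Graph of the parity DFA, and a hypothetical graph whose only cycle is
# on a dead state d that cannot reach the final state f.
PARITY_GRAPH = {"q0": {"q0", "q1"}, "q1": {"q0", "q1"}}
DEAD_LOOP = {"s": {"f", "d"}, "d": {"d"}}
```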

36 Closure Properties Example
Using closure properties to prove non-regularity: show {aⁿbaⁿ : n ≥ 1} is not regular.
Define h₁(a) = a, h₁(b) = ba, h₁(c) = a and h₂(a) = 0, h₂(b) = 1, h₂(c) = 1.
h₁⁻¹({aⁿbaⁿ : n ≥ 1}) ⊆ {(a + c)ⁿb(a + c)^(n−1) : n ≥ 1}
so h₁⁻¹({aⁿbaⁿ : n ≥ 1}) ∩ a*bc* = {aⁿbc^(n−1) : n ≥ 1}
and h₂({aⁿbc^(n−1) : n ≥ 1}) = {0ⁿ11^(n−1) : n ≥ 1} = {0ⁿ1ⁿ : n ≥ 1}, which is not regular.

37 Decision Algorithms for Regular Sets
If L is regular: Does L = 0 (Ø)? Does L = 1 (Ø*)? Is L finite? For a regular expression, this can be answered recursively.
Is r = Ø?
Basis: Ø = Ø; a ≠ Ø
Induction: r + s = Ø ⇔ r = Ø and s = Ø; r∙s = Ø ⇔ r = Ø or s = Ø; r* ≠ Ø
Is r = Ø*?
Basis: Ø ≠ Ø*; a ≠ Ø*
Induction: r + s = Ø* ⇔ r or s = Ø* and the other = Ø or Ø*; r∙s = Ø* ⇔ r = Ø* = s; r* = Ø* ⇔ r = Ø* or r = Ø
Is |r| < ∞?
Basis: |Ø| < ∞; |a| < ∞
Induction: |r + s| < ∞ ⇔ |r| and |s| are both < ∞; |r∙s| < ∞ ⇔ r or s = Ø, or |r| and |s| < ∞; |r*| < ∞ ⇔ r = Ø or Ø*
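The three recursive tests can be written over a small syntax tree. In this sketch regular expressions are nested tuples built by the hypothetical constructors `sym`, `alt`, `cat` and `star`, with `EMPTY` for Ø; each function follows the corresponding basis and induction rules above.

```python
EMPTY = ("empty",)                       # Ø

def sym(a): return ("sym", a)            # a single symbol
def alt(r, s): return ("alt", r, s)      # r + s
def cat(r, s): return ("cat", r, s)      # r∙s
def star(r): return ("star", r)          # r*

def is_empty(r):
    """Does r denote Ø?"""
    tag = r[0]
    if tag == "empty":
        return True
    if tag == "sym":
        return False
    if tag == "alt":
        return is_empty(r[1]) and is_empty(r[2])
    if tag == "cat":
        return is_empty(r[1]) or is_empty(r[2])
    return False                         # r* always contains ε

def is_epsilon(r):
    """Does r denote Ø* = {ε}?"""
    tag = r[0]
    if tag in ("empty", "sym"):
        return False
    if tag == "alt":
        a, b = r[1], r[2]
        return ((is_epsilon(a) and (is_epsilon(b) or is_empty(b)))
                or (is_epsilon(b) and is_empty(a)))
    if tag == "cat":
        return is_epsilon(r[1]) and is_epsilon(r[2])
    return is_epsilon(r[1]) or is_empty(r[1])

def is_finite(r):
    """Does r denote a finite language?"""
    tag = r[0]
    if tag in ("empty", "sym"):
        return True
    if tag == "alt":
        return is_finite(r[1]) and is_finite(r[2])
    if tag == "cat":
        return (is_empty(r[1]) or is_empty(r[2])
                or (is_finite(r[1]) and is_finite(r[2])))
    return is_empty(r[1]) or is_epsilon(r[1])
```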

38 Simplifying regular expressions
Fact: If we let Ø* = ε, then every non-empty regular expression can be written without the use of the empty set.
Reason: Ø can be removed bottom-up from every subexpression because it behaves like the additive identity and multiplicative zero. The only exception is Kleene star, Ø*.
Fact: Every regular language without ε can be written without the use of ε.
Reason: Ø* can be removed top-down from every subexpression without ε because it behaves like the multiplicative identity.
Fact: Once these exceptional cases are removed, a regular expression denotes an infinite language iff it contains a Kleene star (*).

39 Decision algorithm for Regular Sets (classical treatment)
Assume all regular languages are represented by a DFA M with n states.
(1) L(M) is nonempty ⇔ ∃ w, |w| < n, w ∈ L(M)
Proof: (1) (⇐) obvious.
(⇒) Let w be a minimal length word accepted by M. If |w| ≥ n, then by the pumping lemma, w = uvx, |v| ≥ 1, and ux ∈ L(M), which contradicts minimality of |w|. Therefore |w| < n.

40 Decision algorithm for Regular Sets (classical treatment), cont.
(2) L(M) is infinite ⇔ ∃ w, n ≤ |w| < 2n, w ∈ L(M)
Proof: (2) (⇐) If ∃ w ∈ L(M), n ≤ |w| < 2n, then by the pumping lemma, w = uvx and uvⁱx ∈ L(M) ∀ i ≥ 0 (|v| ≠ 0), which implies L(M) is infinite.
(⇒) Suppose L(M) is infinite, with w ∈ L(M) of minimal length ≥ n. If |w| ≥ 2n, then by the pumping lemma, w = uvx with 1 ≤ |v| ≤ n. Then ux ∈ L(M) with n ≤ |ux| < |w|, which contradicts the minimality of |w|. □
(3) Equivalence: There is an algorithm to determine if L(M₁) = L(M₂).
Proof: (L(M₁) ∩ L′(M₂)) ∪ (L′(M₁) ∩ L(M₂)) = Ø ⇔ L(M₁) = L(M₂), where L′ denotes the complement. □
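Tests (1) and (2) give a direct, if exponential, decision procedure: enumerate the words in the stated length ranges. The sketch below assumes a DFA with n states given by a transition dict; the machine `ONLY_A`, which accepts just the word a, is a made-up example of a finite language.

```python
from itertools import product

def accepts(delta, s, F, w):
    """Run the DFA on w."""
    q = s
    for a in w:
        q = delta[q, a]
    return q in F

def accepts_some(delta, alphabet, s, F, lo, hi):
    """Is some word w with lo <= |w| < hi accepted?"""
    return any(accepts(delta, s, F, t)
               for length in range(lo, hi)
               for t in product(alphabet, repeat=length))

def is_nonempty(n, delta, alphabet, s, F):
    return accepts_some(delta, alphabet, s, F, 0, n)       # test (1)

def is_infinite(n, delta, alphabet, s, F):
    return accepts_some(delta, alphabet, s, F, n, 2 * n)   # test (2)

# Parity DFA (2 states) and a 3-state DFA accepting only the word "a".
PARITY = {("q0", "0"): "q0", ("q0", "1"): "q1",
          ("q1", "0"): "q1", ("q1", "1"): "q0"}
ONLY_A = {(0, "a"): 1, (1, "a"): 2, (2, "a"): 2}
```

The graph-based tests are far cheaper in practice; this version only mirrors the length bounds used in the proofs.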

