Distinct regular expressions may represent the same language:

a+b and b+a represent the same language {a, b}. Two expressions R and S that represent the same language, L(R) = L(S), are considered equal. For example, a + a* = a* because L(a + a*) = L(a*) = {ε, a, aa, aaa, …}.

Example. Simplify a regular expression:
ε + ab + abab(ab)* = (ab)*, since L = {ε, ab, abab, ababab, …}
aa(b*+a) + a(ab*+aa) = aa(b*+a)
a(a+b)* + aa(a+b)* + aaa(a+b)* = a(a+b)*

To prove the last equality we can show that aa(a+b)* ⊆ a(a+b)(a+b)* ⊆ a(a+b)* and aaa(a+b)* ⊆ a(a+b)*, and then use A ⊆ B ⇒ A ∪ B = B.
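The simplifications above can be sanity-checked mechanically. The sketch below is only a bounded test, not a proof: it assumes Python's `re` module, writes the union `+` as `|`, and compares two expressions on every string over {a, b} up to a chosen length. The helper names `all_strings` and `same_language_up_to` are made up for this illustration.

```python
import re
from itertools import product


def all_strings(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def same_language_up_to(regex1, regex2, alphabet="ab", max_len=7):
    """Return True if both patterns accept exactly the same strings
    among all strings of length <= max_len (a bounded check only)."""
    p1, p2 = re.compile(regex1), re.compile(regex2)
    return all(
        bool(p1.fullmatch(w)) == bool(p2.fullmatch(w))
        for w in all_strings(alphabet, max_len)
    )


# a(a+b)* + aa(a+b)* + aaa(a+b)* = a(a+b)*
print(same_language_up_to(r"a(a|b)*|aa(a|b)*|aaa(a|b)*", r"a(a|b)*"))

# aa(b*+a) + a(ab*+aa) = aa(b*+a)
print(same_language_up_to(r"aa(b*|a)|a(ab*|aa)", r"aa(b*|a)"))

# a+b and a*b are NOT equal: the string "a" separates them
print(same_language_up_to(r"a|b", r"a*b"))
```

A `True` result only says that no counterexample exists up to the length bound; the equalities themselves still need the algebraic arguments given in the text.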

Properties of Regular Expressions
1) ‘+’ properties of regular expressions
R+T = T+R, since L(R) ∪ L(T) = L(T) ∪ L(R)
R+∅ = ∅+R = R, since L(R) ∪ ∅ = L(R)
R+R = R, since L(R) ∪ L(R) = L(R)
(R+S)+T = R+(S+T), since (L(R) ∪ L(S)) ∪ L(T) = L(R) ∪ (L(S) ∪ L(T))

2) ‘·’ properties of regular expressions
R∅ = ∅R = ∅
Rε = εR = R
(RS)T = R(ST)

3) distributive properties of regular expressions
R(S+T) = RS+RT
(S+T)R = SR+TR

4) closure properties
∅* = ε* = ε
R* = R*R* = (R*)* = R+R*
R* = ε+R* = (ε+R*)* = (ε+R)R* = ε+RR*
R* = (R+…+R^k)* for any k ≥ 1
R* = ε+R+R^2+…+R^(k−1)+R^k R* for any k ≥ 1
R*R = RR*
(R+S)* = (R*+S*)* = (R*S*)* = (R*S)*R* = R*(SR*)*
R(SR)* = (RS)*R
(R*S)* = ε+(R+S)*S
(RS*)* = ε+R(R+S)*
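Several of the closure properties can be tested on small concrete languages. The following is a minimal sketch, assuming languages are represented as Python sets of strings and every operation is truncated to strings of length at most `N`. The helpers `concat` and `star`, the bound `N`, and the sample languages `R` and `S` are all choices made for this illustration.

```python
def concat(A, B, n):
    """Concatenation AB, keeping only strings of length <= n."""
    return {u + v for u in A for v in B if len(u) + len(v) <= n}


def star(A, n):
    """All strings of A* with length <= n."""
    result = {""}            # the empty string (epsilon) is always in A*
    frontier = {""}
    while frontier:
        # extend the newest strings by one more factor from A
        frontier = concat(frontier, A, n) - result
        result |= frontier
    return result


N = 6                        # length bound (arbitrary choice)
R = {"a", "ab"}              # sample finite languages, chosen arbitrarily
S = {"b", "ba"}
EPS = {""}

Rs, Ss, RSs = star(R, N), star(S, N), star(R | S, N)
checks = {
    "R* = R*R*": Rs == concat(Rs, Rs, N),
    "R*R = RR*": concat(Rs, R, N) == concat(R, Rs, N),
    "(R+S)* = (R*+S*)*": RSs == star(Rs | Ss, N),
    "(R+S)* = (R*S*)*": RSs == star(concat(Rs, Ss, N), N),
    "(R+S)* = (R*S)*R*": RSs == concat(star(concat(Rs, S, N), N), Rs, N),
    "(R+S)* = R*(SR*)*": RSs == concat(Rs, star(concat(S, Rs, N), N), N),
    "R(SR)* = (RS)*R": concat(R, star(concat(S, R, N), N), N)
                       == concat(star(concat(R, S, N), N), R, N),
    "(R*S)* = ε+(R+S)*S": star(concat(Rs, S, N), N) == EPS | concat(RSs, S, N),
    "(RS*)* = ε+R(R+S)*": star(concat(R, Ss, N), N) == EPS | concat(R, RSs, N),
}
for name, ok in checks.items():
    print(name, ok)
```

Truncating at each step is safe here because every factor of a string of length at most `N` also has length at most `N`, so each side computes exactly the strings of the true language up to that bound. As with any finite test, agreement supports the identities but does not prove them.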

Each of these properties can be proved.
For example, let's prove that R* = R*R*.
Proof. We need to prove two inclusions: i) R* ⊆ R*R* and ii) R*R* ⊆ R*.
i) To prove R* ⊆ R*R* it is sufficient to note that for any expression S (understand: for any set of strings described by the expression S) we have S ⊆ SR*, because ε ∈ R*.
ii) To prove R*R* ⊆ R*, take an arbitrary string w ∈ R*R* (1) and show that w ∈ R*.
(1) ⇒ w = uv, where u ∈ R* and v ∈ R* (2).
(2) ⇒ u ∈ R^n and v ∈ R^m for some integers n, m ≥ 0. Then w = uv ∈ R^(n+m) ⊆ R*.
So we have proved both subset relations, i.e. the two regular expressions are equal: R* = R*R*.

Let's prove one more property of regular expressions,
(R+S)* = (R*S)*R*.
Proof. We understand the equality of two expressions as the equality of the two sets of strings denoted by these expressions. So we are going to prove two subset relations: i) (R+S)* ⊆ (R*S)*R* and ii) (R*S)*R* ⊆ (R+S)*.
i) Take an arbitrary string w ∈ (R+S)* ⇒ w ∈ (R+S)^n for some integer n ≥ 0 ⇒ w = u1u2…un, where ui ∈ R+S for i = 1, 2, …, n.
ui ∈ R+S ⇒ ui ∈ R or ui ∈ S. Denote any substring ui ∈ R as ui = r and any substring ui ∈ S as ui = s. Then the string w is a sequence of substrings r and s, like
w = rrrssrsssrsrr = (rrrs)(s)(rs)(s)(s)(rs)(rr) = v1v2v3v4v5v6 t,
where each vj ∈ R*S and t ∈ R*. In general w = v1v2…vk t with vj ∈ R*S and t ∈ R*. So any such string w ∈ (R*S)*R*.
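The grouping step in part i) can be illustrated with a few lines of code. This is only a sketch of the bookkeeping: it assumes the string has already been cut into pieces labelled `r` (a piece from R) or `s` (a piece from S), and the function name `decompose` is invented here.

```python
def decompose(labels):
    """Split a sequence of 'r'/'s' labels into groups v1..vk, each of the
    form r...rs (an element of R*S), plus a trailing run t of r's (in R*)."""
    groups, current = [], ""
    for label in labels:
        if label not in ("r", "s"):
            raise ValueError(f"unexpected label: {label!r}")
        current += label
        if label == "s":        # an s closes the current group
            groups.append(current)
            current = ""
    return groups, current      # `current` holds the leftover r's


groups, tail = decompose("rrrssrsssrsrr")
print(groups)   # ['rrrs', 's', 'rs', 's', 's', 'rs']
print(tail)     # rr
```

Every group ends with exactly one `s` and the tail contains no `s`, which is the shape (R*S)*R* required by the proof.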

In the same way we may prove that ii) (R*S)*R* ⊆ (R+S)* (left as an exercise).

Example. Using properties of regular expressions, prove the equality ba*(baa*)* = b(a + ba)*.
We can prove the equality by using the property (R+S)* = R*(SR*)*. Take R = a, S = ba; then
(a+ba)* = (R+S)* = R*(SR*)* = a*(baa*)*.
Concatenating b on the left of both sides gives b(a+ba)* = ba*(baa*)*.

We can also establish simple rules that can be used in proofs by ‘double inclusion’. Let A, B and C be sets of strings; then
1) A ⊆ B ⇒ AC ⊆ BC (and likewise CA ⊆ CB)
2) A ⊆ B ⇒ A* ⊆ B*
3) ε ∈ B ⇒ A ⊆ AB and A ⊆ BA

Then we can prove a(a+b)* + aa(a+b)* + aaa(a+b)* = a(a+b)* by proving aa(a+b)* ⊆ a(a+b)* and aaa(a+b)* ⊆ a(a+b)*:
a ⊆ a+b ⊆ (a+b)*
⇒ a(a+b)* ⊆ (a+b)(a+b)* ⊆ (a+b)*(a+b)* = (a+b)*
⇒ aa(a+b)* ⊆ a(a+b)*
⇒ aaa(a+b)* ⊆ aa(a+b)* ⊆ a(a+b)*

Example. Prove that (ε + a + b*a)*b* = (a + b)*.
We can show two subset relations: i) (ε + a + b*a)*b* ⊆ (a + b)* and ii) (a + b)* ⊆ (ε + a + b*a)*b*.

i) (ε + a + b*a)*b* ⊆ (a + b)*
b ⊆ a+b ⇒ b* ⊆ (a+b)*
ε ∈ (a+b)*
a ⊆ a+b ⊆ (a+b)*
b*a ⊆ (a+b)*(a+b)* = (a+b)*
⇒ (ε + a + b*a) ⊆ (a+b)*
⇒ (ε + a + b*a)* ⊆ ((a+b)*)* = (a+b)*
Finally, (ε + a + b*a)*b* ⊆ (a+b)*(a+b)* = (a + b)*.

ii) (a + b)* ⊆ (ε + a + b*a)*b*
(a + b)* = (b*a)*b* by the rule (R+S)* = (R*S)*R* with R = b, S = a,
and (b*a)*b* ⊆ (ε + a + b*a)*b* because b*a ⊆ (ε + a + b*a).

Deterministic Finite Automata (DFA)
A DFA is a recognizer for a regular language. DFAs model the behavior of real computing devices that are designed to distinguish correct input over a given alphabet. For a language L ⊆ Σ*, the recognizer takes an input string w ∈ Σ* and answers Accept (w ∈ L) or Reject (w ∉ L). This abstract machine reads an input string one symbol at a time and decides whether the string belongs to the language or not (accept or reject).

A DFA includes:
an alphabet Σ
a finite nonempty set of “states”
a transition function defined for each state and each symbol
a start state
accepting states
The DFA can be depicted as a directed graph, where vertices represent states and each edge is labeled by an input symbol and dictates how the machine changes its state on reading this symbol.

Example. Construct a DFA to recognize the regular language over the alphabet {a, b} described by the regular expression ab*a. So we need to find a DFA that is able to distinguish between strings that belong to L(ab*a) and strings that do not.
Transition function (q2 is the “sink state”):
δ(q0, a) = q1, δ(q0, b) = q2
δ(q1, a) = q3, δ(q1, b) = q1
δ(q2, a) = q2, δ(q2, b) = q2
δ(q3, a) = q2, δ(q3, b) = q2

The DFA for L(ab*a) consists of:
Alphabet Σ = {a, b}
Set of states Q = {q0, q1, q2, q3}, including q0 (the start state), q3 (the accepting state) and q2 (the “sink state”)
Transition function δ(qi, ak) that assigns the next state on reading any ak ∈ Σ, for each qi ∈ Q
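The components listed above translate directly into a small program. The following is a minimal sketch, assuming the transition function is stored as a Python dictionary keyed by (state, symbol); the names `DELTA`, `START`, `ACCEPTING` and `accepts` are chosen for this illustration.

```python
# Transition function delta of the DFA for ab*a,
# written as a dictionary keyed by (state, symbol).
DELTA = {
    ("q0", "a"): "q1", ("q0", "b"): "q2",
    ("q1", "a"): "q3", ("q1", "b"): "q1",
    ("q2", "a"): "q2", ("q2", "b"): "q2",   # q2 is the sink state
    ("q3", "a"): "q2", ("q3", "b"): "q2",
}
START = "q0"
ACCEPTING = {"q3"}


def accepts(word):
    """Run the DFA on `word` and report whether it ends in an accepting state.
    A symbol outside {a, b} raises KeyError, since delta is undefined for it."""
    state = START
    for symbol in word:
        state = DELTA[(state, symbol)]
    return state in ACCEPTING


for w in ["aa", "abba", "abbaa", "", "b"]:
    print(repr(w), accepts(w))
```

Here `aa` and `abba` are accepted, while `abbaa`, the empty string and `b` are rejected, in agreement with the expression ab*a.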

Assume w = abbaa enters the DFA. While reading the input, the DFA goes through a sequence of configurations:
start: state q0
read a: δ(q0, a) = q1
read b: δ(q1, b) = q1
read b: δ(q1, b) = q1
read a: δ(q1, a) = q3
read a: δ(q3, a) = q2

A configuration is a pair of a state and the remaining input, (qi, w):
(q0, abbaa) ⊢ (q1, bbaa) ⊢ (q1, baa) ⊢ (q1, aa) ⊢ (q3, a) ⊢ (q2, ε)
A string is accepted by a DFA if and only if on reading this string the DFA comes to a configuration (qa, ε), where qa is an accepting state. Here the final state q2 is not accepting, so abbaa is rejected.
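The sequence of configurations can be generated step by step. The sketch below assumes the same dictionary encoding of δ for the ab*a example; the function names `configurations` and `show` are hypothetical helpers introduced here.

```python
DELTA = {
    ("q0", "a"): "q1", ("q0", "b"): "q2",
    ("q1", "a"): "q3", ("q1", "b"): "q1",
    ("q2", "a"): "q2", ("q2", "b"): "q2",
    ("q3", "a"): "q2", ("q3", "b"): "q2",
}


def configurations(word, delta, start):
    """Return the list of (state, remaining input) pairs visited on `word`."""
    state, remaining = start, word
    trace = [(state, remaining)]
    while remaining:
        state = delta[(state, remaining[0])]   # consume one symbol
        remaining = remaining[1:]
        trace.append((state, remaining))
    return trace


def show(trace):
    """Format a trace in the (q, w) ⊢ (q', w') notation; ε marks empty input."""
    return " ⊢ ".join(f"({q}, {rest or 'ε'})" for q, rest in trace)


print(show(configurations("abbaa", DELTA, "q0")))
# (q0, abbaa) ⊢ (q1, bbaa) ⊢ (q1, baa) ⊢ (q1, aa) ⊢ (q3, a) ⊢ (q2, ε)
```

The string is accepted exactly when the last configuration has empty remaining input and an accepting state; for abbaa the last state is q2, so it is rejected.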

The string is accepted (recognized to be in the language) if the DFA comes to an accepting state after reading the input string.
Instead of using the transition function δ(qi, ak) we can give the equivalent transition table:

      a    b
q0    q1   q2
q1    q3   q1
q2    q2   q2
q3    q2   q2

Inductive proofs on strings.
Usually induction is done on the length of a string, |w| = n, or on the number of repetitions of some pattern.

Prove that the regular expression R = (ab+b)*(ε+a) describes the language L ⊆ {a, b}* consisting of all strings that do not contain aa.

Proof. To prove the equality of two sets of strings, L and R, we can prove two subset relations, R ⊆ L and L ⊆ R.

i) R ⊆ L: we need to prove that for any string w, w ∈ R ⇒ w ∈ L. Assume w ∈ R = (ab+b)*(ε+a); then w ∈ (ab+b)^n(ε+a) for some n ≥ 0. We prove by induction on n ≥ 0 that w ∈ (ab+b)^n(ε+a) ⇒ w ∈ L.
Basis. n = 0: w ∈ (ε+a), so either w = ε or w = a. In both cases w ∈ L, because it does not contain aa.
IH. Assume that for n = k, k ≥ 0, any string s ∈ (ab+b)^k(ε+a) belongs to L.
IS. We need to prove that any string w ∈ (ab+b)^(k+1)(ε+a) belongs to L. w ∈ (ab+b)^(k+1)(ε+a) ⇒ w ∈ (ab+b)s, where s ∈ (ab+b)^k(ε+a), so either w = abs or w = bs. In both cases w does not contain aa: s does not contain aa by IH, and the prefix ab or b ends in b, so no aa can appear where it joins s.

ii) Take any w ∈ L and prove that w ∈ R = (ab+b)*(ε+a).
Let's prove it by induction on the length |w| = n ≥ 0.
Basis. n = 0: w = ε, and ε ∈ R = (ab+b)*(ε+a).
IH. Assume that for n = k, k ≥ 0, any string v ∈ L with length |v| ≤ k belongs to R.
IS. We need to prove that any string from L with length k+1 belongs to R. Take w ∈ L, |w| = k+1. We can consider two cases: 1) w = as or 2) w = bs.
In the first case, either s = ε, so that w = a ∈ (ε+a) ⊆ R, or w ∈ L ⇒ s = bu, where u ∈ L, and by IH u ∈ R, since |u| = k−1 < k, i.e. u ∈ (ab+b)*(ε+a). Then w = abu ∈ ab(ab+b)*(ε+a) ⊆ (ab+b)*(ε+a).
In the second case, w = bs, where s ∈ L and |s| = k, so s ∈ R = (ab+b)*(ε+a) by IH. Then w ∈ b(ab+b)*(ε+a) ⊆ (ab+b)*(ε+a).
