Presentation is loading. Please wait.

Presentation is loading. Please wait.

Regular Expressions Hopcroft, Motawi, Ullman, Chap 3.

Similar presentations


Presentation on theme: "Regular Expressions Hopcroft, Motawi, Ullman, Chap 3."— Presentation transcript:

1 Regular Expressions Hopcroft, Motawi, Ullman, Chap 3

2 Machines versus expressions Finite automata are machine-like descriptions of languages Alternative: declarative description Specifying a language using expressions and operations Example: 01* + 10* defines the language containing strings such as 01111, 100, 0, 1000000; * and + are operators in this “algebra”

3 Regular expressions defined The simplest regular expressions are The empty string  A symbol a from an alphabet Given that R 1 and R 2 are regular expressions, regular expressions are built from the following operations Union: R 1 +R 2 Concatenation: R 1 R 2 Closure: R 1 * Parentheses (to enforce precedence): (R 1 ) Nothing else is a regular expression unless it is built from the above rules

4 Examples 1 +  (ab)* + (ba)* (0+1+2+3+4+5+6+7+8+9)*(0+5) (x+y)*x(x+y)* (01)* 0(1*) 01* equivalent to 0(1*)

5 Equivalence between regular expressions and finite automata Strategy: Convert regular expression to an  -NFA Convert a DFA to a regular expression

6 Regular Expression to  -NFA Recursive construction Base cases follow base case definitions of regular expressions  :  -NFA that accepts the empty string – a single state that is the start and end state a:  -NFA that accepts {a} – two-state machine (start and final state) with an a-transition Note: technically, we also need an  -NFA for the empty language {} – easy

7 Regular Expression to  -NFA Recursive step: build  -NFA from smaller  - NFAs that correspond to the operand regular expressions To simplify construction, we may ensure the following characteristics for the automata we build Only one final state, with no outgoing transitions No transitions into the start state Note: the base cases satisfy these characteristics

8 Regular Expression to  -NFA Suppose  -NFA 1 and  -NFA 2 are the automata for R 1 and R 2 Three operations to worry about: union R 1 + R 2, concatenation R 1 R2), closure R 1 * With  -transitions, construction is straightforward Union: create a new start state, with  -transitions into the start states of  -NFA 1 and  -NFA 2 ; create a new final state, with  -transitions from the two final states of  -NFA 1 and  - NFA 2 Concatenation:  -transition from final state of  -NFA 1 to the start state of  -NFA 2 Closure: closure can be supported by an  -transition from final to start state; need a few more  -transitions (why?)

9 DFA to Regular Expression More difficult construction Build the regular expression “bottom up” starting with simpler strings that are acceptable using a subset of states in the DFA Define R k i,j as the expression for strings that have an admissible state sequence from state i to state j with no intermediate states greater than k Assume no states are numbered 0, but k can be 0

10 R 0 i,j Observe that R 0 i,j describes strings of length 1 or 0, particularly: {a 1, a 2, a 3, … }, where, for each a x,  (i,a x ) = j Add  to the set if i = j The 0 in R 0 i,j means no intermediate states are allowed, so either no transition is made (just stay in state i to accept  if i = j) or make a single transition from state i to state j These are the base cases in our construction

11 R k i,j Recursive step: for each k, we can build R k i,j as follows: R k i,j = R k-1 i,j + R k-1 i,k (R k-1 k,k )* R k-1 k,j Intuition: since the accepting sequence contains one or more visits to state k, break the path into pieces that first goes from i to its first k-visit (R k-1 i,k ) followed by zero or more revisits to k (R k-1 k,k ) followed by a path from k to j (R k-1 k,j )

12 And finally… We get the regular expression(s) that represent all strings with admissible sequences that start with the initial state (state 1) and end with a final state Resulting regular expression built from the DFA: the union of all R n 1,f where f is a final state Note: n is the number of states in the DFA meaning there are no more restrictions for intermediate states in the accepting sequence


Download ppt "Regular Expressions Hopcroft, Motawi, Ullman, Chap 3."

Similar presentations


Ads by Google