Chapter 2 Scanning From Regular Expression to DFA Gang S.Liu College of Computer Science & Technology Harbin Engineering University.


1 Chapter 2 Scanning: From Regular Expression to DFA. Gang S. Liu, College of Computer Science & Technology, Harbin Engineering University (compilerSamuel2005@126.com)

2 From Regular Expression to DFA: Regular expression → NFA → DFA → Program

3 From a Regular Expression to NFA: The construction we will describe is known as Thompson’s construction. It uses ε-transitions to “glue together” the machines of each piece of a regular expression.

4 Basic Regular Expression [Figure: NFAs for the basic regular expressions a, ε, and Φ.]

5 Concatenation [Figure: the NFA for a regular expression r and the NFA for a regular expression s, connected by an ε-transition to form the NFA for the regular expression rs.] Clearly, this machine accepts L(rs) = L(r)L(s) and corresponds to the regular expression rs.

6 Choice among Alternatives [Figure: the NFAs for r and s placed side by side, joined by four ε-transitions to a new start state and a new accepting state.] We added a new start state and a new accepting state using ε-transitions. This machine accepts L(r | s) = L(r) ∪ L(s).

7 Repetition [Figure: the NFA for r with a new start state, a new accepting state, and four ε-transitions.] This machine corresponds to r*.

8 Example 2.12: Translate the regular expression ab|a into an NFA. [Figure: the NFAs for a and b, the NFA for ab, and the resulting NFA for ab|a, glued together with ε-transitions.]
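The constructions on the preceding slides can be sketched in Python as follows. This is only an illustrative sketch, not the lecture's code: the `NFA` class, the function names `symbol`, `concat`, `choice`, `star`, and the use of `None` as the ε label are all assumptions made for this example. The small `accepts` helper is added only so the result can be checked.

```python
from itertools import count

EPS = None          # label used for ε-transitions (assumption of this sketch)
_ids = count(1)     # source of fresh state numbers


class NFA:
    """NFA fragment with exactly one start state and one accepting state."""

    def __init__(self, start, accept, edges):
        self.start = start
        self.accept = accept
        self.edges = edges  # list of (source, label, target) triples


def symbol(a):
    """Basic regular expression a: start --a--> accept."""
    s, t = next(_ids), next(_ids)
    return NFA(s, t, [(s, a, t)])


def concat(r, s):
    """rs: glue the accepting state of r to the start state of s with ε."""
    return NFA(r.start, s.accept, r.edges + s.edges + [(r.accept, EPS, s.start)])


def choice(r, s):
    """r|s: new start and accepting states joined by four ε-transitions."""
    start, accept = next(_ids), next(_ids)
    glue = [(start, EPS, r.start), (start, EPS, s.start),
            (r.accept, EPS, accept), (s.accept, EPS, accept)]
    return NFA(start, accept, r.edges + s.edges + glue)


def star(r):
    """r*: new start and accepting states, a bypass edge, and a repeat edge."""
    start, accept = next(_ids), next(_ids)
    glue = [(start, EPS, r.start), (r.accept, EPS, accept),
            (start, EPS, accept), (r.accept, EPS, r.start)]
    return NFA(start, accept, r.edges + glue)


def closure(nfa, states):
    """All states reachable from `states` by zero or more ε-transitions."""
    seen, stack = set(states), list(states)
    while stack:
        u = stack.pop()
        for p, label, q in nfa.edges:
            if p == u and label is EPS and q not in seen:
                seen.add(q)
                stack.append(q)
    return seen


def accepts(nfa, text):
    """Simulate the NFA on `text` by tracking the set of current states."""
    current = closure(nfa, {nfa.start})
    for ch in text:
        moved = {q for p, label, q in nfa.edges if p in current and label == ch}
        current = closure(nfa, moved)
    return nfa.accept in current


# Example 2.12: the regular expression ab|a
ab_or_a = choice(concat(symbol("a"), symbol("b")), symbol("a"))
print(accepts(ab_or_a, "ab"), accepts(ab_or_a, "a"), accepts(ab_or_a, "b"))
```

Built this way, the machine for ab|a has eight states: two for each of the three basic machines plus the new start and accepting states added by the choice.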

9 Example 2.13: letter(letter|digit)* [Figure: the NFA for letter(letter|digit)*, built from the machines for letter and digit and glued together with ε-transitions.]

10 From Regular Expression to DFA: Regular expression → NFA → DFA → Program

11 From NFA to DFA: We need some method for eliminating ε-transitions and multiple transitions from a state on a single input character. Eliminating ε-transitions involves the construction of ε-closures. Eliminating multiple transitions involves keeping track of sets of states rather than single states.

12 ε-Closure: The ε-closure of a single state s is the set of states reachable from s by zero or more ε-transitions. We denote this set by s̄ (s with an overbar). The ε-closure of a state always contains the state itself.

13 Example 2.14: a* [Figure: the NFA for a* with states 1, 2, 3, 4, one a-transition, and four ε-transitions.] 1̄ = {1, 2, 4}, 2̄ = {2}, 3̄ = {2, 3, 4}, 4̄ = {4}

14 ε-Closure of Set of States: The ε-closure of a set of states is defined as the union of the ε-closures of each individual state. If S = {s_1, s_2, …, s_n} is a set of states, then S̄ = s̄_1 ∪ s̄_2 ∪ … ∪ s̄_n. In the previous example we had 1̄ = {1, 2, 4} and 3̄ = {2, 3, 4}. Let S = {1, 3}. Then S̄ = 1̄ ∪ 3̄ = {1, 2, 4} ∪ {2, 3, 4} = {1, 2, 3, 4}.
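As a rough illustration, both definitions can be computed with a simple graph search. The sketch below is in Python; the names `EPS_EDGES`, `eps_closure`, and `eps_closure_of_set` are made up for this example, and the ε-edges of the a* machine are an assumption inferred from the closures listed in Example 2.14.

```python
# ε-transitions of the NFA for a* (states 1 to 4), inferred from Example 2.14.
EPS_EDGES = {1: {2, 4}, 2: set(), 3: {2, 4}, 4: set()}


def eps_closure(state, eps_edges=EPS_EDGES):
    """States reachable from `state` by zero or more ε-transitions."""
    closure = {state}            # zero transitions: the state itself
    stack = [state]
    while stack:
        s = stack.pop()
        for t in eps_edges.get(s, ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure


def eps_closure_of_set(states, eps_edges=EPS_EDGES):
    """Union of the ε-closures of the individual states."""
    result = set()
    for s in states:
        result |= eps_closure(s, eps_edges)
    return result


for s in (1, 2, 3, 4):
    print(s, sorted(eps_closure(s)))
print(sorted(eps_closure_of_set({1, 3})))
```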

15 The Subset Construction
Given an NFA M, we need to construct a corresponding DFA M′.
1. Compute the ε-closure of the start state of M; this becomes the start state of M′.
2. For this set and for each subsequent set, compute the transitions on each character a as follows:
   1. Given a set of states S and a character a, compute the set of states S′_a = {t | for some s in S there is a transition from s to t on a}.
   2. Compute S̄′_a, the ε-closure of S′_a. This becomes a new state. There is a transition from S to S̄′_a on the character a.
3. Continue with this process until no new states or transitions are created.
4. Mark as accepting those constructed states that contain an accepting state of M.

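16 Example 2.15 [Figure: the NFA for a* with states 1 to 4, and the resulting DFA with states {1, 2, 4} and {2, 3, 4}, each with a transition on a to {2, 3, 4}.] 1̄ = {1, 2, 4}: start state. {1, 2, 4}_a = {3}, whose ε-closure is {2, 3, 4}: new state and transition. {2, 3, 4}_a = {3}, whose ε-closure is {2, 3, 4}: no new state, new transition.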
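The subset construction can be sketched in Python roughly as follows. The function names and the dictionary encoding of the NFA (`eps` for ε-transitions, `delta` for character transitions) are assumptions of this sketch, as is the exact shape of the a* machine, which is inferred from the closures in Example 2.14. Sets with no outgoing transition on a character are simply skipped, which corresponds to an implicit error state.

```python
from collections import deque


def eps_closure(states, eps):
    """ε-closure of a set of NFA states; eps maps state -> set of states."""
    seen, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for t in eps.get(s, ()):
            if t not in seen:
                seen.add(t)
                stack.append(t)
    return seen


def subset_construction(start, accepting, eps, delta, alphabet):
    """Build a DFA whose states are frozensets of NFA states.

    delta maps (state, char) -> set of states reached on that character.
    Returns (start_state, states, transitions, accepting_states).
    """
    start_set = frozenset(eps_closure({start}, eps))      # step 1
    states = {start_set}
    transitions = {}
    work = deque([start_set])
    while work:                                           # step 3
        S = work.popleft()
        for a in alphabet:                                # step 2
            moved = set()
            for s in S:
                moved |= delta.get((s, a), set())         # S'_a
            if not moved:
                continue                                  # implicit error state
            T = frozenset(eps_closure(moved, eps))        # closure of S'_a
            transitions[(S, a)] = T
            if T not in states:
                states.add(T)
                work.append(T)
    dfa_accepting = {S for S in states if S & accepting}  # step 4
    return start_set, states, transitions, dfa_accepting


# NFA for a* (Example 2.15): state 1 is the start state, state 4 accepts.
eps = {1: {2, 4}, 3: {2, 4}}
delta = {(2, "a"): {3}}
start, states, trans, acc = subset_construction(1, {4}, eps, delta, "a")
for (S, a), T in trans.items():
    print(sorted(S), a, sorted(T))
```

Using frozensets as DFA states keeps the correspondence with the slides visible: each DFA state is literally the set of NFA states it stands for.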

17 Example 2.16 [Figure: the NFA for ab|a with states 1 to 8, and the DFA obtained by the subset construction, with states {1, 2, 6}, {3, 4, 7, 8}, and {5, 8} and transitions on a and b.]

18 Example 2.17 [Figure: the NFA for letter(letter|digit)* with states 1 to 10, and the DFA obtained by the subset construction, with states {1}, {2, 3, 4, 5, 7, 10}, {4, 5, 6, 7, 9, 10}, and {4, 5, 7, 8, 9, 10} and transitions on letter and digit.]

19 From Regular Expression to DFA: Regular expression → NFA → DFA → Program

20 Minimizing Number of States: The resulting DFA may be more complex than necessary. In Example 2.15 we got a two-state DFA for a*, but there is a simpler DFA. Important result from automata theory: for any given DFA there is an equivalent DFA containing a minimum number of states, and it is unique (up to renaming of states). [Figure: the DFA with states {1, 2, 4} and {2, 3, 4} from Example 2.15, next to a one-state DFA with an a-transition to itself.]

21 Minimizing Number of States: Algorithm
1. Create two sets of states: all accepting states and all non-accepting states.
2. For each set of states, consider the transitions on each character a of the alphabet. If all states in the set have transitions on a that land in the same set of states, then this defines an a-transition from the set to that set. If there are two states s and t in the set that have transitions on a that land in different sets, we must split the set according to where the a-transitions land.
3. Repeat step 2 until either all sets contain one element or no further splitting occurs.
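A minimal Python sketch of this partition-refinement idea follows. It is not taken from the lecture: the names `block_index` and `minimize` and the dictionary encoding of the DFA are assumptions, and it deviates slightly from the steps above by splitting each set on all characters at once rather than one character at a time, which reaches the same final partition. A missing transition is treated as a transition to the error state.

```python
def block_index(partition, state):
    """Index of the set containing `state`, or None for the error state."""
    for i, block in enumerate(partition):
        if state in block:
            return i
    return None


def minimize(states, alphabet, delta, accepting):
    """Return the sets of equivalent states of a DFA.

    delta maps (state, char) -> state; a missing entry means the error state.
    """
    # Step 1: accepting states and non-accepting states (drop an empty set).
    initial = (set(accepting), set(states) - set(accepting))
    partition = [block for block in initial if block]
    while True:
        refined = []
        for block in partition:
            # Step 2: group states by where each character's transition lands.
            groups = {}
            for s in block:
                key = tuple(block_index(partition, delta.get((s, a)))
                            for a in alphabet)
                groups.setdefault(key, set()).add(s)
            refined.extend(groups.values())
        # Step 3: stop when no set was split in this pass.
        if len(refined) == len(partition):
            return refined
        partition = refined


# The two-state DFA for a* from Example 2.15, with states renamed A and B.
delta = {("A", "a"): "B", ("B", "a"): "B"}
print(minimize({"A", "B"}, "a", delta, {"A", "B"}))
```

Each returned set becomes one state of the minimized DFA; for the a* machine both states end up in a single set, which matches the one-state DFA mentioned on the previous slide.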

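22 Example 2.18 [Figure: the DFA for letter(letter|digit)* from Example 2.17, with states {1}, {2, 3, 4, 5, 7, 10}, {4, 5, 6, 7, 9, 10}, and {4, 5, 7, 8, 9, 10} and transitions on letter and digit, together with its minimized form.]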

23 Example 2.19: (a|ε)b* [Figure: the DFA with states 1, 2, 3 and the minimized DFA with states 1 and {2, 3}.] All states are accepting states. Each accepting state has a b-transition to an accepting state, so b does not distinguish them. a distinguishes state 1 from states 2 and 3: from 2 and 3 there is an a-transition to the error state.

24 DFA for Special Symbols: All special symbols except assignment are distinct single characters. If we use a variable to indicate the token type, all accepting states can be collapsed into one state DONE. [Figure: one transition per special symbol: + returns PLUS, - returns MINUS, …, ; returns SEMI.]

25 Adding Numbers and Identifiers [Figure: DFA with states START, INNUM, INID, and DONE. Transitions are labeled digit, letter, and [other]; the special symbols + - * / = < ( ) ; lead from START to DONE.]

26 Adding White Space, Comments, and Assignment [Figure: the DFA of the previous slide extended with states INASSIGN and INCOMMENT. New transitions are labeled white space, {, }, :, =, other, and [other].]
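One way such a DFA might be turned into a program is a loop over an explicit state variable, sketched below in Python. Only the state names and the token names PLUS, MINUS, and SEMI come from the slides; every other token name, the `scan` function, and the treatment of end of input are assumptions of this sketch. Bracketed labels such as [other] are read here as lookahead characters that are not consumed.

```python
# Token names other than PLUS, MINUS, SEMI are assumed for illustration.
SINGLE = {"+": "PLUS", "-": "MINUS", "*": "TIMES", "/": "OVER", "=": "EQ",
          "<": "LT", "(": "LPAREN", ")": "RPAREN", ";": "SEMI"}


def scan(text):
    """Return a list of (token kind, lexeme) pairs, ending with ("EOF", "")."""
    tokens, i = [], 0
    while True:
        state, lexeme, kind = "START", "", None
        while state != "DONE":
            c = text[i] if i < len(text) else ""   # "" marks end of input
            save = True      # does c become part of the lexeme?
            advance = True   # is c consumed?
            if state == "START":
                if c == "":
                    kind, state, save, advance = "EOF", "DONE", False, False
                elif c.isspace():
                    save = False                   # skip white space
                elif c.isdigit():
                    state = "INNUM"
                elif c.isalpha():
                    state = "INID"
                elif c == ":":
                    state = "INASSIGN"
                elif c == "{":
                    state, save = "INCOMMENT", False
                elif c in SINGLE:
                    kind, state = SINGLE[c], "DONE"
                else:
                    kind, state = "ERROR", "DONE"
            elif state == "INCOMMENT":
                save = False                       # comments produce no token
                if c == "":
                    kind, state, advance = "EOF", "DONE", False
                elif c == "}":
                    state = "START"
            elif state == "INASSIGN":
                if c == "=":
                    kind, state = "ASSIGN", "DONE"
                else:                              # [other]: do not consume
                    kind, state, save, advance = "ERROR", "DONE", False, False
            elif state == "INNUM":
                if not c.isdigit():                # [other]: do not consume
                    kind, state, save, advance = "NUM", "DONE", False, False
            elif state == "INID":
                if not c.isalpha():                # [other]: do not consume
                    kind, state, save, advance = "ID", "DONE", False, False
            if save:
                lexeme += c
            if advance:
                i += 1
        tokens.append((kind, lexeme))
        if kind == "EOF":
            return tokens


print(scan("x := 12 + y; {done}"))
```

Keeping a single DONE state and recording the token type in the `kind` variable mirrors the idea from the special-symbols slide of collapsing all accepting states into one.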

27 Homework 2.12: a. Use Thompson’s construction to convert the regular expression (a | b)* a (a | b | ε) into an NFA. b. Convert the NFA of part (a) into a DFA using the subset construction.

