Presentation is loading. Please wait.

Presentation is loading. Please wait.

Lexical Analysis — Part II From Regular Expression to Scanner Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled.

Similar presentations


Presentation on theme: "Lexical Analysis — Part II From Regular Expression to Scanner Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled."— Presentation transcript:

1 Lexical Analysis — Part II From Regular Expression to Scanner Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit permission to make copies of these materials for their personal use. Faculty from other educational institutions may use these materials for nonprofit educational purposes, provided this copyright notice is preserved. COMP 412 FALL 2010

2 Comp 412, Fall 20101 Quick Review Last class: —The scanner is the first stage in the front end —Specifications can be expressed using regular expressions —Build tables and code from a DFA Scanner Generator specifications Scanner source codeparts of speech & words Specifications written as “regular expressions” Represent words as indices into a global table tables or code design time compile time

3 Comp 412, Fall 20102 Quick Review of Regular Expressions All strings of 1s and 0s ending in a 1 All strings over lowercase letters where the vowels (a,e,i,o, & u) occur exactly once, in ascending order All strings of 1s and 0s that do not contain three 0s in a row: ( 1 * (  |01 | 001 ) 1 * ) * (  | 0 | 00 ) ( 0 | 1 ) * 1 Let Cons be (b|c|d|f|g|h|j|k|l|m|n|p|q|r|s|t|v|w|x|y|z) Cons * a Cons * e Cons * i Cons * o Cons * u Cons *

4 Comp 412, Fall 20103 Consider the problem of recognizing ILOC register names Register  r (0|1|2| … | 9) (0|1|2| … | 9) * Allows registers of arbitrary number Requires at least one digit RE corresponds to a recognizer (or DFA ) Transitions on other inputs go to an error state, s e Example ( from Lab 1 ) S0S0 S2S2 S1S1 r (0|1|2| … 9) accepting state (0|1|2| … 9) Recognizer for Register

5 Comp 412, Fall 20104 DFA operation Start in state S 0 & make transitions on each input character DFA accepts a word x iff x leaves it in a final state (S 2 ) So, r17 takes it through s 0, s 1, s 2 and accepts r takes it through s 0, s 1 and fails a takes it straight to s e Example ( continued ) S0S0 S2S2 S1S1 r (0|1|2| … 9) accepting state (0|1|2| … 9) Recognizer for Register

6 Comp 412, Fall 20105 Example ( continued ) To be useful, the recognizer must be converted into code  r 0,1,2,3,4, 5,6,7,8,9 All others s0s0 s1s1 sese sese s1s1 sese s2s2 sese s2s2 sese s2s2 sese sese sese sese sese Char  next character State  s 0 while (Char  EOF ) State   (State,Char) Char  next character if (State is a final state ) then report success else report failure Skeleton recognizer Table encoding the RE O(1) cost per character (or per transition) S0S0 S2S2 S1S1 r (0|1|2| … 9)

7 Comp 412, Fall 20106 Example ( continued ) We can add “actions” to each transition  r 0,1,2,3,4, 5,6,7,8,9 All others s0s0 s 1 start s e error s e error s1s1 s e error s 2 add s e error s2s2 s e error s 2 add s e error sese s e error s e error s e error Char  next character State  s 0 while (Char  EOF ) Next   (State,Char) Act   (State,Char) perform action Act State  Next Char  next character if (State is a final state ) then report success else report failure Skeleton recognizerTable encoding RE Typical action is to capture the lexeme

8 Comp 412, Fall 20107 r Digit Digit * allows arbitrary numbers Accepts r00000 Accepts r99999 What if we want to limit it to r0 through r31 ? Write a tighter regular expression —Register  r ( (0|1|2) (Digit |  ) | (4|5|6|7|8|9) | (3|30|31) ) —Register  r0|r1|r2| … |r31|r00|r01|r02| … |r09 Produces a more complex DFA DFA has more states DFA has same cost per transition (or per character) DFA has same basic implementation What if we need a tighter specification?

9 Comp 412, Fall 20108 Tighter register specification (continued) The DFA for Register  r ( (0|1|2) (Digit |  ) | (4|5|6|7|8|9) | (3|30|31) ) Accepts a more constrained set of register names Same set of actions, more states S0S0 S5S5 S1S1 r S4S4 S3S3 S6S6 S2S2 0,1,20,1,2 3 0,10,1 4,5,6,7,8,94,5,6,7,8,9 (0|1|2| … 9)

10 Comp 412, Fall 20109 Tighter register specification (continued)  r0,1234-9 All others s0s0 s1s1 sese sese sese sese sese s1s1 sese s2s2 s2s2 s5s5 s4s4 sese s2s2 sese s3s3 s3s3 s3s3 s3s3 sese s3s3 sese sese sese sese sese sese s4s4 sese sese sese sese sese sese s5s5 sese s6s6 sese sese sese sese s6s6 sese sese sese sese sese sese sese sese sese sese sese sese sese Table encoding RE for the tighter register specification This table runs in the same skeleton recognizer This table uses the same O(1) time per character The extra precision costs us table space, not time S0S0 S5S5 S1S1 r S4S4 S3S3 S6S6 S2S2 0,1,20,1,2 3 0,10,1 4,5,6,7,8,94,5,6,7,8,9 (0|1|2| … 9)

11 Comp 412, Fall 201010 Tighter register specification (continued) State Action r0,123 4,5,6 7,8,9 other 0 1 start eeeee 1e 2 add 2 add 5 add 4 add e 2e 3 add 3 add 3 add 3 add e exit 3,4eeeee e exit 5e 6 add eee e exit 6eeeee x exit eeeeeee S0S0 S5S5 S1S1 r S4S4 S3S3 S6S6 S2S2 0,1,20,1,2 3 0,10,1 4,5,6,7,8,94,5,6,7,8,9 (0|1|2| … 9) We care about path lengths (time) and finite size of set of states (representability), but we don’t worry (much) about number of states.

12 Comp 412, Fall 201011 Where are we going? We will show how to construct a finite state automaton to recognize any RE Overview: —Direct construction of a nondeterministic finite automaton (NFA) to recognize a given RE  Easy to build in an algorithmic way  Requires  -transitions to combine regular subexpressions —Construct a deterministic finite automaton (DFA) to simulate the NFA  Use a set-of-states construction —Minimize the number of states in the DFA  Hopcroft state minimization algorithm —Generate the scanner code  Additional specifications needed for the actions Introduce NFA s Optional, but worthwhile

13 Comp 412, Fall 201012 Non-deterministic Finite Automata What about an RE such as ( a | b ) * abb ? Each RE corresponds to a deterministic finite automaton ( DFA ) We know a DFA exists for each RE The DFA may be hard to build directly Automatic techniques will build it for us … S0S0 S1S1 S3S3 S2S2 b b b b a a a a Nothing here that would change the O(1) cost per transition

14 Comp 412, Fall 201013 Non-deterministic Finite Automata Here is a simpler RE for ( a | b ) * abb This recognizer is more intuitive Structure seems to follow the RE ’s structure This recognizer is not a DFA S 0 has a transition on  S 1 has two transitions on a This is a non-deterministic finite automaton ( NFA ) a | b S0S0 S1S1 S4S4 S2S2 S3S3  abb This NFA needs one more transition, at O(1) cost per transition

15 Comp 412, Fall 201014 Non-deterministic Finite Automata An NFA accepts a string x iff  a path though the transition graph from s 0 to a final state such that the edge labels spell x, ignoring  ’s Transitions on  consume no input To “run” the NFA, start in s 0 and guess the right transition at each step —Always guess correctly —If some sequence of correct guesses accepts x then accept Why study NFA s? They are the key to automating the RE  DFA construction We can paste together NFA s with  -transitions NFA becomes an NFA 

16 Comp 412, Fall 201015 Relationship between NFA s and DFA s DFA is a special case of an NFA DFA has no  transitions DFA ’s transition function is single-valued Same rules will work DFA can be simulated with an NFA —Obviously NFA can be simulated with a DFA ( less obvious ) Simulate sets of possible states Possible exponential blowup in the state space Still, one state per character in the input stream Rabin & Scott, 1959

17 Comp 412, Fall 201016 Automating Scanner Construction To convert a specification into code: 1Write down the RE for the input language 2Build a big NFA 3Build the DFA that simulates the NFA 4Systematically shrink the DFA 5Turn it into code Scanner generators Lex and Flex work along these lines Algorithms are well-known and well-understood Key issue is interface to parser ( define all parts of speech ) You could build one in a weekend!

18 Comp 412, Fall 201017 Where are we? Why are we doing this? RE  NFA ( Thompson’s construction ) Build an NFA for each term Combine them with  -moves NFA  DFA ( Subset construction ) Build the simulation DFA  Minimal DFA  Hopcroft’s algorithm DFA  RE All pairs, all paths problem Union together paths from s 0 to a final state minimal DFA RENFADFA The Cycle of Constructions

19 Comp 412, Fall 201018 RE  NFA using Thompson’s Construction Key idea NFA pattern for each symbol & each operator Join them with  moves in precedence order S0S0 S1S1 a NFA for a S0S0 S1S1 a S3S3 S4S4 b NFA for ab  NFA for a | b S0S0 S1S1 S2S2 a S3S3 S4S4 b S5S5    S0S0 S1S1  S3S3 S4S4  NFA for a * a   Ken Thompson, CACM, 1968

20 Comp 412, Fall 201019 Example of Thompson’s Construction Let’s try a ( b | c ) * 1. a, b, & c 2. b | c 3. ( b | c ) * S0S0 S1S1 a S0S0 S1S1 b S0S0 S1S1 c S2S2 S3S3 b S4S4 S5S5 c S1S1 S6S6 S0S0 S7S7      S1S1 S2S2 b S3S3 S4S4 c S0S0 S5S5    

21 Comp 412, Fall 201020 Example of Thompson’s Construction (con’t) 4. a ( b | c ) * Of course, a human would design something simpler... S0S0 S1S1 a b | c But, we can automate production of the more complex NFA version... S0S0 S1S1 a  S4S4 S5S5 b S6S6 S7S7 c S3S3 S8S8 S2S2 S9S9     

22 Comp 412, Fall 201021 Where are we? Why are we doing this? RE  NFA ( Thompson’s construction ) Build an NFA for each term Combine them with  -moves NFA  DFA ( subset construction )  Build the simulation DFA  Minimal DFA  Hopcroft’s algorithm DFA  RE All pairs, all paths problem Union together paths from s 0 to a final state minimal DFA RENFADFA The Cycle of Constructions

23 Comp 412, Fall 201022 NFA  DFA with Subset Construction Need to build a simulation of the NFA Two key functions Move(s i, a) is the set of states reachable from s i by a  -closure(s i ) is the set of states reachable from s i by  The algorithm: Start state derived from s 0 of the NFA Take its  -closure S 0 =  -closure({s 0 }) Take the image of S 0, Move(S 0,  ) for each   , and take its  -closure Iterate until no more states are added Sounds more complex than it is… Rabin & Scott, 1959

24 Comp 412, Fall 201023 NFA  DFA with Subset Construction The algorithm: s 0  -closure ( {n 0 } ) S  { s 0 } W  { s 0 } while ( W ≠ Ø ) select and remove s from W for each    t   - closure ( Move ( s,  )) T[s,  ]  t if ( t  S ) then add t to S add t to W Let’s think about why this works The algorithm halts: 1. S contains no duplicates (test before adding) 2. 2 {NFA states} is finite 3. while loop adds to S, but does not remove from S (monotone)  the loop halts S contains all the reachable NFA states It tries each character in each s i. It builds every possible NFA configuration.  S and T form the DFA This test is a little tricky s 0 is a set of states S & W are sets of sets of states

25 Comp 412, Fall 201024 NFA  DFA with Subset Construction The algorithm: s 0  -closure ( {n 0 } ) S  { s 0 } W  { s 0 } while ( W ≠ Ø ) select and remove s from W for each    t   - closure ( Move ( s,  )) T[s,  ]  t if ( t  S ) then add t to S add t to W Let’s think about why this works The algorithm halts: 1. S contains no duplicates (test before adding) 2. 2 {NFA states} is finite 3. while loop adds to S, but does not remove from S (monotone)  the loop halts S contains all the reachable NFA states It tries each character in each s i. It builds every possible NFA configuration.  S and T form the DFA Any DFA state containing a final state of the NFA final state becomes a final state of the DFA.

26 Comp 412, Fall 201025 NFA  DFA with Subset Construction Example of a fixed-point computation Monotone construction of some finite set Halts when it stops adding to the set Proofs of halting & correctness are similar These computations arise in many contexts Other fixed-point computations Canonical construction of sets of LR(1) items —Quite similar to the subset construction Classic data-flow analysis (& Gaussian Elimination) —Solving sets of simultaneous set equations We will see many more fixed-point computations

27 Comp 412, Fall 201026 NFA  DFA with Subset Construction a ( b | c )* : q0q0 q1q1 a  q4q4 q5q5 b q6q6 q7q7 c q3q3 q8q8 q2q2 q9q9      States  -closure(Move(s,*)) DFANFA a bc s0s0 q0q0 q 1, q 2, q 3, q 4, q 6, q 9 none s1s1 q 1, q 2, q 3, q 4, q 6, q 9 none q 5, q 8, q 9, q 3, q 4, q 6 q 7, q 8, q 9, q 3, q 4, q 6 s2s2 q 5, q 8, q 9, q 3, q 4, q 6 nones2s2 s3s3 s3s3 q 7, q 8, q 9, q 3, q 4, q 6 none s2s2 s3s3

28 Comp 412, Fall 201027 NFA  DFA with Subset Construction a ( b | c )* : q0q0 q1q1 a  q4q4 q5q5 b q6q6 q7q7 c q3q3 q8q8 q2q2 q9q9      States  -closure(Move(s,*)) DFANFA a bc s0s0 q0q0 q 1, q 2, q 3, q 4, q 6, q 9 s1s1 q 1, q 2, q 3, q 4, q 6, q 9 none q 5, q 8, q 9, q 3, q 4, q 6 q 7, q 8, q 9, q 3, q 4, q 6 s2s2 q 5, q 8, q 9, q 3, q 4, q 6 nones2s2 s3s3 s3s3 q 7, q 8, q 9, q 3, q 4, q 6 none s2s2 s3s3

29 Comp 412, Fall 201028 NFA  DFA with Subset Construction a ( b | c )* : q0q0 q1q1 a  q4q4 q5q5 b q6q6 q7q7 c q3q3 q8q8 q2q2 q9q9      States  -closure(Move(s,*)) DFANFA a bc s0s0 q0q0 q 1, q 2, q 3, q 4, q 6, q 9 none s1s1 q 1, q 2, q 3, q 4, q 6, q 9 none q 5, q 8, q 9, q 3, q 4, q 6 q 7, q 8, q 9, q 3, q 4, q 6 s2s2 q 5, q 8, q 9, q 3, q 4, q 6 nones2s2 s3s3 s3s3 q 7, q 8, q 9, q 3, q 4, q 6 none s2s2 s3s3

30 Comp 412, Fall 201029 NFA  DFA with Subset Construction a ( b | c )* : q0q0 q1q1 a  q4q4 q5q5 b q6q6 q7q7 c q3q3 q8q8 q2q2 q9q9      States  -closure(Move(s,*)) DFANFA a bc s0s0 q0q0 q 1, q 2, q 3, q 4, q 6, q 9 none s1s1 q 1, q 2, q 3, q 4, q 6, q 9 none q 5, q 8, q 9, q 3, q 4, q 6 q 7, q 8, q 9, q 3, q 4, q 6 s2s2 q 5, q 8, q 9, q 3, q 4, q 6 nones2s2 s3s3 s3s3 q 7, q 8, q 9, q 3, q 4, q 6 none s2s2 s3s3

31 Comp 412, Fall 201030 NFA  DFA with Subset Construction a ( b | c )* : q0q0 q1q1 a  q4q4 q5q5 b q6q6 q7q7 c q3q3 q8q8 q2q2 q9q9      States  -closure(Move(s,*)) DFANFA a bc s0s0 q0q0 q 1, q 2, q 3, q 4, q 6, q 9 none s1s1 q 1, q 2, q 3, q 4, q 6, q 9 none q 5, q 8, q 9, q 3, q 4, q 6 q 7, q 8, q 9, q 3, q 4, q 6 s2s2 q 5, q 8, q 9, q 3, q 4, q 6 nones2s2 s3s3 s3s3 q 7, q 8, q 9, q 3, q 4, q 6 none s2s2 s3s3

32 Comp 412, Fall 201031 NFA  DFA with Subset Construction a ( b | c )* : q0q0 q1q1 a  q4q4 q5q5 b q6q6 q7q7 c q3q3 q8q8 q2q2 q9q9      States  -closure(Move(s,*)) DFANFA a bc s0s0 q0q0 q 1, q 2, q 3, q 4, q 6, q 9 none s1s1 q 1, q 2, q 3, q 4, q 6, q 9 none q 5, q 8, q 9, q 3, q 4, q 6 s2s2 q 5, q 8, q 9, q 3, q 4, q 6 nones2s2 s3s3 s3s3 q 7, q 8, q 9, q 3, q 4, q 6 none s2s2 s3s3

33 Comp 412, Fall 201032 NFA  DFA with Subset Construction a ( b | c )* : q0q0 q1q1 a  q4q4 q5q5 b q6q6 q7q7 c q3q3 q8q8 q2q2 q9q9      States  -closure(Move(s,*)) DFANFA a bc s0s0 q0q0 q 1, q 2, q 3, q 4, q 6, q 9 none s1s1 q 1, q 2, q 3, q 4, q 6, q 9 none q 5, q 8, q 9, q 3, q 4, q 6 q 7, q 8, q 9, q 3, q 4, q 6 s2s2 q 5, q 8, q 9, q 3, q 4, q 6 nones2s2 s3s3 s3s3 q 7, q 8, q 9, q 3, q 4, q 6 none s2s2 s3s3

34 Comp 412, Fall 201033 NFA  DFA with Subset Construction a ( b | c )* : q0q0 q1q1 a  q4q4 q5q5 b q6q6 q7q7 c q3q3 q8q8 q2q2 q9q9      States  -closure(Move(s,*)) DFANFA a bc s0s0 q0q0 q 1, q 2, q 3, q 4, q 6, q 9 none s1s1 q 1, q 2, q 3, q 4, q 6, q 9 none q 5, q 8, q 9, q 3, q 4, q 6 q 7, q 8, q 9, q 3, q 4, q 6 s2s2 q 5, q 8, q 9, q 3, q 4, q 6 nones2s2 s3s3 s3s3 q 7, q 8, q 9, q 3, q 4, q 6 none s2s2 s3s3

35 Comp 412, Fall 201034 NFA  DFA with Subset Construction a ( b | c )* : q0q0 q1q1 a  q4q4 q5q5 b q6q6 q7q7 c q3q3 q8q8 q2q2 q9q9      States  -closure(Move(s,*)) DFANFA a bc s0s0 q0q0 q 1, q 2, q 3, q 4, q 6, q 9 none s1s1 q 1, q 2, q 3, q 4, q 6, q 9 none q 5, q 8, q 9, q 3, q 4, q 6 q 7, q 8, q 9, q 3, q 4, q 6 s2s2 q 5, q 8, q 9, q 3, q 4, q 6 nones2s2 s3s3 s3s3 q 7, q 8, q 9, q 3, q 4, q 6 none s2s2 s3s3

36 Comp 412, Fall 201035 NFA  DFA with Subset Construction a ( b | c )* : q0q0 q1q1 a  q4q4 q5q5 b q6q6 q7q7 c q3q3 q8q8 q2q2 q9q9      States  -closure(Move(s,*)) DFANFA a bc s0s0 q0q0 q 1, q 2, q 3, q 4, q 6, q 9 none s1s1 q 1, q 2, q 3, q 4, q 6, q 9 none q 5, q 8, q 9, q 3, q 4, q 6 q 7, q 8, q 9, q 3, q 4, q 6 s2s2 q 5, q 8, q 9, q 3, q 4, q 6 none s3s3 q 7, q 8, q 9, q 3, q 4, q 6 none

37 Comp 412, Fall 201036 NFA  DFA with Subset Construction a ( b | c )* : q0q0 q1q1 a  q4q4 q5q5 b q6q6 q7q7 c q3q3 q8q8 q2q2 q9q9      States  -closure(Move(s,*)) DFANFA a bc s0s0 q0q0 q 1, q 2, q 3, q 4, q 6, q 9 none s1s1 q 1, q 2, q 3, q 4, q 6, q 9 none q 5, q 8, q 9, q 3, q 4, q 6 q 7, q 8, q 9, q 3, q 4, q 6 s2s2 q 5, q 8, q 9, q 3, q 4, q 6 nones2s2 s3s3 s3s3 q 7, q 8, q 9, q 3, q 4, q 6 none s2s2 s3s3 q 7 is the core state of s 3

38 Comp 412, Fall 201037 NFA  DFA with Subset Construction a ( b | c )* : q0q0 q1q1 a  q4q4 q5q5 b q6q6 q7q7 c q3q3 q8q8 q2q2 q9q9      States  -closure(Move(s,*)) DFANFA a bc s0s0 q0q0 q 1, q 2, q 3, q 4, q 6, q 9 none s1s1 q 1, q 2, q 3, q 4, q 6, q 9 none q 5, q 8, q 9, q 3, q 4, q 6 q 7, q 8, q 9, q 3, q 4, q 6 s2s2 q 5, q 8, q 9, q 3, q 4, q 6 nones2s2 s3s3 s3s3 q 7, q 8, q 9, q 3, q 4, q 6 none s2s2 s3s3 q 5 is the core state of s 2

39 Comp 412, Fall 201038 NFA  DFA with Subset Construction a ( b | c )* : q0q0 q1q1 a  q4q4 q5q5 b q6q6 q7q7 c q3q3 q8q8 q2q2 q9q9      States  -closure(Move(s,*)) DFANFA a bc s0s0 q0q0 q 1, q 2, q 3, q 4, q 6, q 9 none s1s1 q 1, q 2, q 3, q 4, q 6, q 9 none q 5, q 8, q 9, q 3, q 4, q 6 q 7, q 8, q 9, q 3, q 4, q 6 s2s2 q 5, q 8, q 9, q 3, q 4, q 6 nones2s2 s3s3 s3s3 q 7, q 8, q 9, q 3, q 4, q 6 none s2s2 s3s3 Final states because of q 9

40 Comp 412, Fall 201039 NFA  DFA with Subset Construction a ( b | c )* : q0q0 q1q1 a  q4q4 q5q5 b q6q6 q7q7 c q3q3 q8q8 q2q2 q9q9      States  -closure(Move(s,*)) DFANFA a bc s0s0 q0q0 s1s1 none s1s1 q 1, q 2, q 3, q 4, q 6, q 9 none s2s2 s3s3 s2s2 q 5, q 8, q 9, q 3, q 4, q 6 nones2s2 s3s3 s3s3 q 7, q 8, q 9, q 3, q 4, q 6 none s2s2 s3s3 Transition table for the DFA

41 Comp 412, Fall 201040 NFA  DFA with Subset Construction The DFA for a ( b | c ) * Much smaller than the NFA ( no  -transitions ) All transitions are deterministic Use same code skeleton as before s3s3 s2s2 s0s0 s1s1 c b a b c c b S0S0 S1S1 a b | c But, remember our goal: a bc s0s0 s1s1 none s1s1 s2s2 s3s3 s2s2 s2s2 s3s3 s3s3 s2s2 s3s3

42 Comp 412, Fall 201041 Where are we? Why are we doing this? RE  NFA ( Thompson’s construction ) Build an NFA for each term Combine them with  -moves NFA  DFA ( subset construction ) Build the simulation DFA  Minimal DFA  Hopcroft’s algorithm DFA  RE All pairs, all paths problem Union together paths from s 0 to a final state Not enough time to teach Hopcroft’s algorithm today minimal DFA RENFADFA The Cycle of Constructions

43 Comp 412, Fall 201042 Alternative Approach to DFA Minimization The Intuition The subset construction merges prefixes in the NFA s0s0 s 10 s9s9 s8s8 s5s5 s7s7 s6s6 s3s3 s2s2 s1s1 s4s4    a b a b c d c abc | bc | ad Thompson’s construction would leave  -transitions between each single- character automaton s0s0 s6s6 s4s4 s5s5 s2s2 s1s1 s3s3 a b b c d c Subset construction eliminates  - transitions and merges the paths for a. It leaves duplicate tails, such as bc.

44 Comp 412, Fall 201043 Alternative Approach to DFA Minimization Idea: use the subset construction twice For an NFA N —Let reverse(N) be the NFA constructed by making initial states final (& vice-versa) and reversing the edges —Let subset(N) be the DFA that results from applying the subset construction to N —Let reachable(N) be N after removing all states that are not reachable from the initial state Then, reachable(subset(reverse[reachable(subset(reverse(N))])) is the minimal DFA that implements N [ Brzozowski, 1962 ] This result is not intuitive, but it is true. Neither algorithm dominates the other.

45 Comp 412, Fall 201044 Alternative Approach to DFA Minimization Step 1 The subset construction on reverse(NFA) merges suffixes in original NFA s 11    Reversed NFA s0s0 s 10 s9s9 s8s8 s5s5 s7s7 s6s6 s3s3 s2s2 s1s1 s4s4    a b a b c d c s 11 s9s9 s8s8 s3s3 s2s2 s1s1 a a b d c subset(reverse(NFA))

46 Comp 412, Fall 201045 Alternative Approach to DFA Minimization Step 2 Reverse it again & use subset to merge prefixes … Reverse it, again s 11 s9s9 s8s8 s3s3 s2s2 s1s1 a a b d c s0s0    And subset it, again s 11 s3s3 s2s2 a b d c s0s0 b Minimal DFA minimal DFA RENFADFA The Cycle of Constructions Brzozowski


Download ppt "Lexical Analysis — Part II From Regular Expression to Scanner Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled."

Similar presentations


Ads by Google