1 Module 11 Proving more specific problems are not solvable Input transformation technique –Use subroutine theme to show that if one problem is unsolvable, so is a second problem –Need to clearly differentiate between use of program as a subroutine and a program being an input to another program

2 Basic Idea/Technique

3 Primality Testing Problem Consider the following two problems –Halting Problem Input: Program P, unsigned x Yes/No Question: Does P halt on x? –Primality Testing Problem (PTP) Input: Program P, unsigned x Yes/No Question: Does P output correctly whether or not x is a prime number? Which problem seems harder and why?

4 Question Suppose we construct a program P H which solves the Halting problem H under the following conditions –All of P H is known to be correct with the exception of one procedure P L. –This procedure P L is being used to solve the Primality Testing Problem. What can we conclude in this scenario?

5 Formalizing Technique Assume P L is a procedure that solves problem L –We have no idea how P L solves L Construct a program P H that solves H using P L as a subroutine –We use P L as a black box –(We could use any unsolvable problem in place of H) Argue P H solves H Conclude that L is unsolvable –Otherwise P L would exist and then H would be solvable –L will be a problem about program behavior

6 Focusing on H In this module, we will typically use H, the Halting Problem, as our known unsolvable problem The technique generalizes to using any unsolvable problem L’ in place of H. –You would need to change the proofs to work with L’ instead of H, but in general it can be done The technique also can be applied to solvable problems to derive alternative consequences We focus on H to simplify the explanation

7 Constructing P H using P L Answer-preserving input transformations and Program P T

8 P H has two subroutines There are many ways to construct P H using program P L that solves L We focus on one method in which P H consists of two subroutines –Procedure P L that solves L –Procedure P T which computes a function f that I call an answer-preserving (or answer-reversing) input transformation More about this in a moment

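9 Pictorial Representation of P_H [Figure: P_H receives input x and passes it to P_T, which outputs P_T(x); P_L takes P_T(x) and answers Y/N, and P_H returns that answer as Yes/No.]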

10 Answer-preserving input transformation P T Input –An input to H Output –An input to L such that yes inputs of H map to yes inputs of L no inputs of H map to no inputs of L Note, P T must not loop when given any legal input to H

11 Why this works [Figure: inside P_H, P_T maps a yes input to H to a yes input to L, so P_L answers yes; P_T maps a no input to H to a no input to L, so P_L answers no.] We have assumed that P_L solves L

12 Answer-reversing input transformation P T Input –An input to H Output –An input to L such that yes inputs of H map to no inputs of L no inputs of H map to yes inputs of L Note, P T must not loop when given any legal input to H

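13 Why this works [Figure: inside P_H, P_T maps a yes input to H to a no input to L, so P_L answers no and P_H reverses this to yes; P_T maps a no input to H to a yes input to L, so P_L answers yes and P_H reverses this to no.] We have assumed that P_L solves L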
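The answer-reversing composition can be sketched on a pair of solvable toy problems, in the same spirit as the examples that follow. Everything here is an illustrative assumption rather than part of the slides: problem A asks "is n even?", problem B asks "is n odd?", `solves_B` stands in for P_L, `transform` for P_T, and `solves_A` for P_H.

```cpp
#include <cassert>

// Hypothetical stand-in for P_L: decides the toy problem B, "is n odd?".
bool solves_B(unsigned n) { return n % 2 == 1; }

// Stand-in for P_T: an answer-reversing transformation from A ("is n even?") to B.
// The identity works here: every yes input of A is a no input of B and vice versa.
unsigned transform(unsigned n) { return n; }

// Stand-in for P_H: solves A by running P_T, then P_L, then reversing the answer.
bool solves_A(unsigned n) { return !solves_B(transform(n)); }
```

Calling `solves_A(4)` runs the transformation, asks the B solver, and flips its answer. In the unsolvability proofs the only difference is that the B solver is assumed rather than written.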

14 Yes->Yes and No->No [Figure: P_T maps the yes inputs in the domain of H to yes inputs in the domain of L, and the no inputs of H to no inputs of L; P_H runs P_T on x, then P_L on P_T(x), and returns Yes/No.]

15 Notation and Terminology If there is such an answer-preserving (or answer-reversing) input transformation f (and the corresponding program P_T), we say that H transforms to (many-one reduces to) L Notation H <= L [Figure: the domain of H and the domain of L, each split into yes inputs and no inputs.]

16 Examples not involving the Halting Problem

17 Generalization As noted earlier, while we focus on transforming H to other problems, the concept of transformation generalizes beyond H and beyond unsolvable program behavior problems We work with some solvable, language recognition problems to illustrate some aspects of the transformation process in the next few slides

18 Example 1 L_1 is the set of even length strings over {0,1} –What are the set of legal input instances and the set of no inputs for the L_1 LRP? L_2 is the set of odd length strings over {0,1} –Same question as above Tasks –Give an answer-preserving input transformation f that shows that L_1 LRP <= L_2 LRP –Give a corresponding program P_T that computes f [Figure: the domain of L_1 and the domain of L_2, each split into yes inputs and no inputs.]

19 Program P_T string main(string x) { return(x + "0"); }
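To see the whole pipeline run, here is a minimal sketch that wires the slide's transformation to a solver for L_2 and obtains a solver for L_1. The function names are assumptions made for illustration, and `solves_L2` is a hand-written stand-in for the assumed program P_L.

```cpp
#include <cassert>
#include <string>

// P_T from the slide: append one character, turning even length into odd length
// (yes -> yes) and odd length into even length (no -> no).
std::string transform(const std::string& x) { return x + "0"; }

// Hypothetical stand-in for P_L: decides L_2, the odd length strings.
bool solves_L2(const std::string& x) { return x.size() % 2 == 1; }

// Solver for L_1 built only from P_T and P_L; no answer reversal is needed
// because the transformation is answer-preserving.
bool solves_L1(const std::string& x) { return solves_L2(transform(x)); }
```

Note that `solves_L1` never inspects the length of x itself; all of its knowledge about L_1 comes from the transformation.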

20 Example 2 L_1 is the set of all strings over {0,1} –What is the set of all inputs, yes inputs, no inputs for the L_1 LRP? L_2 is {0} –Same question as above Tasks –Give an answer-preserving input transformation f which shows that the L_1 LRP <= L_2 LRP –Give a corresponding program P_T which computes f [Figure: the domain of L_1 and the domain of L_2, each split into yes inputs and no inputs.]

21 Program P_T string main(string x) { return("0"); }

22 Example 3 L_1 –Input: Java program P that takes as input an unsigned int –Yes/No Question: Does P halt on all legal inputs? L_2 –Input: C++ program P that takes as input an unsigned int –Yes/No Question: Does P halt on all legal inputs? Tasks –Describe what an answer-preserving input transformation f that shows L_1 <= L_2 would do [Figure: the domain of L_1 and the domain of L_2, each split into yes inputs and no inputs.]

23 Proving a program behavior problem L is unsolvable

24 Problem Definitions * Halting Problem H –Input Program Q H that has one input of type unsigned int non-negative integer y that is input to program Q H –Yes/No Question Does Q H halt on y? Target Problem L –Input Program Q L that has one input of type string –Yes/No question Does Y(Q L ) = the set of even length strings? Assume program P L solves L

25 Construction review We are building a program P_H to solve the halting problem H. P_H will use P_T as a subroutine, and we must explicitly construct P_T using specific properties of H and L. P_H will use P_L as a subroutine, and we have no idea how P_L accomplishes its task. [Figure: inside P_H, input x goes to P_T, P_T(x) goes to P_L, and the Y/N answer of P_L becomes the Yes/No output of P_H.]

26 P’s and Q’s Programs which are PART of program P H and thus “executed” when P H executes –Program P T, an actual program we construct –Program P L, an assumed program which solves problem L Programs which are INPUTS/OUTPUTS of programs P H, P L, and P T and which are not “executed” when P H executes –Programs Q H, Q L, and Q YL code for Q YL is available to P T

27 Two inputs for L Target Problem L –Input Program Q that has one input of type string –Yes/No question Does Y(Q) = the set of even length strings? Program P_L –Solves L –We don’t know how Consider the following program Q_1: bool main(string z) { while (1 > 0) ; } –What does P_L output when given Q_1 as input? Consider the following program Q_2: bool main(string z) { if ((z.length() % 2) == 0) return (yes); else return (no); } –What does P_L output when given Q_2 as input?

28 Another input for L Target Problem L –Input Program Q that has one input of type string –Yes/No question Does Y(Q) = the set of even length strings? Program P_L –Solves L –We don’t know how Consider the following program Q_L with 2 procedures Q_1 and Q_YL: bool main(string z) { Q_1(5); /* ignore return value */ return(Q_YL(z)); } bool Q_1(unsigned x) { if (x > 3) return (no); else while (1 > 0) ; } bool Q_YL(string y) { if ((y.length() % 2) == 0) return (yes); else return (no); } What does P_L output when given Q_L as input?

29 Input and Output of P_T Input of P_T (also input of H) –Program Q_H with one input of type unsigned int –Non-negative integer y Output of P_T (also input of L): program Q_L bool main(string z) { Q_H(y); /* Q_H and y come from the left-hand side; ignore return value */ return(Q_YL(z)); } bool Q_H(unsigned x) { /* comes from the left-hand side */ } bool Q_YL(string y) { if ((y.length() % 2) == 0) return (yes); else return (no); } [Figure: Q_H, y go into P_T, which outputs Q_L.]

30 Example 1 Input to P_T: Program Q_H bool main(unsigned y) { if (y == 5) return yes; else if (y == 4) return no; else while (1 > 0) {}; } Input y: 5 Output of P_T: Program Q_L bool Q_H(unsigned y) { if (y == 5) return yes; else if (y == 4) return no; else while (1 > 0) {}; } bool Q_YL(string z) { if ((z.length() % 2) == 0) return (yes); else return (no); } bool main(string z) { unsigned y = 5; Q_H(y); return (Q_YL(z)); } [Figure: Q_H, y go into P_T, which outputs Q_L.]

31 Example 2 Input to P_T: Program Q_H bool main(unsigned y) { if (y == 5) return yes; else if (y == 4) return no; else while (1 > 0) {}; } Input y: 3 Output of P_T: Program Q_L bool Q_H(unsigned y) { if (y == 5) return yes; else if (y == 4) return no; else while (1 > 0) {}; } bool Q_YL(string z) { if ((z.length() % 2) == 0) return (yes); else return (no); } bool main(string z) { unsigned y = 3; Q_H(y); return (Q_YL(z)); } [Figure: Q_H, y go into P_T, which outputs Q_L.]

32 P T in more detail

33 Declaration of P_T What is the return type of P_T? –Type program1 (with one input of type string) What are the input parameters of P_T? –The same as the input parameters to H; in this case, type program2 (with one input of type unsigned int) and unsigned int (the input type of program2) program1 main(program2 Q_H, unsigned y) [Figure: inside P_H, Q_H and y go to P_T, Q_L goes to P_L, and P_L answers Yes/No.]

34 Code for P_T program1 main(program2 P, unsigned y) { /* Will be viewing types program1 and program2 as STRINGS over the program alphabet Σ_P */ program1 Q_L = replace-main-with-Q_H(P); /* Insert line break */ Q_L += "\n"; /* Insert Q_YL */ Q_L += "bool Q_YL(string z) {\n\t if ((z.length() % 2) == 0) return (yes); else return (no);\n}\n"; /* Add main routine of Q_L */ Q_L += "bool main(string z) {\n\t"; /* determined by L */ Q_L += "unsigned y = "; Q_L += convert-to-string(y); Q_L += ";\n\t Q_H(y);\n\t return(Q_YL(z));\n}"; return(Q_L); } program1 replace-main-with-Q_H(program2 P) /* Details hidden */ string convert-to-string(unsigned y) /* Details hidden */ [Figure: inside P_H, Q_H and y go to P_T, Q_L goes to P_L, and P_L answers Yes/No.]
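The point that P_T only manipulates program text, and never runs Q_H, can be made concrete with a compilable sketch. It treats programs as `std::string` values; the names `build_QL` and `replace_main_with_QH` are hypothetical, and the renaming step assumes (as a simplification the slides leave hidden) that the text "main(" occurs exactly once in the input program.

```cpp
#include <cassert>
#include <string>

// Assumption: the entry point of the input program appears as "main(" exactly once.
// A real version would need to parse the program; the slides hide these details.
std::string replace_main_with_QH(std::string p) {
    const std::string from = "main(";
    std::string::size_type pos = p.find(from);
    if (pos != std::string::npos) p.replace(pos, from.size(), "Q_H(");
    return p;
}

// Sketch of P_T: builds the text of Q_L from the text of Q_H and the number y.
// Q_H is never executed here, so this function halts on every input.
std::string build_QL(const std::string& qh_source, unsigned y) {
    std::string ql = replace_main_with_QH(qh_source);
    ql += "\n";
    ql += "bool Q_YL(string z) {\n\tif ((z.length() % 2) == 0) return (yes); else return (no);\n}\n";
    ql += "bool main(string z) {\n\tunsigned y = ";
    ql += std::to_string(y);
    ql += ";\n\tQ_H(y);\n\treturn (Q_YL(z));\n}\n";
    return ql;
}
```

Because `build_QL` is plain string concatenation, it satisfies the requirement that P_T must not loop, even when the program it is given loops on y.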

35 P_T in action [Figure: P_T holds the code for Q_YL and receives Q_H and unsigned y. Its output Q_L, on input z, starts by running Q_H on y; if Q_H halts, Q_L runs Q_YL on z to produce Y/N. Inside P_H, Q_L is then passed to P_L, which answers Yes/No.] Program Q_H bool main(unsigned y) { if (y == 5) return yes; else if (y == 4) return no; else while (1 > 0) {}; } Input y: 5 Program Q_L bool Q_H(unsigned y) { if (y == 5) return yes; else if (y == 4) return no; else while (1 > 0) {}; } bool Q_YL(string z) { if ((z.length() % 2) == 0) return (yes); else return (no); } bool main(string z) { unsigned y = 5; Q_H(y); return (Q_YL(z)); }

36 Constructing Q L (and thus P T )

37 Start with no input for H If Q_H, y is a no input to the Halting problem –Q_H loops on y –Thus Y(Q_L) = {} –Determine if this makes Q_L a no or yes input instance to L Program Q_L bool main(string z) { Q_H(y); /* ignore return value */ return(Q_?L(z)); /* yes or no? */ } bool Q_H(unsigned x) { /* comes from the left-hand side */ } bool Q_?L(string y) { }

38 Answer-preserving input transformation If Q_H, y is a no input to the Halting problem –Q_H loops on y –Thus Y(Q_L) = {} –Determine if this makes Q_L a no or yes input instance to L –Now choose a Q_YL (or Q_NL) that is a yes (or no) input instance to L Program Q_L bool main(string z) { Q_H(y); /* ignore return value */ return(Q_YL(z)); /* yes */ } bool Q_H(unsigned x) { /* comes from the left-hand side */ } bool Q_YL(string y) { }

39 Make yes for H map to yes for L If Q_H, y is a no input to the Halting problem –Q_H loops on y –Thus Y(Q_L) = {} –Determine if this makes Q_L a no or yes input instance to L –Now choose a Q_YL (or Q_NL) that is a yes (or no) input instance to L Program Q_L bool main(string z) { Q_H(y); /* ignore return value */ return(Q_YL(z)); /* yes */ } bool Q_H(unsigned x) { /* comes from the left-hand side */ } bool Q_YL(string y) { if ((y.length() % 2) == 0) return (yes); else return (no); }

40 Possible shortcut Program Q_L (Q_YL written directly in main) bool main(string z) { Q_H(y); /* ignore return value */ if ((z.length() % 2) == 0) return (yes); else return (no); } bool Q_H(unsigned x) { /* comes from the left-hand side */ } Program Q_L (as before) bool main(string z) { Q_H(y); /* ignore return value */ return(Q_YL(z)); /* yes */ } bool Q_H(unsigned x) { /* comes from the left-hand side */ } bool Q_YL(string y) { if ((y.length() % 2) == 0) return (yes); else return (no); }

41 Another Example

42 Problem Definitions Halting Problem H –Input Program Q H that has one input of type unsigned int non-negative integer y that is input to program Q H –Yes/No Question Does Q H halt on y? Target Problem L –Input Program Q L that has one input of type string –Yes/No question Is Y(Q L ) finite? Assume program P L solves L

43 Start with no input for H If Q_H, y is a no input to the Halting problem –Q_H loops on y –Thus Y(Q_L) = {} –Determine if this makes Q_L a no or yes input instance to L Program Q_L bool main(string z) { Q_H(y); /* ignore return value */ return(Q_?L(z)); /* yes or no? */ } bool Q_H(unsigned x) { /* comes from the left-hand side */ } bool Q_?L(string y) { }

44 Answer-reversing input transformation If Q_H, y is a no input to the Halting problem –Q_H loops on y –Thus Y(Q_L) = {} –Determine if this makes Q_L a no or yes input instance to L –Now choose a Q_YL (or Q_NL) that is a yes (or no) input instance to L Program Q_L bool main(string z) { Q_H(y); /* ignore return value */ return(Q_NL(z)); /* no */ } bool Q_H(unsigned x) { /* comes from the left-hand side */ } bool Q_NL(string y) { }

45 Make yes for H map to no for L If Q_H, y is a no input to the Halting problem –Q_H loops on y –Thus Y(Q_L) = {} –Determine if this makes Q_L a no or yes input instance to L –Now choose a Q_YL (or Q_NL) that is a yes (or no) input instance to L Program Q_L bool main(string z) { Q_H(y); /* ignore return value */ return(Q_NL(z)); /* no */ } bool Q_H(unsigned x) { /* comes from the left-hand side */ } bool Q_NL(string y) { if ((y.length() % 2) == 0) return(yes); else return(no); }

46 Analyzing proposed transformations 4 possibilities

47 Problem Setup Input of Transformation: Program Q_H, unsigned x Output of Transformation: Program Q_L bool main(string z) { Q_H(x); /* ignore return value */ return(Q_YL(z)); /* yes or no */ } bool Q_H(unsigned x) {} bool Q_YL(string y) { if ((y.length() % 2) == 0) return (yes); else return (no); } Problem L Input: Program P Yes/No Question: Is Y(P) = {aa}? Question: Is the transformation above an answer-preserving or answer-reversing input transformation from H to problem L?

48 Key Step Transformation: as on slide 47 Problem L Input: Program P Yes/No Question: Is Y(P) = {aa}? The output of the transformation is the input to the problem. Plug Q_L in for program P above: Is Y(Q_L) = {aa}?

49 Is Y(Q_L) = {aa}? Transformation: as on slide 47 Problem L Input: Program P Yes/No Question: Is Y(P) = {aa}? Analysis –If Q_H loops on x, Y(Q_L) = {} –No input to H creates a Q_L that is a no input for L –If Q_H halts on x, Y(Q_L) = {even length strings} –Yes input to H creates a Q_L that is a no input for L –Transformation does not work: all inputs map to no inputs

50 Three other problems Transformation: as on slide 47 Problem L_1 Input: Program P Yes/No Question: Is Y(P) infinite? Problem L_2 Input: Program P Yes/No Question: Is Y(P) finite? Problem L_3 Input: Program P Yes/No Question: Is Y(P) = {} or is Y(P) infinite?

51 Is Y(P) infinite? Transformation: as on slide 47 Problem L_1 Input: Program P Yes/No Question: Is Y(P) infinite? Analysis –If Q_H loops on x, Y(Q_L) = {} –No input to H creates a Q_L that is a no input for L_1 –If Q_H halts on x, Y(Q_L) = {even length strings} –Yes input to H creates a Q_L that is a yes input for L_1 –Transformation works: answer-preserving

52 Is Y(P) finite? Transformation: as on slide 47 Problem L_2 Input: Program P Yes/No Question: Is Y(P) finite? Analysis –If Q_H loops on x, Y(Q_L) = {} –No input to H creates a Q_L that is a yes input for L_2 –If Q_H halts on x, Y(Q_L) = {even length strings} –Yes input to H creates a Q_L that is a no input for L_2 –Transformation works: answer-reversing

53 Is Y(P) = {} or is Y(P) infinite? Transformation: as on slide 47 Problem L_3 Input: Program P Yes/No Question: Is Y(P) = {} or is Y(P) infinite? Analysis –If Q_H loops on x, Y(Q_L) = {} –No input to H creates a Q_L that is a yes input for L_3 –If Q_H halts on x, Y(Q_L) = {even length strings} –Yes input to H creates a Q_L that is a yes input for L_3 –Transformation does not work: all inputs map to yes inputs

54 Module 12 Computation and Configurations –Formal Definition –Examples

55 Definitions Configuration –Functional Definition Given the original program and the current configuration of a computation, someone should be able to complete the computation –Contents of a configuration for a C++ program current instruction to be executed current value of all variables Computation –Complete sequence of configurations

56 Computation 1 1 int main(int x,y) { 2 int r = x % y; 3 if (r== 0) goto 8; 4 x = y; 5 y = r; 6 r = x % y; 7 goto 3; 8 return y; } Input: 10 3 Line 1, x=?,y=?,r=? Line 2, x=10, y=3,r=? Line 3, x=10, y=3, r=1 Line 4, x=10, y=3, r=1 Line 5, x= 3, y=3, r=1 Line 6, x=3, y=1, r=1 Line 7, x=3, y=1, r=0 Line 3, x=3, y=1, r=0 Line 8, x=3, y=1, r=0 Output is 1

57 Computation 2 int main(int x,y) { 2 int r = x % y; 3 if (r== 0) goto 8; 4 x = y; 5 y = r; 6 r = x % y; 7 goto 3; 8 return y; } Input: 53 10 Line 1, x=?,y=?,r=? Line 2, x=53, y=10, r=? Line 3, x= 53, y=10, r=3 Line 4, x=53, y=10, r=3 Line 5, x=10, y=10, r=3 Line 6, x=10, y=3, r=3 Line 7, x=10, y=3, r=1 Line 3, x=10, y=3, r=1...

58 Computations 1 and 2 Line 1, x=?,y=?,r=? Line 2, x=53, y=10, r=? Line 3, x= 53, y=10, r=3 Line 4, x=53, y=10, r=3 Line 5, x=10, y=10, r=3 Line 6, x=10, y=3, r=3 Line 7, x=10, y=3, r=1 Line 3, x=10, y=3, r=1... Line 1, x=?,y=?,r=? Line 2, x=10, y=3,r=? Line 3, x=10, y=3, r=1 Line 4, x=10, y=3, r=1 Line 5, x= 3, y=3, r=1 Line 6, x=3, y=1, r=1 Line 7, x=3, y=1, r=0 Line 3, x=3, y=1, r=0 Line 8, x=3, y=1, r=0 Output is 1

59 Observation int main(int x,y) { 2 int r = x % y; 3 if (r== 0) goto 8; 4 x = y; 5 y = r; 6 r = x % y; 7 goto 3; 8 return y; } Line 3, x= 10, y=3, r=1 Program and current configuration –Together, these two pieces of information are enough to complete the computation –Are they enough to determine what the original input was? No! Both previous inputs, 10 3 as well as 53 10 eventually reached the same configuration (Line 3, x=10, y=3, r=1)
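The observation can be checked mechanically. The sketch below is an illustration, not code from the slides: it steps through the numbered program one line at a time and records a configuration (line about to execute, x, y, r) before each step. The struct and function names are assumptions, the "Line 1" configuration with unknown values is skipped, the unknown r before line 2 is stored as 0, and y is assumed to be nonzero.

```cpp
#include <cassert>
#include <vector>

// A configuration: the line about to be executed plus the value of every variable.
struct Config { int line, x, y, r; };

bool operator==(const Config& a, const Config& b) {
    return a.line == b.line && a.x == b.x && a.y == b.y && a.r == b.r;
}

// Records the computation of the slides' program on input (x, y), starting at line 2.
// Assumptions: y != 0, and the not-yet-assigned r is represented as 0.
std::vector<Config> trace(int x, int y) {
    std::vector<Config> out;
    int r = 0;
    int line = 2;
    while (true) {
        out.push_back({line, x, y, r});
        switch (line) {
            case 2: r = x % y; line = 3; break;
            case 3: line = (r == 0) ? 8 : 4; break;
            case 4: x = y; line = 5; break;
            case 5: y = r; line = 6; break;
            case 6: r = x % y; line = 7; break;
            case 7: line = 3; break;
            default: return out;  // line 8: return y
        }
    }
}
```

Comparing `trace(10, 3)` with `trace(53, 10)` shows the two computations sharing the configuration (Line 3, x=10, y=3, r=1) and agreeing from that point on, which is why a configuration alone cannot identify the original input.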

60 Module 13 Studying the internal structure of REC, the set of solvable problems –Complexity theory overview –Automata theory preview Motivating Problem –string searching

61 Studying REC Complexity Theory Automata Theory

62 Current picture of all languages [Figure: All languages, divided into REC (solvable), RE-REC (half solvable), and All languages - RE (not even half solvable).] Which language class should be studied further?

63 Complexity Theory In complexity theory, we differentiate problems by how hard a problem is to solve –Remember, all problems in REC are solvable Which problem is harder and why? –Max: Input: list of n numbers Task: return largest of the n numbers –Element Input: list of n numbers Task: return any of the n numbers [Figure: All languages, divided into REC, RE - REC, and All languages - RE.]

64 Resource Usage * How do we formally measure the hardness of a problem? –We measure the resources required to solve input instances of the problem Typical resources are? Need a notion of size of an input instance –Obviously larger input instances require more resources to solve

65 Poly Language Class Informal Definition: A problem L_1 is easier than problem L_2 if problem L_1 can be solved in less time than problem L_2. Poly: the set of problems which can be solved in polynomial time (typically referred to as P, not Poly) Major goal: Identify whether or not a problem belongs to Poly [Figure: REC split into Poly and Rest of REC; outside REC are RE - REC and All languages - RE.]

66 Working with Poly How do you prove a problem L is in Poly? How do you prove a problem L is not in Poly? –We are not very good at this. –For a large class of interesting problems, we have techniques (polynomial-time answer-preserving input transformations) that show a problem L probably is not in Poly, but few which prove it. [Figure: REC split into Poly and Rest of REC.]

67 Examples Shortest Path Problem Input –Graph G –nodes s and t Task –Find length of shortest path from s to t in G Longest Path Problem Input –Graph G –nodes s and t Task –Find length of longest path from s to t in G Which problem is provably solvable in polynomial time? [Figure: REC split into Poly and Rest of REC.]

68 Automata Theory In automata theory, we will define new models of computation which we call automata –Finite State Automata (FSA) –Pushdown Automata (PDA) Key concept –FSA’s and PDA’s are restricted models of computation: FSA’s and PDA’s cannot solve all the problems that C++ programs can –We then identify which problems can be solved using FSA’s and PDA’s [Figure: All languages, divided into REC, RE - REC, and All languages - RE.]

69 New language classes REC is the set of solvable languages when we start with a general model of computation like C++ programs We want to identify which problems in REC can be solved when using these restricted automata [Figure: REC divided into three regions labeled Solvable by FSA’s and PDA’s, Solvable by PDA’s, and Rest of REC; outside REC are RE - REC and All languages - RE.]

70 Recap * Complexity Theory –Studies structure of the set of solvable problems –Method: analyze resources (processing time) used to solve a problem Automata Theory –Studies structure of the set of solvable problems –Method: define automata with restricted capabilities and resources and see what they can solve (and what they cannot solve) –This theory also has important implications in the development of programming languages and compilers

71 Motivating Problem String Searching

72 String Searching Input –String x –String y Tasks –Return location of y in string x –Does string y occur in string x? Can you identify applications of this type of problem in real life? Try and develop a solution to this problem.

73 String Searching II Input –String x –pattern y Tasks –Return location of y in string x –Does pattern y occur in string x? Pattern –[anything].html –$EN4$$ Try and develop a solution to this problem.

74 String Searching We will show an easy way to solve these string searching problems In particular, we will show that we can solve these problems in the following manner –write down the pattern –the computer automatically turns this into a program which performs the actual string search

75 Module 14 Regular languages –Inductive definitions –Regular expressions syntax semantics

76 Regular Languages (Regular Expressions)

77 Regular Languages New language class –Elements are languages We will show that this language class is identical to LFSA –Language class to be defined by Finite State Automata (FSA) –Once we have shown this, we will use the term “regular languages” to refer to this language class

78 Inductive Definition of Integers * Base case definition –0 is an integer Inductive case definition –If x is an integer, then x+1 is an integer x-1 is an integer Completeness –Only numbers generated using the above rules are integers

79 Inductive Definition of Regular Languages Base case definition –Let Σ denote the alphabet –{} is a regular language –{a} is a regular language for any character a in Σ Inductive case definition –If L_1 and L_2 are regular languages, then L_1 union L_2 is a regular language, L_1 concatenate L_2 is a regular language, and L_1* is a regular language Completeness –Only languages generated using the above rules are regular languages

80 Proving a language is regular * Prove that {aa, bb} is a regular language –{a} and {b} are regular languages base case of definition –{aa} = {a}{a} is a regular language concatenation rule –{bb} = {b}{b} is a regular language concatenation rule –{aa, bb} = {aa} union {bb} is a regular language union rule Typically, we will not go through this process to prove a language is regular

81 Regular Expressions How do we describe a regular language? –Use set notation: {aa, bb, ab, ba}* and {a}{a,b}*{b} –Use regular expressions R: (aa+bb+ab+ba)* and a(a+b)*b Inductive def of regular languages and regular expressions on page 72
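The two expressions on this slide can be tried out with the C++ standard library, as a rough illustration only. Note the notational difference: the slides write union as +, while `std::regex` (default ECMAScript grammar) writes it as | and reserves + for "one or more". The function names are assumptions made for this sketch.

```cpp
#include <cassert>
#include <regex>
#include <string>

// (aa+bb+ab+ba)* in the slides' notation: strings of even length over {a,b}.
bool in_first_language(const std::string& s) {
    static const std::regex r("(aa|bb|ab|ba)*");
    return std::regex_match(s, r);  // regex_match requires the whole string to match
}

// a(a+b)*b in the slides' notation: strings that start with a and end with b.
bool in_second_language(const std::string& s) {
    static const std::regex r("a(a|b)*b");
    return std::regex_match(s, r);
}
```

Using `std::regex_match` rather than `std::regex_search` matters here: membership in L(R) is about the entire string, not about a matching substring.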

82 R and L(R) * How we interpret a regular expression –What does a regular expression R mean to us? aaba represents the regular language {aaba}  represents the regular language {} aa+bb represents the regular language {aa, bb} –We use L(R) to denote the regular language represented by regular expression R.

83 Precedence rules What is L(ab+c*)? –Possible answers: {a}({b} union {c}*) ({a}{b,c})* ({ab} union {c})* {ab} union {c}* –Must know precedence rules: * first, then concatenation, then +

84 Precedence rules continued Precedence rules similar to those for arithmetic expressions –ab+c^2 is (a times b) + (c times c): exponentiation first, then multiplication, then addition Think of Kleene closure as exponentiation, concatenation as multiplication, and union as addition and the precedence rules are identical

85 Regular expressions are strings Let L be a regular language over the alphabet Σ –A regular expression R for L is just a string over the alphabet Σ union {(, ), +, *,  } which follows certain syntactic rules That is, the set of legal regular expressions is itself a language over the alphabet Σ union {(, ), +, *} – , a*aba are strings in the language of legal reg. exp. –)(, *a* are strings NOT in the language of legal reg. exp.

86 Semantics We give a regular expression R meaning when we interpret it to represent L(R). –aaba is just a string – we interpret it to represent the language {aaba}. We do similar things with arithmetic expressions –10+7^2 is just a string –We interpret this string to represent the number 59

87 Key fact * A language L is a regular language iff there exists a reg. exp. R such that L(R) = L –When I ask for a proof that a language L is regular, rather than going through the inductive proof we saw earlier, I expect you to give me a regular expression R s.t. L(R) = L

88 Summary * Regular expressions are strings –syntax for legal regular expressions –semantics for interpreting regular expressions Regular languages are a new language class –A language L is regular iff there exists a regular expression R s.t. L(R) = L We will show that the regular languages are identical to LFSA

89 Module 15 FSA’s –Defining FSA’s –Computing with FSA’s Defining L(M) –Defining language class LFSA –Comparing LFSA to set of solvable languages (REC)

90 Finite State Automata New Computational Model

91 Tape We assume that you have already seen FSA’s in CSE 260 –If not, review material in reference textbook Only data structure is a tape –Input appears on tape followed by a B character marking the end of the input –Tape is scanned by a tape head that starts at leftmost cell and always scans to the right

92 Data type/States The only data type for an FSA is char The instructions in an FSA are referred to as states Each instruction can be thought of as a switch statement with several cases based on the char being scanned by the tape head

93 Example program 1 switch(current tape cell) { case a: goto 2 case b: goto 2 case B: return yes } 2 switch (current tape cell) { case a: goto 1 case b: goto 1 case B: return no; }

94 New model of computation FSA M = (Q, Σ, q_0, A, δ) –Q = set of states = {1,2} –Σ = character set = {a,b} (don’t need B as we see below) –q_0 = initial state = 1 –A = set of accepting (final) states = {1}; A is the set of states where we return yes on B, Q-A is the set of states that return no on B –δ = state transition function 1 switch(current tape cell) { case a: goto 2 case b: goto 2 case B: return yes } 2 switch (current tape cell) { case a: goto 1 case b: goto 1 case B: return no; }
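As a rough C++ rendering of this two-state machine, the sketch below keeps the current state in a variable and treats the end of the string as the B marker. The function name and the choice to reject characters outside {a,b} are assumptions made for the example, not part of the slides.

```cpp
#include <cassert>
#include <string>

// Simulates the FSA with Q = {1,2}, q_0 = 1, A = {1}.
bool accepts_even_length(const std::string& x) {
    int state = 1;                                // q_0 = 1
    for (char c : x) {
        if (c != 'a' && c != 'b') return false;   // assumption: reject symbols outside the alphabet
        state = (state == 1) ? 2 : 1;             // delta(1,a) = delta(1,b) = 2, delta(2,a) = delta(2,b) = 1
    }
    return state == 1;                            // reaching B: yes iff the state is in A
}
```

The only memory the function uses is the single `state` variable, which is what makes it a faithful stand-in for a finite state automaton.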

95 Textual representations of δ 1 switch(current tape cell) { case a: goto 2 case b: goto 2 case B: return yes } 2 switch (current tape cell) { case a: goto 1 case b: goto 1 case B: return no; } Table: state 1: a -> 2, b -> 2; state 2: a -> 1, b -> 1 Function notation: δ(1,a) = 2, δ(1,b) = 2, δ(2,a) = 1, δ(2,b) = 1 Set of triples: {(1,a,2), (1,b,2), (2,a,1), (2,b,1)}

96 Transition diagrams [Figure: states 1 and 2, with an edge labeled a,b from 1 to 2 and an edge labeled a,b from 2 to 1; state 1 is the initial and accepting state.] 1 switch(current tape cell) { case a: goto 2 case b: goto 2 case B: return yes } 2 switch (current tape cell) { case a: goto 1 case b: goto 1 case B: return no; } Note, this transition diagram represents all 5 components of an FSA, not just the transition function δ

97 Exercise FSA M = (Q, Σ, q_0, A, δ) –Q = {1, 2, 3} –Σ = {a, b} –q_0 = 1 –A = {2,3} –δ: {δ(1,a) = 1, δ(1,b) = 2, δ(2,a) = 2, δ(2,b) = 3, δ(3,a) = 3, δ(3,b) = 1} Draw this FSA as a transition diagram

98 Transition Diagram [Figure: states 1, 2, 3; each state has a self-loop labeled a; edges labeled b go from 1 to 2, from 2 to 3, and from 3 to 1; state 1 is the initial state and states 2 and 3 are accepting.]

99 Computing with FSA’s

100 Computation Example [Figure: transition diagram of FSA M with states 1, 2, 3.] Input: aabbaa

101 Computation of FSA’s in detail A computation of an FSA M on an input x is a complete sequence of configurations We need to define –Initial configuration of the computation –How to determine the next configuration given the current configuration –Halting or final configurations of the computation

102 Initial Configuration Given an FSA M and an input string x, what is the initial configuration of the computation of M on x? –(q_0, x) –Examples: x = aabbaa gives (1, aabbaa); x = abab gives (1, abab); x = λ (the empty string) gives (1, λ) [Figure: transition diagram of FSA M.]

103 Definition of |-_M (1, aabbaa) |-_M (1, abbaa) –config 1 “yields” config 2 in one step using FSA M (1, aabbaa) |-^3_M (2, baa) –config 1 “yields” config 2 in 3 steps using FSA M (1, aabbaa) |-*_M (2, baa) –config 1 “yields” config 2 in 0 or more steps using FSA M Comment: –|-_M is determined by the transition function δ –There must always be one and only one next configuration; if not, M is not an FSA [Figure: transition diagram of FSA M.]

104 Halting Configurations Halting configuration –(q, λ), where λ is the empty string –Examples: (1, λ), (3, λ) Accepting Configuration –State in halting configuration is in A Rejecting Configuration –State in halting configuration is not in A [Figure: transition diagram of FSA M.]

105 FSA M on x Two possibilities for M running on x –M accepts x: M accepts x iff the computation of M on x ends up in an accepting configuration, (q_0, x) |-*_M (q, λ) where q is in A –M rejects x: M rejects x iff the computation of M on x ends up in a rejecting configuration, (q_0, x) |-*_M (q, λ) where q is not in A –M does not loop or crash on x [Figure: transition diagram of FSA M.]

106 Examples –For the following input strings, does M accept or reject? aa, aabba, aab, babbb [Figure: transition diagram of FSA M.]

107 Definition of δ*(q, x) Notation from the book: δ(q, c) = p; δ^k(q, x) = p, where k is the length of x; δ*(q, x) = p Examples –δ(1, a) = 1 –δ(1, b) = 2 –δ^4(1, abbb) = 1 –δ*(1, abbb) = 1 –δ*(2, baaaaa) = 3 [Figure: transition diagram of FSA M.]
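The extended transition function is easy to express as a loop over the input. The sketch below stores the transition function of the three-state FSA from the exercise slide in a table and computes the state reached from q after reading x. The struct layout and the function names are assumptions made for illustration.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>

struct FSA {
    std::map<std::pair<int, char>, int> delta;  // transition function
    int start;                                  // q_0
    std::set<int> accepting;                    // A
};

// The FSA from the exercise slide: Q = {1,2,3}, q_0 = 1, A = {2,3}.
FSA exercise_fsa() {
    FSA m;
    m.delta[std::make_pair(1, 'a')] = 1; m.delta[std::make_pair(1, 'b')] = 2;
    m.delta[std::make_pair(2, 'a')] = 2; m.delta[std::make_pair(2, 'b')] = 3;
    m.delta[std::make_pair(3, 'a')] = 3; m.delta[std::make_pair(3, 'b')] = 1;
    m.start = 1;
    m.accepting.insert(2);
    m.accepting.insert(3);
    return m;
}

// Extended transition function: the state reached from q after reading all of x.
// map::at throws std::out_of_range if x contains a character outside the alphabet.
int delta_star(const FSA& m, int q, const std::string& x) {
    for (char c : x) q = m.delta.at(std::make_pair(q, c));
    return q;
}

// M accepts x iff the computation from q_0 ends in a state of A.
bool accepts(const FSA& m, const std::string& x) {
    return m.accepting.count(delta_star(m, m.start, x)) > 0;
}
```

With `FSA m = exercise_fsa();`, the call `delta_star(m, 1, "abbb")` reproduces the slide's example by returning state 1.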

108 L(M) and LFSA L(M) or Y(M) –The set of strings M accepts; basically the same as Y(P) from the previous unit –We say that M accepts/decides/recognizes/solves L(M); remember an FSA will not loop or crash –What is L(M) (or Y(M)) for the FSA M in the figure? N(M) –Rarely used, but it is the set of strings M rejects LFSA –L is in LFSA iff there exists an FSA M such that L(M) = L. [Figure: transition diagram of FSA M.]

109 LFSA Unit Overview Study limits of LFSA –Understand what languages are in LFSA Develop techniques for showing L is in LFSA –Understand what languages are not in LFSA Develop techniques for showing L is not in LFSA Prove Closure Properties of LFSA Identify relationship of LFSA to other language classes

110 Comparing language classes Showing LFSA is a subset of REC, the set of solvable languages

111 LFSA subset REC Proof –Let L be an arbitrary language in LFSA –Let M be an FSA such that L(M) = L M exists by definition of L in LFSA –Construct C++ program P from FSA M –Argue P solves L –There exists a C++ program P which solves L –L is solvable

112 Visualization
[Figure: language L inside LFSA and inside REC; FSA M (among FSA’s) is converted to program P (among C++ programs)]
Let L be an arbitrary language in LFSA
Let M be an FSA such that L(M) = L
–M exists by definition of L in LFSA
Construct C++ program P from FSA M
Argue P solves L
There exists a program P which solves L
L is solvable

113 Construction
The construction is an algorithm A which solves a problem with a program as input
–Input to A: FSA M
–Output of A: C++ program P such that P solves L(M)
–How do we do this?
[Figure: Construction Algorithm takes FSA M and produces Program P]

114 Comparing computational models The previous slides show one method for comparing the relative power of two different computational models –Computational model CM 1 is at least as general or powerful as computational model CM 2 if Any program P 2 from computational model CM 2 can be converted into an equivalent program P 1 in computational model CM 1. –Question: How can we show two computational models are equivalent?

115 Module 16 Distinguishability –Definition –Help in designing/debugging FSA’s

116 Distinguishability

117 Questions
Let L be the set of strings over {a,b} which end with aaba. Let M be an FSA such that L(M) = L.
Questions
–Can aaba and aab end up in the same state of M? Why or why not?
–How about aa and aab?
–How about λ and a?
–How about b and bb?
–How about λ and bbab?

118 Definition *
String x is distinguishable from string y with respect to language L iff there exists a string z such that
–xz is in L and yz is not in L OR
–xz is not in L and yz is in L
When reviewing, identify the z for each pair of strings on the previous slide
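
A brute-force search makes the definition concrete. The sketch below uses the language from the previous slide (strings over {a,b} ending with aaba). The function names and the length bound are assumptions for illustration. A result of None only means no distinguishing z was found within the bound. For this particular L, any z of length 4 or more decides membership by itself, so the bound of 4 is enough.

```python
from itertools import product


def in_L(s):
    """Membership test for L = strings over {a,b} that end with aaba."""
    return s.endswith("aaba")


def distinguishing_suffix(x, y, member, alphabet="ab", max_len=4):
    """Return a shortest z (length <= max_len) with exactly one of xz, yz in L.

    Returns None if no such z exists within the bound.
    """
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            z = "".join(letters)
            if member(x + z) != member(y + z):
                return z
    return None
```

For example, `distinguishing_suffix("aa", "aab", in_L)` finds z = a, since aaa is not in L while aaba is, so aa and aab cannot end up in the same state of M.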

119 Questions
Let L be the set of strings over {a,b} that have length 2 mod 5 or 4 mod 5. Let M be an FSA such that L(M) = L.
Questions
–Are aa and aab distinguishable with respect to L? Can they end up in the same state of M?
–How about aa and aaba?
–How about λ and a?
–How about b and aabbaa?

120 Design an FSA to accept L: one design method
L = set of strings x over {a,b} such that length of x is 2 or 4 mod 5
–Is λ in L? Implication?
–Is a distinguishable from λ wrt L? Implication?
–Is b distinguishable from λ wrt L? Implication?
–Is b distinguishable from a wrt L? Implication?

121 Design continued
L = set of strings x over {a,b} such that length of x is 2 or 4 mod 5
–Is aa distinguishable from λ wrt L? Implication?
–Is aa distinguishable from a wrt L? Implication?

122 Design continued
L = set of strings x over {a,b} such that length of x is 2 or 4 mod 5
–What strings would we compare ab to?
–What results do we get?
–Implications?
–How about ba?
–How about bb?

123 Design continued
L = set of strings x over {a,b} such that length of x is 2 or 4 mod 5
–We can continue in this vein, but it could go on forever
–Now let’s try something different
–Consider string λ. What set of strings are indistinguishable from it wrt L? Implications?

124 Design continued
L = set of strings x over {a,b} such that length of x is 2 or 4 mod 5
–Consider string a. What set of strings are indistinguishable from it wrt L? Implications?
–Consider string aa. What set of strings are indistinguishable from it wrt L? Implications?

125 Debugging an FSA Do essentially the same thing –Identify some strings which end up in each state –Try and generalize each state to describe the language of strings which end up at that state.

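126 Example 1 [Figure: FSA with states I to VI and arcs labeled a, b, and a,b]

127 Example 2 [Figure: FSA with states I to V and arcs labeled a, b, and a,b]

128 Example 3 [Figure: FSA with states I to IV and arcs labeled a, b, and a,b]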

129 Module 17 Closure Properties of Language class LFSA –Remember ideas used in solvable languages unit –Set complement –Set intersection, union, difference, symmetric difference

130 LFSA is closed under set complement
If L is in LFSA, then L^c is in LFSA
Proof
–Let L be an arbitrary language in LFSA
–Let M be the FSA such that L(M) = L; M exists by definition of L in LFSA
–Construct FSA M’ from M
–Argue L(M’) = L^c
–There exists an FSA M’ such that L(M’) = L^c
–L^c is in LFSA

131 Visualization
[Figure: languages L and L^c inside LFSA; FSA M is converted to FSA M’ among the FSA’s]
Let L be an arbitrary language in LFSA
Let M be the FSA such that L(M) = L
–M exists by definition of L in LFSA
Construct FSA M’ from M
Argue L(M’) = L^c
L^c is in LFSA

132 Construct FSA M’ from M What did we do when we proved that REC, the set of solvable languages, is closed under set complement? Construct program P’ from program P Can we translate this to the FSA setting?

133 Construct FSA M’ from M
M = (Q, Σ, q0, A, δ)
M’ = (Q’, Σ’, q’, A’, δ’)
–M’ should say yes when M says no
–M’ should say no when M says yes
–How?
Q’ = Q
Σ’ = Σ
q’ = q0
δ’ = δ
A’ = Q − A

134 Example
[Figure: FSA M and FSA M’, each with states 1, 2, 3 and arcs labeled a and b; the accepting states are swapped]
Q’ = Q, Σ’ = Σ, q’ = q0, δ’ = δ, A’ = Q − A

135 Construction is an algorithm *
Set Complement Construction
–Algorithm Specification
Input: FSA M
Output: FSA M’ such that L(M’) = L(M)^c
–Comments
This algorithm can be in any computational model. It does not have to be (and typically is not) an FSA
These set closure constructions are useful. More on this later
[Figure: Construction Algorithm takes FSA M and produces FSA M’]

136 Specification of the algorithm
Your algorithm must give a complete specification of M’ in terms of M
–Example:
Let input FSA M = (Q, Σ, q0, A, δ)
Output FSA M’ = (Q’, Σ’, q’, A’, δ’) where
–Q’ = Q
–Σ’ = Σ
–q’ = q0
–δ’ = δ
–A’ = Q − A
When I ask for such a construction algorithm specification, this type of answer is what I am looking for. Further algorithmic details on how such an algorithm would work are unnecessary.
[Figure: Construction Algorithm takes FSA M and produces FSA M’]
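
As a sanity check on the specification, here is a minimal Python sketch of the complement construction. An FSA is represented as a tuple (Q, Σ, q0, A, δ) in the order used on this slide. The sample machine M, the helper names, and its accepting set are assumptions for illustration only.

```python
from itertools import product


def complement(fsa):
    """Return M' = (Q, Sigma, q0, Q - A, delta): only the accepting set changes."""
    states, alphabet, start, accepting, delta = fsa
    return (states, alphabet, start, states - accepting, delta)


def accepts(fsa, x):
    """Run the FSA on x and report whether it halts in an accepting state."""
    _, _, q, accepting, delta = fsa
    for c in x:
        q = delta[(q, c)]
    return q in accepting


def strings_up_to(alphabet, max_len):
    """Yield every string over the alphabet of length at most max_len."""
    for n in range(max_len + 1):
        for letters in product(sorted(alphabet), repeat=n):
            yield "".join(letters)


# Hypothetical sample FSA: a loops on every state, b cycles 1 -> 2 -> 3 -> 1.
M = (
    {1, 2, 3},
    {"a", "b"},
    1,
    {3},
    {(1, "a"): 1, (1, "b"): 2, (2, "a"): 2,
     (2, "b"): 3, (3, "a"): 3, (3, "b"): 1},
)
M_COMP = complement(M)
```

The construction works because an FSA always halts in exactly one state on every input, so flipping membership in A flips the answer.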

137 LFSA closed under Set Intersection Operation (also set union, set difference, and symmetric difference)

138 LFSA closed under set intersection operation * Let L 1 and L 2 be arbitrary languages in LFSA Let M 1 and M 2 be FSA’s s.t. L(M 1 ) = L 1, L(M 2 ) = L 2 –M 1 and M 2 exist by definition of L 1 and L 2 in LFSA Construct FSA M 3 from FSA’s M 1 and M 2 Argue L(M 3 ) = L 1 intersect L 2 There exists FSA M 3 s.t. L(M 3 ) = L 1 intersect L 2 L 1 intersect L 2 is in LFSA

139 Visualization
[Figure: languages L1, L2, and L1 intersect L2 inside LFSA; FSA’s M1 and M2 are combined into FSA M3]
Let L1 and L2 be arbitrary languages in LFSA
Let M1 and M2 be FSA’s s.t. L(M1) = L1, L(M2) = L2
–M1 and M2 exist by definition of L1 and L2 in LFSA
Construct FSA M3 from FSA’s M1 and M2
Argue L(M3) = L1 intersect L2
There exists FSA M3 s.t. L(M3) = L1 intersect L2
L1 intersect L2 is in LFSA

140 Algorithm Specification
Input
–Two FSA’s M1 and M2
Output
–FSA M3 such that L(M3) = L(M1) intersection L(M2)
[Figure: algorithm Alg takes FSA M1 and FSA M2 and produces FSA M3]

141 Use Old Ideas
Key concept: Try ideas from previous closure property proofs
Example
–How did the algorithm that was used to prove that REC is closed under set intersection work?
–If we adapt this approach, what should M3 do with respect to M1, M2, and the input string?
[Figure: algorithm Alg takes FSA M1 and FSA M2 and produces FSA M3]

142 Run M1 and M2 Simultaneously
[Figure: FSA M1 with states λ, 0, 1, 2 and arcs labeled 0 and 1; FSA M2 with states A, B and arcs labeled 0, 1, and 0,1; FSA M3 with states (λ,A), (0,A), (1,A), (2,A), (λ,B), (0,B), (1,B), (2,B)]
What happens when M1 and M2 run on input string 11010?

143 Construction *
Input
–FSA M1 = (Q1, Σ1, q1, δ1, A1)
–FSA M2 = (Q2, Σ2, q2, δ2, A2)
Output
–FSA M3 = (Q3, Σ3, q3, δ3, A3)
–What is Q3?
Q3 = Q1 X Q2 where X is cartesian product
In this case, Q3 = {(λ,A), (λ,B), (0,A), (0,B), (1,A), (1,B), (2,A), (2,B)}
–What is Σ3?
Σ3 = Σ1 = Σ2
In this case, Σ3 = {0,1}
[Figure: FSA’s M1 and M2]

144 Construction *
Input
–FSA M1 = (Q1, Σ1, q1, δ1, A1)
–FSA M2 = (Q2, Σ2, q2, δ2, A2)
Output
–FSA M3 = (Q3, Σ3, q3, δ3, A3)
–What is q3?
q3 = (q1, q2)
In this case, q3 = (λ,A)
–What is A3?
A3 = {(p, q) | p in A1 and q in A2}
In this case, A3 = {(0,B)}
[Figure: FSA’s M1 and M2]

145 Construction
Input
–FSA M1 = (Q1, Σ1, q1, δ1, A1)
–FSA M2 = (Q2, Σ2, q2, δ2, A2)
Output
–FSA M3 = (Q3, Σ3, q3, δ3, A3)
–What is δ3?
For all p in Q1, q in Q2, a in Σ, δ3((p,q),a) = (δ1(p,a), δ2(q,a))
In this case,
–δ3((0,A),0) = (δ1(0,0), δ2(A,0)) = (0,B)
–δ3((0,A),1) = (δ1(0,1), δ2(A,1)) = (1,A)
[Figure: FSA’s M1 and M2]

146 Example Summary
[Figure: FSA’s M1 and M2 and the resulting product FSA M3 with states (λ,A), (0,A), (1,A), (2,A), (λ,B), (0,B), (1,B), (2,B) and arcs labeled 0 and 1]

147 Observation
Input
–FSA M1 = (Q1, Σ1, q1, δ1, A1)
–FSA M2 = (Q2, Σ2, q2, δ2, A2)
Output
–FSA M3 = (Q3, Σ3, q3, δ3, A3)
–What is A3?
A3 = {(p, q) | p in A1 and q in A2}
What if the operation were different?
–Set union, set difference, symmetric difference

148 Observation continued *
Input
–FSA M1 = (Q1, Σ1, q1, δ1, A1)
–FSA M2 = (Q2, Σ2, q2, δ2, A2)
Output
–FSA M3 = (Q3, Σ3, q3, δ3, A3)
–What is A3?
Set intersection: A3 = {(p, q) | p in A1 and q in A2}
Set union: A3 = {(p, q) | p in A1 or q in A2}
Set difference: A3 = {(p, q) | p in A1 and q not in A2}
Symmetric difference: A3 = {(p, q) | (p in A1 and q not in A2) or (p not in A1 and q in A2)}
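
The four constructions differ only in how A3 is chosen, which a short Python sketch can make explicit. An FSA is a tuple (Q, Σ, q, δ, A) in the order used on these slides. The two sample machines (even number of a’s; ends with b), the `keep` parameter, and all names are assumptions for illustration, not the machines drawn on the slides.

```python
import operator
from itertools import product


def product_fsa(m1, m2, keep):
    """Build M3 that runs M1 and M2 simultaneously.

    keep(in_a1, in_a2) decides whether the pair state (p, q) is accepting.
    """
    q1s, sigma, s1, d1, a1 = m1
    q2s, _, s2, d2, a2 = m2
    states = set(product(q1s, q2s))
    delta = {((p, q), c): (d1[(p, c)], d2[(q, c)])
             for (p, q) in states for c in sigma}
    accepting = {(p, q) for (p, q) in states if keep(p in a1, q in a2)}
    return (states, sigma, (s1, s2), delta, accepting)


def accepts(fsa, x):
    """Run the FSA on x and report whether it halts in an accepting state."""
    _, _, q, delta, accepting = fsa
    for c in x:
        q = delta[(q, c)]
    return q in accepting


# Hypothetical M1: accepts strings with an even number of a's.
EVEN_A = ({"even", "odd"}, {"a", "b"}, "even",
          {("even", "a"): "odd", ("even", "b"): "even",
           ("odd", "a"): "even", ("odd", "b"): "odd"},
          {"even"})
# Hypothetical M2: accepts strings that end with b.
ENDS_B = ({"no", "yes"}, {"a", "b"}, "no",
          {("no", "a"): "no", ("no", "b"): "yes",
           ("yes", "a"): "no", ("yes", "b"): "yes"},
          {"yes"})

INTERSECTION = product_fsa(EVEN_A, ENDS_B, operator.and_)
UNION = product_fsa(EVEN_A, ENDS_B, operator.or_)
DIFFERENCE = product_fsa(EVEN_A, ENDS_B, lambda in1, in2: in1 and not in2)
SYMMETRIC = product_fsa(EVEN_A, ENDS_B, operator.xor)
```

Q3, q3, and δ3 are built identically in all four cases; only the predicate passed as `keep` changes.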

149 Observation conclusion LFSA is closed under –set intersection –set union –set difference –symmetric difference The constructions used to prove these closure properties are essentially identical

150 Comments * You should be able to execute this algorithm –Convert two FSA’s into a third FSA with the correct properties. You should understand the idea behind this algorithm –The third FSA essentially runs both input FSA’s simultaneously on any input string –How we set A 3 depending on the specific set operation You should understand how this algorithm can be used to simplify design of FSA’s You should be able to construct new algorithms for new closure property proofs

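151 Comparison *
[Figure, left: languages L1, L2, and L1 intersect L2 inside LFSA; FSA’s M1 and M2 are combined into FSA M3. Figure, right: language L in LFSA and in REC; FSA M is converted to C++ program P]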

152 Module 18 NFA’s –nondeterministic transition functions computations are trees, not paths –L(M) and LNFA LFSA subset of LNFA –Comparing computational models

153 Nondeterministic Finite State Automata NFA’s

154 Change: δ is a relation
For an FSA M, δ(q,a) results in one and only one state for all states q and characters a.
–That is, δ is a function
For an NFA M, δ(q,a) can result in a set of states
–That is, δ is now a relation
–Next step is not determined (nondeterministic)

155 Example NFA
[Figure: NFA with states 1 to 5]
Why is this only an NFA and not an FSA? Identify as many reasons as you can.

156 Computing with NFA’s Configurations: same as they are for FSA’s Computations are different –Initial configuration is identical –However, there may be several next configurations or there may be none. Computation is no longer a “path” but is now a “graph” (often a tree) rooted at the initial configuration –Definition of halting, accepting, and rejecting configurations is identical –Definition of acceptance must be modified

157 Computation Graph (Tree)
[Figure: NFA M with states 1 to 5]
Input string aaaaba
Computation tree, level by level:
(1, aaaaba)
(1, aaaba) (2, aaaba)
(1, aaba) (2, aaba) (3, aaba)
(1, aba) (2, aba) (3, aba) crash
(1, ba) (2, ba) (3, ba)
(1, a) (4, a)
(1, λ) (2, λ) (5, λ)

159 Acceptance and Rejection *
[Figure: NFA M with states 1 to 5]
Input string aaaaba
M accepts string x if one of the configurations reached is an accepting configuration
–(q0, x) |-* (f, λ), f in A
M rejects string x if all configurations reached are either not halting configurations or are rejecting configurations
Computation tree, level by level:
(1, aaaaba)
(1, aaaba) (2, aaaba)
(1, aaba) (2, aaba) (3, aaba)
(1, aba) (2, aba) (3, aba) crash
(1, ba) (2, ba) (3, ba)
(1, a) (4, a)
(1, λ) (2, λ) (5, λ)

160 Comparison
[Figure: an FSA and an NFA side by side, with arcs labeled a, b, and a,b]

161 Defining L(M) and LNFA
M accepts string x if one of the configurations reached is an accepting configuration
–(q0, x) |-* (f, λ), f in A
M rejects string x if all configurations reached are either not halting configurations or are rejecting configurations
L(M) (or Y(M))
–The set of strings accepted by M
N(M)
–The set of strings rejected by M
LNFA
–Language L is in language class LNFA iff there exists an NFA M such that L(M) = L

162 Comparing language classes LFSA subset of LNFA

163 LFSA subset LNFA Let L be an arbitrary language in LFSA Let M be the FSA such that L(M) = L –M exists by definition of L in LFSA Construct an NFA M’ such that L(M’) = L Argue L(M’) = L There exists an NFA M’ such that L(M’) = L L is in LNFA –By definition of L in LNFA

164 Visualization
[Figure: language L in LFSA and in LNFA; FSA M is converted to NFA M’]
Let L be an arbitrary language in LFSA
Let M be an FSA such that L(M) = L
–M exists by definition of L in LFSA
Construct NFA M’ from FSA M
Argue L(M’) = L
There exists an NFA M’ such that L(M’) = L
L is in LNFA

165 Construction We need to make M into an NFA M’ such that L(M’) = L(M) How do we accomplish this?

166 Module 19 LNFA subset of LFSA –Theorem 4.1 on page 131 of Martin textbook –Compare with set closure proofs Main idea –A state in FSA represents a set of states in original NFA

167 LNFA subset LFSA Let L be an arbitrary language in LNFA Let M be the NFA such that L(M) = L –M exists by definition of L in LNFA Construct an FSA M’ such that L(M’) = L Argue L(M’) = L There exists an FSA M’ such that L(M’) = L L is in LFSA –By definition of L in LFSA

168 Visualization
[Figure: language L in LNFA and in LFSA; NFA M is converted to FSA M’]
Let L be an arbitrary language in LNFA
Let M be an NFA such that L(M) = L
–M exists by definition of L in LNFA
Construct FSA M’ from NFA M
Argue L(M’) = L
There exists an FSA M’ such that L(M’) = L
L is in LFSA

169 Construction Specification We need an algorithm which does the following –Input: NFA M –Output: FSA M’ such that L(M’) = L(M)

170 Difficulty *
An NFA can be in several states after processing an input string x
[Figure: NFA M with states 1 to 5]
Input string aaaaba
Computation tree, level by level:
(1, aaaaba)
(1, aaaba) (2, aaaba)
(1, aaba) (2, aaba) (3, aaba)
(1, aba) (2, aba) (3, aba) crash
(1, ba) (2, ba) (3, ba)
(1, a) (4, a)
(1, λ) (2, λ) (5, λ)

171 Observation *
All strings which end up in the set of states {1,2,3} are indistinguishable with respect to L(M)
[Figure: NFA M with states 1 to 5, and the same computation tree for input string aaaaba]

172 Idea
Given an NFA M = (Q, Σ, q0, δ, A), the equivalent FSA M’ should have one state for each subset of Q
Example
–In this case there are 5 states in Q
–There are 2^5 subsets of Q including {} and Q
–The FSA M’ will have 2^5 states
What strings end up in state {1,2,3} of M’?
–The strings which end up in states 1, 2, and 3 of NFA M.
–In this case, strings which do not contain aaba and end with aa such as aa, aaa, and aaaa.
[Figure: NFA M with states 1 to 5; input string aaaaba]

173 Idea Illustrated
[Figure: NFA M with states 1 to 5]
Input string aaaaba
Each level of the NFA computation tree corresponds to one configuration of the FSA M’:
(1, aaaaba) : ({1}, aaaaba)
(1, aaaba) (2, aaaba) : ({1,2}, aaaba)
(1, aaba) (2, aaba) (3, aaba) : ({1,2,3}, aaba)
(1, aba) (2, aba) (3, aba) : ({1,2,3}, aba)
(1, ba) (2, ba) (3, ba) : ({1,2,3}, ba)
(1, a) (4, a) : ({1,4}, a)
(1, λ) (2, λ) (5, λ) : ({1,2,5}, λ)

174 Construction
Input NFA M = (Q, Σ, q0, δ, A)
Output FSA M’ = (Q’, Σ’, q’, δ’, A’)
–What is Q’?
all subsets of Q including Q and {}
In this case, Q’ =
–What is Σ’?
We always make Σ’ = Σ
In this case, Σ’ = Σ = {a,b}
–What is q’?
We always make q’ = {q0}
In this case q’ =
[Figure: NFA M with states 1, 2, 3 and arcs labeled a,b, a, a]

175 Construction
Input NFA M = (Q, Σ, q0, δ, A)
Output FSA M’ = (Q’, Σ’, q’, δ’, A’)
–What is A’?
Suppose a string x ends up in states 1 and 2 of the NFA M above.
–Is x accepted by M?
–Should {1,2} be an accepting state in FSA M’?
Suppose a string x ends up in states 1 and 2 and 3 of the NFA M above.
–Is x accepted by M?
–Should {1,2,3} be an accepting state in FSA M’?
Suppose p = {q1, q2, …, qk} where q1, q2, …, qk are in Q
p is in A’ iff at least one of the states q1, q2, …, qk is in A
In this case, A’ =
[Figure: NFA M with states 1, 2, 3 and arcs labeled a,b, a, a]

176 Construction
Input NFA M = (Q, Σ, q0, δ, A)
Output FSA M’ = (Q’, Σ’, q’, δ’, A’)
–What is δ’?
If string x ends up in states 1 and 2 after being processed by the NFA above, where does string xa end up after being processed by the NFA above?
Figuring out δ’(p,a) in general
–Suppose p = {q1, q2, …, qk} where q1, q2, …, qk are in Q
–Then δ’(p,a) = δ(q1,a) union δ(q2,a) union … union δ(qk,a)
»Similar to 2 FSA to 1 FSA construction
–In this case
»δ’({1,2},a) =
[Figure: NFA M with states 1, 2, 3 and arcs labeled a,b, a, a]

177 Construction Summary
Input NFA M = (Q, Σ, q0, δ, A)
Output FSA M’ = (Q’, Σ’, q’, δ’, A’)
–Q’ = all subsets of Q including Q and {}
In this case, Q’ = {{}, {1}, {2}, {3}, {1,2}, {1,3}, {2,3}, {1,2,3}}
–Σ’ = Σ
In this case, Σ’ = Σ = {a,b}
–q’ = {q0}
In this case, q’ = {1}
–A’
Suppose p = {q1, q2, …, qk} where q1, q2, …, qk are in Q
p is in A’ iff at least one of the states q1, q2, …, qk is in A
–δ’
Suppose p = {q1, q2, …, qk} where q1, q2, …, qk are in Q
Then δ’(p,a) = δ(q1,a) union δ(q2,a) union … union δ(qk,a)
[Figure: NFA M with states 1, 2, 3 and arcs labeled a,b, a, a]
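
The summary above translates almost line for line into Python. The sketch below builds the full subset FSA (every subset of Q, reachable or not). The NFA is assumed to be given as a dictionary from (state, character) to a set of states, with missing entries meaning "no move". The sample NFA is a hypothetical three-state machine (A to B on a; B loops on a and b; B to C on b; C accepting), and all names are illustrative.

```python
from itertools import combinations


def subset_construction(states, alphabet, start, delta, accepting):
    """Return the FSA (Q', Sigma', q', delta', A') equivalent to the given NFA."""
    subsets = [frozenset(c)
               for r in range(len(states) + 1)
               for c in combinations(sorted(states), r)]
    # delta'(p, a) is the union of delta(q, a) over all q in p.
    new_delta = {
        (p, a): frozenset().union(*(delta.get((q, a), set()) for q in p))
        for p in subsets for a in alphabet
    }
    # p is accepting iff it contains at least one accepting state of the NFA.
    new_accepting = {p for p in subsets if p & accepting}
    return set(subsets), alphabet, frozenset({start}), new_delta, new_accepting


def accepts(fsa, x):
    """Run the FSA on x and report whether it halts in an accepting state."""
    _, _, q, delta, accepting = fsa
    for c in x:
        q = delta[(q, c)]
    return q in accepting


# Hypothetical NFA used only to exercise the construction.
NFA_DELTA = {("A", "a"): {"B"}, ("B", "a"): {"B"}, ("B", "b"): {"B", "C"}}
FSA = subset_construction({"A", "B", "C"}, {"a", "b"}, "A", NFA_DELTA, {"C"})
```

This version enumerates all 2^|Q| subsets, exactly as the slide specifies. Building only the subsets reachable from q’ is a common refinement but is not required for correctness.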

178 Example Summary
[Figure: NFA M with states 1, 2, 3, and FSA M’ with states {1}, {1,2}, {1,2,3}, {1,3}, {}, {2}, {3}, {2,3} and arcs labeled a, b, and a,b]

179 Example Summary Continued
[Figure: NFA M and FSA M’ as on the previous slide, with some states of M’ marked]
These states cannot be reached from the initial state and are unnecessary.

180 Example Summary Continued
[Figure: NFA M and the smaller FSA M’ with states {1}, {1,2}, {1,2,3}, {1,3}]
By examination, we can see that state {1,3} is unnecessary. However, this is a case-by-case optimization. It is not a general technique or algorithm.

181 Example 2
Step 1: name the three states of NFA M
[Figure: NFA M with states A, B, C and arcs labeled a, a,b, and b]

182 Step 2: transition table
[Figure: NFA M with states A, B, C]
State | a | b
{A} | {B} | {}
{B} | {B} | {B,C}
{} | {} | {}
{B,C} | {B} | {B,C}
δ’({B,C},a) = δ(B,a) U δ(C,a) = {B} U {} = {B}
δ’({B,C},b) = δ(B,b) U δ(C,b) = {B,C} U {} = {B,C}

183 Step 3: accepting states
[Figure: NFA M with states A, B, C, and the transition table from Step 2]
Which states should be accepting?
Any state which includes an accepting state of M, in this case, C.
A’ = {{B,C}}

184 Step 4: Answer
[Figure: NFA M with states A, B, C, and the transition table from Step 2]
Initial state is {A}
Set of final states A’ = {{B,C}}
This is sufficient. You do NOT need to turn this into a diagram.

185 Step 5: Optional
[Figure: NFA M with states A, B, C, and the diagram of FSA M’ with states {A}, {B}, {B,C}, {} and arcs labeled a, b, and a,b]

186 Comments You should be able to execute this algorithm –You should be able to convert any NFA into an equivalent FSA. You should understand the idea behind this algorithm –For an FSA M’, strings which end up in the same state of M’ are indistinguishable wrt L(M’) –For an NFA M, strings which end up in the same set of states of M are indistinguishable wrt L(M)

187 Comments You should understand the importance of this algorithm –Design tool We can design using NFA’s A computer will convert this NFA into an equivalent FSA –FSA’s can be executed by computers whereas NFA’s cannot (or at least cannot easily be run by computers) –Chaining together algorithms Perhaps it is easy to build NFA’s to accept L 1 and L 2 Use this algorithm to turn these NFA’s to FSA’s Use previous algorithm to build FSA to accept L 1 intersect L 2 You should be able to construct new algorithms for new closure property proofs

188 Module 20 NFA’s with λ-transitions –NFA-λ’s Formal definition Simplifies construction –LNFA-λ –Showing LNFA-λ is a subset of LNFA (extra credit) and therefore a subset of LFSA

189 Defining NFA-λ’s

190 Change: λ-transitions
We now allow an NFA M to change state without reading input
That is, we add the following category of transitions to δ
–δ(q, λ) is allowed

191 Example *
[Figure: NFA-λ with arcs labeled a, b, a,b, and λ]

192 Defining L(M) and LNFA-λ
M accepts string x if one of the configurations reached is an accepting configuration
–(q0, x) |-* (f, λ), f in A
M rejects string x if all configurations reached are either not halting configurations or are rejecting configurations
L(M) or Y(M)
N(M)
LNFA-λ
–Language L is in language class LNFA-λ iff

193 LNFA-λ subset LFSA
Recap of what we already know
–Let M be any NFA
–There exists an algorithm A1 which constructs an FSA M’ such that L(M’) = L(M)
New goal
–Let M be any NFA-λ
–There exists an algorithm A2 which constructs an FSA M’ such that L(M’) = L(M)

194 Visualization
Goal
–Let M be any NFA-λ
–There exists an algorithm A2 which constructs an FSA M’ such that L(M’) = L(M)
[Figure: algorithm A2 takes NFA-λ M and produces FSA M’]

195 Modified Goal
Question
–Can we use any existing algorithms to simplify the task of developing algorithm A2?
Yes, we can use algorithm A1 which converts an NFA M1 into an FSA M’ such that L(M’) = L(M1)
[Figure: algorithm A2 decomposed into two stages: A2’ converts NFA-λ M into NFA M1, then A1 converts NFA M1 into FSA M’]

196 New Goal (extra credit)
Difficulty
–NFA-λ M can make transitions on λ
–How can the NFA M1 simulate these λ-transitions?
[Figure: A2’ converts NFA-λ M into NFA M1; example NFA-λ M with states 1 to 6 and arcs labeled a, b, and λ]

197 Basic Idea
For each state q of M and each character σ of Σ, figure out which states are reachable from q taking any number of λ-transitions and exactly one transition on that character σ.
In the NFA M1, directly connect q to each of these states using an arc labeled with σ.
[Figure: NFA-λ M with states 1 to 6, and the NFA M1 under construction with new arcs labeled b]

198 Process State 2
[Figure: NFA-λ M with states 1 to 6, and NFA M1 after adding the arcs leaving state 2]

199 Process State 3
[Figure: NFA-λ M with states 1 to 6, and NFA M1 after adding the arcs leaving state 3]

200 Final Picture
[Figure: NFA-λ M with states 1 to 6, and the finished NFA M1 with arcs labeled a and b]

201 Construction
Input NFA-λ M = (Q, Σ, q0, δ, A)
Output NFA M1 = (Q1, Σ1, q1, δ1, A1)
–What is Q1?
Q1 = Q
In this case, Q1 = {1,2,3,4,5,6}
–What is Σ1?
Σ1 = Σ
In this case, Σ1 = Σ = {a,b}
–What is q1?
We always make q1 = q0
In this case q1 = 1
[Figure: NFA-λ M with states 1 to 6]

202 Construction
Input NFA-λ M = (Q, Σ, q0, δ, A)
Output NFA M1 = (Q1, Σ1, q1, δ1, A1)
–What is δ1?
δ1(q,a) = the set of states reachable from state q in M taking any number of λ-transitions and exactly one transition on the character a
–More on this later
In this case
–δ1(1,a) = {}
–δ1(1,b) = {3,4,5}
–What is A1?
A1 = A with one minor change
–If an accepting state is reachable from q0 using only λ-transitions, then we make q1 an element of A1
In this case, using only λ-transitions, no accepting state is reachable from q0, so A1 = A
[Figure: NFA-λ M with states 1 to 6]

203 Computing δ1(q,a)
δ1(q,a) = the set of states reachable from state q in M taking 0 or more λ-transitions and exactly one transition on the character a
–Break this down into three steps
First compute all states reachable from q using 0 or more λ-transitions
–We call this set of states Λ(q)
Next, compute all states reachable from any element of Λ(q) using the character a
–We can denote these states as δ(Λ(q),a)
Finally, compute all states reachable from states in δ(Λ(q),a) using 0 or more λ-transitions
–We denote these states as Λ(δ(Λ(q),a))
–This is the desired answer
[Figure: NFA-λ M with states 1 to 6]

204 Example
δ1(1,b) = {3,4,5}
–Compute Λ(1), all states reachable from state 1 using 0 or more λ-transitions
Λ(1) = {1,2}
–Compute δ(Λ(1),b), all states reachable from any element of Λ(1) using the character b
δ(Λ(1),b) = δ({1,2},b) = δ(1,b) U δ(2,b) = {} U {3} = {3}
–Compute Λ(δ(Λ(1),b)), all states reachable from states in δ(Λ(1),b) using 0 or more λ-transitions
Λ(δ(Λ(1),b)) = Λ({3}) = {3,4,5}
[Figure: NFA-λ M with states 1 to 6]
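
The three-step computation of δ1(q,a) can be sketched in Python as follows. The transition table is an assumption: it contains only arcs consistent with the worked example (λ-arcs 1 to 2, 3 to 4, and 4 to 5, plus a b-arc 2 to 3) and one made-up a-arc from 5 to 6. It is not the full machine drawn on the slide. The empty string is used as the label for λ-transitions.

```python
LAMBDA = ""  # label used here for lambda-transitions

# Partial, assumed transition relation of the NFA-lambda M.
DELTA = {
    (1, LAMBDA): {2},
    (2, "b"): {3},
    (3, LAMBDA): {4},
    (4, LAMBDA): {5},
    (5, "a"): {6},  # hypothetical arc, for illustration only
}


def closure(states):
    """Step 1 and step 3: all states reachable using 0 or more lambda-transitions."""
    seen = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for r in DELTA.get((q, LAMBDA), set()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen


def move(states, c):
    """Step 2: all states reachable from any given state on exactly one c."""
    return set().union(*(DELTA.get((q, c), set()) for q in states))


def delta1(q, c):
    """Transition function of the lambda-free NFA M1."""
    return closure(move(closure({q}), c))
```

With this table, `delta1(1, "b")` reproduces the three intermediate sets from the example: {1,2}, then {3}, then {3,4,5}.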

205 Comments
For extra credit, you should be able to execute this algorithm
–Convert any NFA-λ into an equivalent NFA.
For extra credit, you should understand the idea behind this algorithm
–Why the transition function is computed the way it is
–Why A1 may need to include q1 in some cases
You should understand the importance of this algorithm
–Compiler role again
–Use in combination with previous algorithm for converting any NFA into an equivalent FSA to create a new algorithm for converting any NFA-λ into an equivalent FSA

206 LNFA-λ = LFSA
Implications
–Let us primarily use the term LFSA to refer to this language class
–Given a language L is in LFSA
We know there exists an FSA M s.t. L(M) = L
We know there exists an NFA M s.t. L(M) = L
–To show a language L is in LFSA
Show there exists an FSA M s.t. L(M) = L
Show there exists an NFA-λ M s.t. L(M) = L

207 Module 21 Closure Properties for LFSA using NFA’s –From now on, when I say NFA, I mean any NFA including an NFA-λ unless I add a specific restriction –union (second proof) –concatenation –Kleene closure

208 LFSA closed under set union (again)

209 LFSA closed under set union Let L 1 and L 2 be arbitrary languages in LFSA Let M 1 and M 2 be NFA’s s.t. L(M 1 ) = L 1, L(M 2 ) = L 2 –M 1 and M 2 exist by definition of L 1 and L 2 in LFSA and the fact that every FSA is an NFA Construct NFA M 3 from NFA’s M 1 and M 2 Argue L(M 3 ) = L 1 union L 2 There exists NFA M 3 s.t. L(M 3 ) = L 1 union L 2 L 1 union L 2 is in LFSA

210 Visualization
[Figure: languages L1, L2, and L1 union L2 inside LFSA; NFA’s M1 and M2 are combined into NFA M3]
Let L1 and L2 be arbitrary languages in LFSA
Let M1 and M2 be NFA’s s.t. L(M1) = L1, L(M2) = L2
–M1 and M2 exist by definition of L1 and L2 in LFSA and the fact that every FSA is an NFA
Construct NFA M3 from NFA’s M1 and M2
Argue L(M3) = L1 union L2
There exists NFA M3 s.t. L(M3) = L1 union L2
L1 union L2 is in LFSA

211 Algorithm Specification
Input
–Two NFA’s M1 and M2
Output
–NFA M3 such that L(M3) = ?
[Figure: algorithm A takes NFA M1 and NFA M2 and produces NFA M3]

212 Use λ-transition
[Figure: example NFA’s M1 and M2 with arcs labeled a and a,b, and the NFA M3 built from them using λ-transitions]

213 General Case *
[Figure: arbitrary NFA’s M1 and M2, and the NFA M3 built from them using λ-transitions]

214 Construction *
Input
–NFA M1 = (Q1, Σ1, q1, δ1, A1)
–NFA M2 = (Q2, Σ2, q2, δ2, A2)
Output
–NFA M3 = (Q3, Σ3, q3, δ3, A3)
–What is Q3?
Q3 =
–What is Σ3?
Σ3 = Σ1 = Σ2
–What is q3?
q3 =
[Figure: algorithm A takes NFA M1 and NFA M2 and produces NFA M3]

215 Construction
Input
–NFA M1 = (Q1, Σ1, q1, δ1, A1)
–NFA M2 = (Q2, Σ2, q2, δ2, A2)
Output
–NFA M3 = (Q3, Σ3, q3, δ3, A3)
–What is A3?
A3 =
–What is δ3?
δ3 =
[Figure: algorithm A takes NFA M1 and NFA M2 and produces NFA M3]

216 Comments You should be able to execute this algorithm You should understand the idea behind this algorithm You should understand how this algorithm can be used to simplify design You should be able to design new algorithms for new closure properties You should understand how this helps prove result that regular languages and LFSA are identical –In particular, you should understand how this is used to construct an NFA M from a regular expression r s.t. L(M) = L(r) –To be seen later

217 LFSA closed under set concatenation

218 LFSA closed under set concatenation Let L 1 and L 2 be arbitrary languages in LFSA Let M 1 and M 2 be NFA’s s.t. L(M 1 ) = L 1, L(M 2 ) = L 2 –M 1 and M 2 exist by definition of L 1 and L 2 in LFSA and the fact that every FSA is an NFA Construct NFA M 3 from NFA’s M 1 and M 2 Argue L(M 3 ) = L 1 concatenate L 2 There exists NFA M 3 s.t. L(M 3 ) = L 1 concatenate L 2 L 1 concatenate L 2 is in LFSA

219 Visualization
[Figure: languages L1, L2, and L1 concatenate L2 inside LFSA; NFA’s M1 and M2 are combined into NFA M3]
Let L1 and L2 be arbitrary languages in LFSA
Let M1 and M2 be NFA’s s.t. L(M1) = L1, L(M2) = L2
–M1 and M2 exist by definition of L1 and L2 in LFSA and the fact that every FSA is an NFA
Construct NFA M3 from NFA’s M1 and M2
Argue L(M3) = L1 concatenate L2
There exists NFA M3 s.t. L(M3) = L1 concatenate L2
L1 concatenate L2 is in LFSA

220 Algorithm Specification
Input
–Two NFA’s M1 and M2
Output
–NFA M3 such that L(M3) =
[Figure: algorithm A takes NFA M1 and NFA M2 and produces NFA M3]

221 Use λ-transition
[Figure: example NFA’s M1 and M2 with arcs labeled a and a,b, and the NFA M3 built from them using λ-transitions]

222 General Case
[Figure: arbitrary NFA’s M1 and M2, and the NFA M3 built from them using λ-transitions]

223 Construction
Input
–NFA M1 = (Q1, Σ1, q1, δ1, A1)
–NFA M2 = (Q2, Σ2, q2, δ2, A2)
Output
–NFA M3 = (Q3, Σ3, q3, δ3, A3)
–What is Q3?
Q3 =
–What is Σ3?
Σ3 = Σ1 = Σ2
–What is q3?
q3 =
[Figure: algorithm A takes NFA M1 and NFA M2 and produces NFA M3]

224 Construction
Input
–NFA M1 = (Q1, Σ1, q1, δ1, A1)
–NFA M2 = (Q2, Σ2, q2, δ2, A2)
Output
–NFA M3 = (Q3, Σ3, q3, δ3, A3)
–What is A3?
A3 =
–What is δ3?
δ3 =
[Figure: algorithm A takes NFA M1 and NFA M2 and produces NFA M3]

225 Comments You should be able to execute this algorithm You should understand the idea behind this algorithm You should understand how this algorithm can be used to simplify design You should be able to design new algorithms for new closure properties You should understand how this helps prove result that regular languages and LFSA are identical –In particular, you should understand how this is used to construct an NFA M from a regular expression r s.t. L(M) = L(r) –To be seen later

226 LFSA closed under Kleene Closure

227 LFSA closed under Kleene Closure
Let L1 be an arbitrary language in LFSA
Let M1 be an NFA s.t. L(M1) = L1
–M1 exists by definition of L1 in LFSA and the fact that every FSA is an NFA
Construct NFA M2 from NFA M1
Argue L(M2) = L1*
There exists NFA M2 s.t. L(M2) = L1*
L1* is in LFSA

228 Visualization
[Figure: languages L1 and L1* inside LFSA; NFA M1 is converted to NFA M2]
Let L1 be an arbitrary language in LFSA
Let M1 be an NFA s.t. L(M1) = L1
–M1 exists by definition of L1 in LFSA and the fact that every FSA is an NFA
Construct NFA M2 from NFA M1
Argue L(M2) = L1*
There exists NFA M2 s.t. L(M2) = L1*
L1* is in LFSA

229 Algorithm Specification
Input
–NFA M1
Output
–NFA M2 such that L(M2) =
[Figure: algorithm A takes NFA M1 and produces NFA M2]

230 Use λ-transition
[Figure: example NFA M1 with an arc labeled a, and the NFA M2 built from it using λ-transitions]

231 General Case *
[Figure: arbitrary NFA M1 and the NFA built from it using λ-transitions]

232 Construction
Input
–NFA M1 = (Q1, Σ1, q1, δ1, A1)
Output
–NFA M2 = (Q2, Σ2, q2, δ2, A2)
–What is Q2?
–What is Σ2?
Σ2 = Σ1
–What is q2?
q2 =
[Figure: algorithm A takes NFA M1 and produces NFA M2]

233 Construction
Input
–NFA M1 = (Q1, Σ1, q1, δ1, A1)
Output
–NFA M2 = (Q2, Σ2, q2, δ2, A2)
–What is A2?
A2 =
–What is δ2?
δ2 =
[Figure: algorithm A takes NFA M1 and produces NFA M2]

234 Comments You should be able to execute this algorithm You should understand the idea behind this algorithm –Why do we need to make an extra state p? You should understand how this algorithm can be used to simplify design You should be able to design new algorithms for new closure properties You should understand how this helps prove result that regular languages and LFSA are identical –In particular, you should understand how this is used to construct an NFA M from a regular expression r s.t. L(M) = L(r) –To be seen later

235 Module 22 Regular languages are a subset of LFSA –algorithm for converting any regular expression into an equivalent NFA –Builds on existing algorithms described in previous lectures

236 Regular languages are a subset of LFSA

237 Reg. Lang. subset LFSA
Let L be an arbitrary regular language
Let R be the regular expression such that L(R) = L
–R exists by the definition of a regular language
Construct an NFA-λ M such that L(M) = L
–M is constructed from regular expression R
Argue L(M) = L
There exists an NFA-λ M such that L(M) = L
L is in LFSA
–By definition of L in LFSA and equivalence of LFSA and LNFA-λ

238 Visualization
[Figure: language L among the regular languages and in LFSA; regular expression R is converted to NFA-λ M]
Let L be an arbitrary regular language
Let R be the regular expression such that L(R) = L
–R exists by the definition of a regular language
Construct an NFA-λ M such that L(M) = L
–M is constructed from regular expression R
Argue L(M) = L
There exists an NFA-λ M such that L(M) = L
L is in LFSA
–By definition of L in LFSA and equivalence of LFSA and LNFA-λ

239 Algorithm Specification
Input
–Regular expression R
Output
–NFA M such that L(M) =
[Figure: algorithm A takes regular expression R and produces NFA-λ M]

240 Recursive Algorithm We have an inductive definition for regular languages and regular expressions Our algorithm for converting any regular expression into an equivalent NFA is recursive in nature –Base Case –Recursive or inductive Case

241 Base Case Regular expression R has zero operators –No concatenation, union, Kleene closure –For any alphabet , only |  | + 2 regular languages can be depicted by any regular expression with zero operators The empty language  The language { } The |  | languages consisting of one string {a} for all a in 

242 Table lookup Finite number of base cases means we can use table lookup to handle them [Figure: lookup table pairing each base-case regular expression (a, b, λ, ∅) with its NFA.]

243 Recursive Case Regular expression R has at least one operator –This means R is built up from smaller regular expressions using the union, Kleene closure, or concatenation operators –More specifically, there are 3 cases: R = R 1 +R 2 R = R 1 R 2 R = R 1 *

244 Recursive Calls The algorithm recursively calls itself to generate NFA’s M 1 and M 2 which accept L(R 1 ) and L(R 2 ) The algorithm applies the appropriate construction –union –concatenation –Kleene closure to NFA’s M 1 and M 2 to produce an NFA M such that L(M) = L(R) 1) R = R 1 + R 2 2) R = R 1 R 2 3) R = R 1 *

245 Pseudocode Algorithm _____________ RegExptoNFA(_____________) { regular expression R 1, R 2 ; NFA M 1, M 2 ; Modify R by removing unnecessary enclosing parentheses /* Base Case */ If R = a, return (NFA for {a}) /* include λ here */ If R = ∅, return (NFA for {}) /* Recursive Case */ Find “last operator O” of regular expression R Identify regular expressions R 1 (and R 2 if necessary) M 1 = RegExptoNFA(R 1 ) M 2 = RegExptoNFA(R 2 ) /* if necessary */ return (OP(M 1, M 2 )) /* OP is chosen based on O */ }

246 Example A: R = (b+a)a * Last operator is concatenation R 1 = (b+a) R 2 = a* Recursive call with R 1 = (b+a) B: R = (b+a) Extra parentheses stripped away Last operator is union R 1 = b R 2 = a Recursive call with R 1 = b

247 Example Continued C: R = b Base case NFA for {b} returned B: return to this invocation of procedure Recursive call where R = R 2 = a D: R = a Base case NFA for {a} returned B: return to this invocation of procedure return UNION(NFA for {b}, NFA for {a}) A: return to this invocation of procedure Recursive call where R = R 2 = a*

248 Example Finished E: R = a* Last operator is Kleene closure R 1 = a Recursive call where R = R 1 = a F: R = a Base case NFA for {a} returned E: return to this invocation of procedure return (KLEENE(NFA for {a})) A: return to this invocation of procedure return CONCAT(NFA for {b,a}, NFA for {a}*)

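249 Pictorial View [Figure: the NFAs for b and a are combined by union into an NFA for (b|a); the NFA for a is turned by Kleene closure into an NFA for a*; the two results are concatenated into an NFA for (b|a)a*.]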

250 Parse Tree We now present the “parse” tree for regular expression (b+a)a* [Figure: the root is concatenate; its left child is union with leaves b and a; its right child is Kleene closure with leaf a.]
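The recursive algorithm and the (b+a)a* walkthrough above can be sketched in Python. This is only a sketch under stated assumptions, not the course's implementation: regular expressions are well-formed strings of one-character symbols with + for union, implicit concatenation, and * for Kleene closure; the names regexp_to_nfa and accepts and the dictionary layout of an NFA are hypothetical; and the union, concatenation, and Kleene closure steps use standard Thompson-style λ-transition constructions, which may differ in detail from the constructions of the earlier modules.

```python
from itertools import count

LAMBDA = ""            # edge label for lambda-transitions (convention of this sketch)
LAMBDA_RE = "\u03bb"   # the one-character regular expression "lambda"
EMPTY_RE = "\u2205"    # the one-character regular expression for the empty language
_fresh = count()       # supplies fresh state names 0, 1, 2, ...


def _nfa(start, accept, edges):
    # One start state, one accepting state, edges as (source, label, target).
    return {"start": start, "accept": accept, "edges": edges}


def _base(label=None):
    # NFA for {label}; with label=None there is no edge, so the language is empty.
    s, f = next(_fresh), next(_fresh)
    return _nfa(s, f, [] if label is None else [(s, label, f)])


def _union(m1, m2):
    s, f = next(_fresh), next(_fresh)
    extra = [(s, LAMBDA, m1["start"]), (s, LAMBDA, m2["start"]),
             (m1["accept"], LAMBDA, f), (m2["accept"], LAMBDA, f)]
    return _nfa(s, f, m1["edges"] + m2["edges"] + extra)


def _concat(m1, m2):
    extra = [(m1["accept"], LAMBDA, m2["start"])]
    return _nfa(m1["start"], m2["accept"], m1["edges"] + m2["edges"] + extra)


def _kleene(m1):
    s, f = next(_fresh), next(_fresh)
    extra = [(s, LAMBDA, m1["start"]), (s, LAMBDA, f),
             (m1["accept"], LAMBDA, m1["start"]), (m1["accept"], LAMBDA, f)]
    return _nfa(s, f, m1["edges"] + extra)


def _strip_parens(r):
    # Remove parentheses only while the first "(" matches the final ")".
    while len(r) >= 2 and r[0] == "(" and r[-1] == ")":
        depth = 0
        for i, ch in enumerate(r):
            depth += (ch == "(") - (ch == ")")
            if depth == 0 and i < len(r) - 1:
                return r
        r = r[1:-1]
    return r


def _last_factor_start(r):
    # Index where the last concatenated factor (atom plus trailing stars) begins.
    i = len(r) - 1
    while r[i] == "*":
        i -= 1
    if r[i] == ")":
        depth = 0
        while True:
            depth += (r[i] == ")") - (r[i] == "(")
            if depth == 0:
                break
            i -= 1
    return i


def regexp_to_nfa(r):
    r = _strip_parens(r)
    if len(r) == 1:                                   # base case: zero operators
        if r == EMPTY_RE:
            return _base()
        return _base(LAMBDA if r == LAMBDA_RE else r)
    depth = 0
    for i in range(len(r) - 1, -1, -1):               # is the last operator a union?
        depth += (r[i] == ")") - (r[i] == "(")
        if r[i] == "+" and depth == 0:
            return _union(regexp_to_nfa(r[:i]), regexp_to_nfa(r[i + 1:]))
    i = _last_factor_start(r)
    if i > 0:                                         # last operator: concatenation
        return _concat(regexp_to_nfa(r[:i]), regexp_to_nfa(r[i:]))
    return _kleene(regexp_to_nfa(r[:-1]))             # last operator: Kleene closure


def accepts(m, x):
    def closure(states):
        # All states reachable from `states` using only lambda-transitions.
        seen, stack = set(states), list(states)
        while stack:
            q = stack.pop()
            for p, label, t in m["edges"]:
                if p == q and label == LAMBDA and t not in seen:
                    seen.add(t)
                    stack.append(t)
        return seen

    current = closure({m["start"]})
    for ch in x:
        current = closure({t for p, label, t in m["edges"]
                           if p in current and label == ch})
    return m["accept"] in current


m = regexp_to_nfa("(b+a)a*")
print([s for s in ["", "a", "b", "ab", "baaa"] if accepts(m, s)])
```

The call order on (b+a)a* mirrors invocations A through F of the example: concatenation is found first, then the union inside the parentheses, then the Kleene closure.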

251 Module 23 Regular languages review –Several ways to define regular languages –Two main types of proofs/algorithms Relative power of two computational models proofs/constructions Closure property proofs/constructions –Language class hierarchy Applications of regular languages

252 Defining regular languages

253 Three definitions LFSA –A language L is in LFSA iff there exists an FSA M s.t. L(M) = L LNFA –A language L is in LNFA iff there exists an NFA M s.t. L(M) = L Regular languages –A language L is regular iff there exists a regular expression R s.t. L(R) = L Conclusion –All these language classes are equivalent –Any language which can be represented using any one of these models can be represented using either of the other two models

254 Two types of proofs/constructions

255 Relative power proofs These proofs work between two language classes and two computational models The crux of these proofs are algorithms which behave as follows: –Input: One program from the first computational model –Output: A program from the second computational model that is equivalent in function to the first program

256 Closure property proofs These proofs work within a single language class and typically within a single computational model The crux of these proofs are algorithms which behave as follows: –Input: 1 or 2 programs from a given computational model –Output: A third program from the same computational model that accepts/describes a third language which is a combination of the languages accepted/described by the two input programs

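257 Comparison [Figure: on one side, a closure property construction within LFSA, where FSA’s M1 and M2 for L1 and L2 are combined into an FSA M3 for L1 intersect L2; on the other side, a relative power construction, where an NFA M for L in LNFA is converted into an FSA M’ for L in LFSA.]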

258 Language class hierarchy [Figure: nested language classes: regular, REC, RE, and all languages over alphabet Σ; H is marked in the diagram, along with a question mark.]

259 Three remaining topics Myhill-Nerode Theorem –Provides technique for proving a language is not regular –Also represents fundamental understanding of what a regular language is Decision problems about regular languages –Most are solvable in contrast to problems about recursive languages Pumping lemma –Provides technique for proving a language is not regular

260 Module 24 Myhill-Nerode Theorem –distinguishability –equivalence classes of strings –designing FSA’s –proving a language L is not regular

261 Distinguishability

262 Distinguishable and Indistinguishable String x is distinguishable from string y with respect to language L iff –there exists a string z such that xz is in L and yz is not in L OR xz is not in L and yz is in L String x is indistinguishable from string y with respect to language L iff –for all strings z, xz and yz are both in L OR xz and yz are both not in L

263 Example Let EVEN-ODD be the set of strings over {a,b} with an even number of a’s and an odd number of b’s –Is the string aa distinguishable from the string bb with respect to EVEN-ODD? –Is the string aa distinguishable from the string ab with respect to EVEN-ODD?
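The definition of distinguishability can be explored by brute force. The sketch below is an illustration only: the function names and the bound max_len are assumptions introduced here. A returned string z is a genuine witness that x and y are distinguishable; a result of None only means that no witness of length at most max_len exists, which is evidence for, not a proof of, indistinguishability.

```python
from itertools import product


def in_even_odd(s):
    # Membership in EVEN-ODD: even number of a's and odd number of b's.
    return s.count("a") % 2 == 0 and s.count("b") % 2 == 1


def distinguishing_suffix(x, y, in_language, alphabet="ab", max_len=4):
    # Try every z over the alphabet, shortest first, up to length max_len.
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            z = "".join(letters)
            if in_language(x + z) != in_language(y + z):
                return z
    return None


print(distinguishing_suffix("aa", "bb", in_even_odd))
print(distinguishing_suffix("aa", "ab", in_even_odd))
```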

264 Equivalence classes of strings

265 Definition of equivalence classes Every language L partitions Σ* into equivalence classes via indistinguishability –Two strings x and y belong to the same equivalence class defined by L iff x and y are indistinguishable w.r.t. L –Two strings x and y belong to different equivalence classes defined by L iff x and y are distinguishable w.r.t. L

266 Example How does EVEN-ODD partition {a,b}* into equivalence classes? Strings with an EVEN number of a’s and an EVEN number of b’s Strings with an EVEN number of a’s and an ODD number of b’s Strings with an ODD number of a’s and an EVEN number of b’s Strings with an ODD number of a’s and an ODD number of b’s

267 Second Example Let 1MOD3 be the set of strings over {a,b} whose length mod 3 = 1. How does 1MOD3 partition {a,b}* into equivalence classes? Length mod 3 = 0 Length mod 3 = 1 Length mod 3 = 2

268 Designing FSA’s

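269 Designing an FSA for EVEN-ODD [Figure: four-state FSA with one state per equivalence class (Even or Odd number of a's, Even or Odd number of b's); each a-transition flips the parity of a's and each b-transition flips the parity of b's.]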

270 Designing an FSA for 1MOD3 [Figure: three-state FSA with states Length mod 3 = 0, Length mod 3 = 1, and Length mod 3 = 2; each state moves to the next on a,b.]

271 Proving a language is not regular

272 Third Example Let EQUAL be the set of strings x over {a,b} s.t. the number of a’s in x = the number of b’s in x How does EQUAL partition {a,b}* into equivalence classes? How many equivalence classes are there? Can we construct a finite state automaton for EQUAL?

273 Myhill-Nerode Theorem

274 Theorem Statement Two part statement –If L is regular, then L partitions Σ* into a finite number of equivalence classes –If L partitions Σ* into a finite number of equivalence classes, then L is regular One part statement –L is regular iff L partitions Σ* into a finite number of equivalence classes

275 Implication 1 Method for constructing FSA’s to accept a language L –Identify equivalence classes defined by L –Make a state for each equivalence class –Identify initial and accepting states –Add transitions between the states You can use a canonical element of each equivalence class to help with building the transition function δ
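The four steps above can be sketched for EVEN-ODD, whose four equivalence classes are the parity pairs listed earlier. The dictionary layout of the FSA and the names build_even_odd_fsa and run are assumptions of this sketch.

```python
def build_even_odd_fsa():
    # One state per equivalence class: (number of a's mod 2, number of b's mod 2).
    states = [(pa, pb) for pa in (0, 1) for pb in (0, 1)]
    delta = {}
    for pa, pb in states:
        delta[((pa, pb), "a")] = ((pa + 1) % 2, pb)   # reading a flips the a-parity
        delta[((pa, pb), "b")] = (pa, (pb + 1) % 2)   # reading b flips the b-parity
    # The empty string has zero a's and zero b's, so the class (0, 0) is initial.
    # EVEN-ODD itself is the class (0, 1), so that state is the only accepting one.
    return {"start": (0, 0), "accepting": {(0, 1)}, "delta": delta}


def run(fsa, x):
    q = fsa["start"]
    for ch in x:
        q = fsa["delta"][(q, ch)]
    return q in fsa["accepting"]


fsa = build_even_odd_fsa()
print([s for s in ["", "b", "ab", "aab", "abab"] if run(fsa, s)])
```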

276 Implication 2 Method for proving a language L is not regular –Identify equivalence classes defined by L –Show there are an infinite number of such equivalence classes Table format may help, but it is only a way to help illustrate that there are an infinite number of equivalence classes defined by L

277 Proving a language is not regular revisited

278 Proving EQUAL is not regular Let EQUAL be the set of strings x over {a,b} s.t. the number of a’s in x = the number of b’s in x We want to show that EQUAL partitions {a,b}* into an infinite number of equivalence classes We will use a table that is somewhat reminiscent of the table used for diagonalization –Again, you must be able to identify the infinite number of equivalence classes being defined by the table. They ultimately represent the proof that EQUAL or whatever language you are working with is not regular.

279 Table *
         a     aa    aaa   aaaa  aaaaa  ...
b        IN    OUT   OUT   OUT   OUT    ...
bb       OUT   IN    OUT   OUT   OUT    ...
bbb      OUT   OUT   IN    OUT   OUT    ...
bbbb     OUT   OUT   OUT   IN    OUT    ...
bbbbb    OUT   OUT   OUT   OUT   IN     ...
...
The strings being distinguished are the rows. The table entries indicate that the concatenation of the row string with the column string is in or not in EQUAL. Each complete column shows one row string is distinguishable from all the other row strings.

280 Concluding EQUAL is nonregular * We have shown that EQUAL partitions {a,b}* into an infinite number of equivalence classes –In this case, we only identified some of the equivalence classes defined by EQUAL, but that is sufficient Thus, the Myhill-Nerode Theorem implies that EQUAL is nonregular
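A finite corner of the table can be generated and checked mechanically. The sketch below is illustrative only: SIZE and the helper names are assumptions, and checking a 5 by 5 corner does not replace the argument that the pattern continues for every row.

```python
def in_equal(s):
    # Membership in EQUAL: same number of a's and b's.
    return s.count("a") == s.count("b")


SIZE = 5
rows = ["b" * i for i in range(1, SIZE + 1)]   # strings being distinguished
cols = ["a" * j for j in range(1, SIZE + 1)]   # suffixes appended to each row string
table = {(r, c): in_equal(r + c) for r in rows for c in cols}


def distinguishing_column(r1, r2):
    # First column string on which the two rows disagree, if any.
    for c in cols:
        if table[(r1, c)] != table[(r2, c)]:
            return c
    return None


for r in rows:
    print(r.ljust(SIZE), " ".join("IN " if table[(r, c)] else "OUT" for c in cols))
```

Every pair of distinct rows disagrees in some column, so the row strings b, bb, bbb, ... all lie in different equivalence classes.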

281 Summary Myhill-Nerode Theorem and what it says –It does not say a language L is regular iff L is finite Many regular languages such as Σ* are not finite –It says that a language L is regular iff L partitions Σ* into a finite number of equivalence classes Provides method for designing FSA’s Provides method for proving a language L is not regular –Show that L partitions Σ* into an infinite number of equivalence classes

282 Two/Three Types of Problems Create a table that helps prove that a specific language L is not regular –You get to choose the “row” and “column” strings –I choose the “row” strings Identify the equivalence classes defined by L as highlighted by a given table

283 Module 25 Decision problems about regular languages –Basic problems are solvable halting, accepting, and emptiness problems –Solvability of other problems answer-preserving input transformations to basic problems

284 Programs In this unit, our programs are the following three types of objects –FSA’s –NFA’s –regular expressions Previously, they were C++ programs –Review those topics after mastering today’s examples

285 Basic Decision Problems (and algorithms for solving them)

286 Halting Problem Input –FSA M –Input string x to M Question –Does M halt on x? Give an algorithm for solving the FSA halting problem.

287 Accepting Problem Input –FSA M –Input string x to M Question –Is x in L(M)? Give an algorithm ACCEPT for solving the accepting problem.

288 Empty Language Problem Input –FSA M Question –Is L(M)={}? Give an algorithm for solving the empty language problem. –Don’t look ahead to the next slide.

289 Algorithms for solving empty language problem Algorithm 1 –View FSA M as a directed graph (nodes, arcs) –See if any accepting node is reachable from the start node Algorithm 2 –Let n be the number of states in FSA M –Run ACCEPT(M,x) for all input strings of length < n –If any are accepted THEN no ELSE yes Why is algorithm 2 correct?
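Both algorithms can be sketched as follows. The dictionary layout of an FSA (states, alphabet, start, accepting, and a total transition table delta) and the function names are assumptions of this sketch.

```python
from collections import deque
from itertools import product


def accept(fsa, x):
    q = fsa["start"]
    for ch in x:
        q = fsa["delta"][(q, ch)]
    return q in fsa["accepting"]


def is_empty_by_reachability(fsa):
    # Algorithm 1: breadth-first search from the start node for an accepting node.
    seen, queue = {fsa["start"]}, deque([fsa["start"]])
    while queue:
        q = queue.popleft()
        if q in fsa["accepting"]:
            return False          # an accepting state is reachable: "no"
        for a in fsa["alphabet"]:
            t = fsa["delta"][(q, a)]
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return True                   # no accepting state is reachable: "yes"


def is_empty_by_enumeration(fsa):
    # Algorithm 2: run ACCEPT on every string of length < n.
    n = len(fsa["states"])
    for length in range(n):
        for letters in product(fsa["alphabet"], repeat=length):
            if accept(fsa, "".join(letters)):
                return False
    return True


# Two small example machines over {a, b}; state 2 is the only accepting state.
nonempty = {"states": {0, 1, 2}, "alphabet": "ab", "start": 0, "accepting": {2},
            "delta": {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 2,
                      (2, "a"): 2, (2, "b"): 2}}
empty = {"states": {0, 1, 2}, "alphabet": "ab", "start": 0, "accepting": {2},
         "delta": {(0, "a"): 1, (0, "b"): 0, (1, "a"): 0, (1, "b"): 1,
                   (2, "a"): 2, (2, "b"): 2}}
print(is_empty_by_reachability(nonempty), is_empty_by_reachability(empty))
```

Algorithm 2 needs only strings of length less than n because a shortest path from the start state to any reachable state never repeats a state, so it uses at most n - 1 transitions.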

290 Solving Other Problems (using answer-preserving input transformations)

291 Complement Empty Problem Input –FSA M Question –Is (L(M)) c = {}? Show how to use an answer-preserving input transformation to help solve this problem –Show that the Complement Empty problem transforms to the Empty Language problem –Don’t look at next two slides

292 Algorithm Description Convert input FSA M into an FSA M’ such that L(M’) = (L(M)) c –We do this by applying the algorithm which we used to show that LFSA is closed under complement Feed FSA M’ into algorithm which solves the empty language problem If that algorithm returns yes THEN yes ELSE no

293 Input Transformation Illustrated [Figure: the algorithm for the complement empty problem feeds FSA M into the complement construction to obtain FSA M’, then passes M’ to the algorithm for solving the empty language problem, which outputs Yes/No.] The complement construction algorithm is the answer-preserving input transformation. If M is a yes input instance of CE, then M’ is a yes input instance of EL. If M is a no input instance of CE, then M’ is a no input instance of EL.
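The transformation can be sketched as a two-step pipeline. The dictionary layout of an FSA and the function names are assumptions of this sketch, and the complement step assumes the transition table is total (every state has a transition on every symbol).

```python
from collections import deque


def is_empty(fsa):
    # Solver for the empty language problem: is any accepting state reachable?
    seen, queue = {fsa["start"]}, deque([fsa["start"]])
    while queue:
        q = queue.popleft()
        if q in fsa["accepting"]:
            return False
        for a in fsa["alphabet"]:
            t = fsa["delta"][(q, a)]
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return True


def complement(fsa):
    # Input transformation: swap accepting and non-accepting states.
    return {**fsa, "accepting": set(fsa["states"]) - set(fsa["accepting"])}


def complement_is_empty(fsa):
    # M is a yes instance of CE exactly when M' is a yes instance of EL.
    return is_empty(complement(fsa))


all_strings = {"states": {0}, "alphabet": "ab", "start": 0, "accepting": {0},
               "delta": {(0, "a"): 0, (0, "b"): 0}}
even_length = {"states": {0, 1}, "alphabet": "ab", "start": 0, "accepting": {0},
               "delta": {(0, "a"): 1, (0, "b"): 1, (1, "a"): 0, (1, "b"): 0}}
print(complement_is_empty(all_strings), complement_is_empty(even_length))
```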

294 NFA Empty Problem Input –NFA M Question –Is L(M)={}? Show how to use answer-preserving input transformations to help solve this problem –Show that the NFA Empty problem transforms to the Empty Language problem

295 Input Transformation Yes/No Algorithm for NFA empty problem

296 Equal Problem Input –FSA’s M 1 and M 2 Question –Is L(M 1 ) = L(M 2 )? Show how to use answer-preserving input transformations to solve this problem –Try and transform this problem to the empty language problem –If L(M 1 ) = L(M 2 ), then what combination of L(M 1 ) and L(M 2 ) must be empty?

297 Input Transformation Illustrated Yes/No Algorithm for Equal problem
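One possible answer, sketched under assumptions: L(M 1 ) = L(M 2 ) exactly when the symmetric difference of the two languages is empty, so a product FSA that accepts the symmetric difference can be fed to the empty language solver. The dictionary layout, the helper cyclic_fsa, and the function names are hypothetical, and both machines are assumed to share one alphabet and to have total transition tables.

```python
from collections import deque


def is_empty(fsa):
    seen, queue = {fsa["start"]}, deque([fsa["start"]])
    while queue:
        q = queue.popleft()
        if q in fsa["accepting"]:
            return False
        for a in fsa["alphabet"]:
            t = fsa["delta"][(q, a)]
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return True


def symmetric_difference(m1, m2):
    # Product construction: run both machines in parallel and accept
    # when exactly one of them accepts.
    states = {(p, q) for p in m1["states"] for q in m2["states"]}
    delta = {((p, q), a): (m1["delta"][(p, a)], m2["delta"][(q, a)])
             for (p, q) in states for a in m1["alphabet"]}
    accepting = {(p, q) for (p, q) in states
                 if (p in m1["accepting"]) != (q in m2["accepting"])}
    return {"states": states, "alphabet": m1["alphabet"],
            "start": (m1["start"], m2["start"]),
            "accepting": accepting, "delta": delta}


def languages_equal(m1, m2):
    return is_empty(symmetric_difference(m1, m2))


def cyclic_fsa(n, accepting):
    # Hypothetical helper: states 0..n-1 count the input length mod n.
    return {"states": set(range(n)), "alphabet": "ab", "start": 0,
            "accepting": set(accepting),
            "delta": {(q, a): (q + 1) % n for q in range(n) for a in "ab"}}


even2 = cyclic_fsa(2, {0})       # even length, two states
even4 = cyclic_fsa(4, {0, 2})    # even length, four states
odd2 = cyclic_fsa(2, {1})        # odd length
print(languages_equal(even2, even4), languages_equal(even2, odd2))
```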

298 Summary Decision problems with programs as inputs Basic problems –You need to develop algorithms from scratch based on properties of FSA’s Solving new problems –You need to figure out how to combine the various algorithms we have seen in this unit to solve the given problem

299 Module 26 Pumping Lemma –A technique for proving a language L is NOT regular –What does the Pumping Lemma mean? –Proof of Pumping Lemma

300 Pumping Lemma How do we use it?

301 Pumping Condition A language L satisfies the pumping condition if: –there exists an integer n > 0 such that –for all strings x in L of length at least n –there exist strings u, v, w such that x = uvw and |uv| <= n and |v| >= 1 and For all k >= 0, uv^k w is in L

302 Pumping Lemma All regular languages satisfy the pumping condition All languages over {a,b} Regular languages “Pumping Languages”

303 Implications We can use the pumping lemma to prove a language L is not regular –How? We cannot use the pumping lemma to prove a language is regular –How might we try to use the pumping lemma to prove that a language L is regular and why does it fail? Regular Pumping

304 Pumping Lemma What does it mean?

305 Pumping Condition A language L satisfies the pumping condition if: –there exists an integer n > 0 such that –for all strings x in L of length at least n –there exist strings u, v, w such that x = uvw and |uv| <= n and |v| >= 1 and For all k >= 0, uv^k w is in L

306 v can be pumped Let x = abcdefg be in L Then there exists a substring v in x such that v can be repeated (pumped) in place any number of times and the resulting string is still in L –uv^k w is in L for all k >= 0 For example –v = cde uv^0 w = uw = abfg is in L uv^1 w = uvw = abcdefg is in L uv^2 w = uvvw = abcdecdefg is in L uv^3 w = uvvvw = abcdecdecdefg is in L … 1) x in L 2) x = uvw 3) For all k >= 0, uv^k w is in L

307 What the other parts mean A language L satisfies the pumping condition if: –there exists an integer n > 0 such that defer what n is till later –for all strings x in L of length at least n x must be in L and have sufficient length –there exist strings u, v, w such that x = uvw and |uv| <= n and –v occurs in the first n characters of x |v| >= 1 and –v is not λ For all k >= 0, uv^k w is in L

308 Examples Example 1 –Let L be the set of even length strings over {a,b} –Let x = abaa –Let n = 2 –What are the possibilities for v? (v shown in brackets) [a]baa, a[b]aa, [ab]aa –Which one satisfies the pumping lemma?

309 Examples * Example 2 –Let L be the set of strings over {a,b} where the number of a’s mod 3 is 1 –Let x = abbaaa –Let n = 3 –What are the possibilities for v? (v shown in brackets) [a]bbaaa, a[b]baaa, ab[b]aaa, [ab]baaa, a[bb]aaa, [abb]aaa –Which ones satisfy the pumping lemma?
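The two examples can be checked by enumerating every split x = uvw with |uv| <= n and |v| >= 1. The sketch below is illustrative: the function names and the bound max_k are assumptions, and testing k only up to max_k is evidence rather than a proof of "for all k", although for these two languages the pattern is easy to confirm by hand.

```python
def legal_splits(x, n):
    # Every (u, v, w) with x = uvw, |uv| <= n and |v| >= 1.
    for i in range(n):                               # i = |u|
        for j in range(i + 1, min(n, len(x)) + 1):   # j = |uv|
            yield x[:i], x[i:j], x[j:]


def pumps(u, v, w, in_language, max_k=6):
    # Checks uv^k w for k = 0, 1, ..., max_k only.
    return all(in_language(u + v * k + w) for k in range(max_k + 1))


def even_length(s):
    return len(s) % 2 == 0


def a_count_is_1_mod_3(s):
    return s.count("a") % 3 == 1


example1 = [(u, v) for u, v, w in legal_splits("abaa", 2) if pumps(u, v, w, even_length)]
example2 = [(u, v) for u, v, w in legal_splits("abbaaa", 3)
            if pumps(u, v, w, a_count_is_1_mod_3)]
print(example1)
print(example2)
```

In Example 1 only v = ab keeps the length even; in Example 2 exactly the choices of v that contain no a keep the count of a's unchanged.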

310 Pumping Lemma Proof

311 High Level Outline Let L be an arbitrary regular language Let M be an FSA such that L(M) = L –M exists by definition of LFSA and the fact that regular languages and LFSA are identical Show that L satisfies the pumping condition –Use M in this part Pumping Lemma follows

312 First step: n+1 prefixes of x Let n be the number of states in M Let x be an arbitrary string in L of length at least n –Let x_i denote the ith character of string x There are at least n+1 distinct prefixes of x –length 0: λ –length 1: x_1 –length 2: x_1 x_2 –... –length i: x_1 x_2 … x_i –... –length n: x_1 x_2 … x_i … x_n

313 Example Let n = 8 Let x = abcdefgh There are 9 distinct prefixes of x –length 0: λ –length 1: a –length 2: ab –... –length 8: abcdefgh

314 Second step: Pigeon-hole Principle As M processes string x, it processes each prefix of x –In particular, each prefix of x must end up in some state of M Situation –There are n+1 distinct prefixes of x –There are only n states in M Conclusion –At least two prefixes of x must end up in the same state of M Pigeon-hole principle –Name these two prefixes p_1 and p_2

315 Third step: Forming u, v, w Setting: –Prefix p_1 has length i –Prefix p_2 has length j > i prefix p_1 of length i: x_1 x_2 … x_i prefix p_2 of length j: x_1 x_2 … x_i x_{i+1} … x_j Forming u, v, w –Set u = p_1 = x_1 x_2 … x_i –Set v = x_{i+1} … x_j –Set w = x_{j+1} … x_{|x|} –x = x_1 … x_i (u) x_{i+1} … x_j (v) x_{j+1} … x_{|x|} (w)

316 Example 1 Let M be a 5-state FSA that accepts all strings over {a,b,c,…,z} whose length mod 5 = 3 Consider x = abcdefghijklmnopqr, a string in L What are the two prefixes p_1 and p_2? What are u, v, w? [Figure: FSA M with states 0, 1, 2, 3, 4.]

317 Example 2 Let M be a 3-state FSA that accepts all strings over {0,1} whose binary value mod 3 = 1 Consider x = 10011, a string in L What are the two prefixes p_1 and p_2? What are u, v, w? [Figure: FSA M with states 0, 1, 2 and transitions labeled 0 and 1.]

318 Fourth step: Showing u, v, w satisfy all the conditions |uv| <= n –uv = p_2 –p_2 is one of the first n+1 prefixes of string x, so its length is at most n |v| >= 1 –v consists of the characters in p_2 after p_1 –Since p_2 and p_1 are distinct prefixes of x, v is not λ For all k >= 0, uv^k w in L –u = p_1 and uv = p_2 end up in the same state q of M This is how we defined p_1 and p_2 –Thus for all k >= 0, uv^k ends up in state q –The string w causes M to go from state q to an accepting state

319 Example 1 again Let M be a 5-state FSA that accepts all strings over {a,b,c,…,z} whose length mod 5 = 3 Consider x = abcdefghijklmnopqr, a string in L What are u, v, w? –u = λ –v = abcde –w = fghijklmnopqr |uv| = 5 <= 5 |v| = 5 >= 1 For all t >= 0, (abcde)^t fghijklmnopqr is in L [Figure: FSA M with states 0, 1, 2, 3, 4.]

320 Example 2 again Let M be a 3-state FSA that accepts all strings over {0,1} whose binary interpretation mod 3 = 1 Consider x = 10011, a string in L What are u, v, w? –u = 1 –v = 00 –w = 11 |uv| = 3 <= 3 |v| = 2 >= 1 For all k >= 0, 1(00)^k 11 is in L [Figure: FSA M with states 0, 1, 2 and transitions labeled 0 and 1.]
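The first three proof steps amount to a short procedure: run M on x, remember the length of the first prefix that reached each state, and stop at the first repeated state. The sketch below assumes a transition function given as a Python callable and a state numbering in which state r means "value mod 3 = r" (or "length mod 5 = r"); the function names are hypothetical.

```python
def split_at_repeated_state(delta, start, x):
    first_seen = {start: 0}   # state -> length of the first prefix ending there
    q = start
    for j in range(1, len(x) + 1):
        q = delta(q, x[j - 1])
        if q in first_seen:             # prefixes p_1 (length i) and p_2 (length j)
            i = first_seen[q]
            return x[:i], x[i:j], x[j:]  # u = p_1, v = rest of p_2, w = rest of x
        first_seen[q] = j
    return None   # can only happen when x is shorter than the number of states


def run(delta, start, x):
    q = start
    for ch in x:
        q = delta(q, ch)
    return q


def mod5_length(q, ch):
    return (q + 1) % 5               # state = length read so far, mod 5


def mod3_binary(q, bit):
    return (2 * q + int(bit)) % 3    # state = binary value read so far, mod 3


print(split_at_repeated_state(mod5_length, 0, "abcdefghijklmnopqr"))
print(split_at_repeated_state(mod3_binary, 0, "10011"))
```

Both results agree with the u, v, w given on the two "again" slides, and pumping v leaves the final state unchanged.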

321 Pumping Lemma A language L satisfies the pumping condition if: –there exists an integer n > 0 such that –for all strings x in L of length at least n –there exist strings u, v, w such that x = uvw and |uv| <= n and |v| >= 1 and For all k >= 0, uv^k w is in L Pumping Lemma: All regular languages satisfy the pumping condition

322 Module 27 Applications of Pumping Lemma –General proof template What is the same in every proof What changes in every proof –Incorrect pumping lemma proofs –Some rules of thumb

323 Pumping Lemma Applying it to prove a specific language L is not regular

324 How we use the Pumping Lemma We choose a specific language L –For example, {a^j b^j | j > 0} We show that L does not satisfy the pumping condition We conclude that L is not regular

325 Showing L “does not pump” A language L satisfies the pumping condition if: –there exists an integer n > 0 such that –for all strings x in L of length at least n –there exist strings u, v, w such that x = uvw and |uv| <= n and |v| >= 1 and For all k >= 0, uv^k w is in L A language L does not satisfy the pumping condition if: –for all integers n of sufficient size –there exists a string x in L of length at least n such that –for all strings u, v, w such that x = uvw and |uv| <= n and |v| >= 1 –There exists a k >= 0 such that uv^k w is not in L

326 Example Proof A language L does not satisfy the pumping condition if: –for all integers n of sufficient size –there exists a string x in L of length at least n such that –for all strings u, v, w such that x = uvw and |uv| <= n and |v| >= 1 –There exists a k >= 0 such that uv^k w is not in L Proof that L = {a^i b^i | i > 0} does not satisfy the pumping condition Let n be the integer from the pumping lemma Choose x = a^n b^n Consider all strings u, v, w s.t. x = uvw and |uv| <= n and |v| >= 1 Argue that uv^k w is not in L for some k >= 0 –Argument must apply to all possible u,v,w –Continued on next slide

327 Example Proof Continued Proof that L = {a^i b^i | i > 0} does not satisfy the pumping condition Let n be the integer from the pumping lemma Choose x = a^n b^n Consider all strings u, v, w s.t. x = uvw and |uv| <= n and |v| >= 1 Argue that uv^k w is not in L for some k >= 0 –Argument must apply to all possible u,v,w –Continued on right uv^0 w = uw is not in L –uv contains only a’s why? –uw = a^(n-|v|) b^n Follows from previous line and uvw = x = a^n b^n –uw contains fewer a’s than b’s why? –Therefore, uw is not in L Therefore L does not satisfy the pumping condition

328 Alternate choice of k Proof that L = {a^i b^i | i > 0} does not satisfy the pumping condition Let n be the integer from the pumping lemma Choose x = a^n b^n Consider all strings u, v, w s.t. x = uvw and |uv| <= n and |v| >= 1 Argue that uv^k w is not in L for some k >= 0 –Argument must apply to all possible u,v,w –Continued on right uv^2 w = uvvw is not in L –uv contains only a’s why? –uvvw = a^(n+|v|) b^n follows from previous line and uvw = x = a^n b^n –uvvw contains more a’s than b’s why? –Therefore, uvvw is not in L Therefore L does not satisfy the pumping condition
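For small values of n the claim can be checked exhaustively: every legal split of a^n b^n is refuted by k = 0 and also by k = 2. This is an illustration only, since a finite check for particular n does not replace the argument for all n; the function names are assumptions of this sketch.

```python
def in_language(s):
    # Membership in L = {a^i b^i | i > 0}.
    half = len(s) // 2
    return half > 0 and s == "a" * half + "b" * half


def legal_splits(x, n):
    # Every (u, v, w) with x = uvw, |uv| <= n and |v| >= 1.
    for i in range(n):
        for j in range(i + 1, min(n, len(x)) + 1):
            yield x[:i], x[i:j], x[j:]


def refuted_by(k, n):
    # True when uv^k w is outside L for every legal split of x = a^n b^n.
    x = "a" * n + "b" * n
    return all(not in_language(u + v * k + w) for u, v, w in legal_splits(x, n))


for n in range(1, 6):
    print(n, refuted_by(0, n), refuted_by(2, n))
```

The choice k = 1 never works, because uv^1 w is x itself, which is in L.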

329 Pumping Lemma Some bad applications of the pumping lemma

330 Bad Pumping Lemma Applications We now look at some examples of bad applications of the pumping lemma We work with the language EQUAL consisting of the set of strings over {a,b} such that the number of a’s equals the number of b’s We focus first on bad choices of string x We then consider another flawed technique

331 First bad choice of x A language L does not satisfy the pumping condition if: –Let n be the integer from the pumping lemma –there exists a string x in L of length at least n such that –for all strings u, v, w such that x = uvw and |uv| <= n and |v| >= 1 –There exists a k >= 0 such that uv^k w is not in L Let n be the integer from the pumping lemma Choose x = a^10 b^10 –What is wrong with this choice of x?

332 Second bad choice of x A language L does not satisfy the pumping condition if: –Let n be the integer from the pumping lemma –there exists a string x in L of length at least n such that –for all strings u, v, w such that x = uvw and |uv| <= n and |v| >= 1 –There exists a k >= 0 such that uv^k w is not in L Let n be the integer from the pumping lemma Choose x = a^n b^(2n) –What is wrong with this choice of x?

333 Third bad choice of x A language L does not satisfy the pumping condition if: –Let n be the integer from the pumping lemma –there exists a string x in L of length at least n such that –for all strings u, v, w such that x = uvw and |uv| <= n and |v| >= 1 –There exists a k >= 0 such that uv^k w is not in L Let n be the integer from the pumping lemma Choose x = (ab)^n –What is wrong with this choice of x? The problem is there is a choice of u, v, w satisfying the three conditions such that for all k >= 0, uv^k w is in L What is an example of such a u, v, w?

334 Find the flaw in this proof A language L does not satisfy the pumping condition if: –Let n be the integer from the pumping lemma –there exists a string x in L of length at least n such that –for all strings u, v, w such that x = uvw and |uv| <= n and |v| >= 1 –There exists a k >= 0 such that uv^k w is not in L Let n be the integer from the pumping lemma Choose x = a^n b^n Let u = a^2, v = a, w = a^(n-3) b^n –|uv| = 3 <= n –|v| = 1 Choose k = 2 Argue uv^2 w is not in EQUAL –uv^2 w = uvvw = a^2 a a a^(n-3) b^n = a^(n+1) b^n –There is one more a than b in uv^2 w –Thus uv^2 w is not in L

335 Pumping Lemma Two rules of thumb

336 Two Rules of Thumb * Try to make the first n characters of x identical –For EQUAL, choose x = a^n b^n rather than (ab)^n Simplifies case analysis as v only contains a’s Try k=0 or k=2 –k=0 This reduces number of occurrences of that first character –k=2 This increases number of occurrences of that first character
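The first rule of thumb can be illustrated for EQUAL by comparing the two choices of x at a fixed n. The sketch below uses n = 3 purely as an example value; the function names and the bound max_k are assumptions, and checking k only up to max_k is evidence, not proof, that a split pumps.

```python
def in_equal(s):
    # Membership in EQUAL: same number of a's and b's.
    return s.count("a") == s.count("b")


def legal_splits(x, n):
    # Every (u, v, w) with x = uvw, |uv| <= n and |v| >= 1.
    for i in range(n):
        for j in range(i + 1, min(n, len(x)) + 1):
            yield x[:i], x[i:j], x[j:]


def splits_that_pump(x, n, max_k=6):
    # Legal splits for which uv^k w stays in EQUAL for k = 0, ..., max_k.
    return [(u, v, w) for u, v, w in legal_splits(x, n)
            if all(in_equal(u + v * k + w) for k in range(max_k + 1))]


n = 3
print(splits_that_pump("a" * n + "b" * n, n))   # good choice of x: nothing survives
print(splits_that_pump("ab" * n, n))            # bad choice of x: some splits survive
```

With x = a^n b^n every legal v consists only of a's, so one case covers all splits; with x = (ab)^n a split such as v = ab keeps the counts balanced for every k, so the proof cannot be completed.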

337 Summary We use the Pumping Lemma to prove a language is not regular –Note, does not work for all nonregular languages, though Choosing a good string x is first key step Choosing a good integer k is second key step Must apply argument to all legal u, v, w
