CS 3813: Introduction to Formal Languages and Automata Chapter 12 Limits of Algorithmic Computation These class notes are based on material from our textbook, An Introduction to Formal Languages and Automata, 3rd ed., by Peter Linz, published by Jones and Bartlett Publishers, Inc., Sudbury, MA, 2001. They are intended for classroom use only and are not a substitute for reading the textbook.

Turing’s Thesis: Any problem that can be solved by algorithmic means – by some process that can be described as a sequence of discrete steps – can be solved by a Turing machine.

What computers can’t do We are talking about what computers (or algorithms) can’t do. The Church-Turing thesis says that a TM can do anything that any computer can do. So talking about what computers can’t do is the same as talking about what TMs can’t do. We will see that there are some languages that can’t be accepted by any TM, and some functions that can’t be computed by any TM, or equivalently, by any algorithm.

Review exercise Show that the set of all TMs is countably infinite. Show that the set of languages is uncountably infinite. Show that the set of numeric functions (which take a natural number as input and produce a natural number as output) is uncountably infinite. From this, it follows that there are languages that are not decidable and numeric functions that cannot be computed by any TM. But this is a general argument. It would be more interesting if we gave specific examples. Here’s the first...
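One way to approach the first exercise: every TM can be encoded as a finite string over {0, 1}, so a listing of all binary strings also lists every TM encoding. The sketch below enumerates the strings in length-then-alphabetical order and computes each string's position in that list. The names all_binary_strings and index_of are illustrative and do not come from the textbook.

```python
from itertools import count, islice, product

def all_binary_strings():
    """Yield every finite string over {0, 1}: shortest first, then alphabetical."""
    for length in count(0):
        for symbols in product("01", repeat=length):
            yield "".join(symbols)

def index_of(s):
    """Position of s in the enumeration above (a natural number)."""
    shorter = 2 ** len(s) - 1              # how many strings are shorter than s
    return shorter + (int(s, 2) if s else 0)

first = list(islice(all_binary_strings(), 7))
print(first)
print([index_of(s) for s in first])
```

Because every string gets exactly one natural number as its index, the set of strings (and so the set of TM encodings, a subset of it) is countable.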

Here’s one kind of problem TMs can’t solve: decision problems represented by non-recursive languages. Remember that a decision problem is a question for which there is a yes/no answer. The TM answers “yes” by printing out a “1” on the tape, or “no” by printing out a “0”. For instance, “Given a specific string x ∈ {a, b}*, is x in the language pal?” A TM can solve this decision problem; that is, for every string x, the TM in your textbook can decide whether x belongs to pal or not. For example, given the input abbaba, that TM will respond by printing out a 0 on its tape. The string abbaba is called an instance of the problem.
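As an illustration of what "solving a decision problem" means, here is a minimal Python sketch of a decider for pal. It is not the textbook's TM; the function name decide_pal is an assumption made for this example. The point is that it returns 1 or 0 for every input and always terminates.

```python
def decide_pal(x):
    """Return 1 if x is a palindrome over {a, b}, else 0. Always terminates."""
    if any(ch not in "ab" for ch in x):
        raise ValueError("input must be a string over {a, b}")
    left, right = 0, len(x) - 1
    while left < right:                 # compare symbols from both ends inward
        if x[left] != x[right]:
            return 0                    # a no-instance
        left += 1
        right -= 1
    return 1                            # a yes-instance

print(decide_pal("abbaba"))
print(decide_pal("abba"))
```

The loop runs at most len(x)/2 times, so there is no input on which the procedure fails to answer.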

Sometimes the instances which are input to a TM have to be encoded. In this case, the first thing the TM has to do is to decide if the input really represents an instance of the problem or not. If it does, then it decides if the instance is a yes-instance or a no-instance. We can say that a Turing machine T solves a problem P by recognizing the language of encoded yes-instances of P. This requires that the TM stop while processing either type of string – encoded yes-instances (for which it will write a 1 on its tape and halt) and encoded no-instances (for which it will write a 0 on its tape and halt).

Unfortunately, we already know that only recursive languages are guaranteed to be accepted by TMs that always stop. TMs that accept recursively enumerable languages that are not recursive sometimes have to reject strings by looping infinitely. This is not permissible in a TM designed to solve a decision problem.

To make this more concrete, let’s look, again, at an example of a non-recursive language – the language SA. SA stands for Self-Accepting. It is the language of the encodings of all TMs that accept their own encoded representation as a string in the language they recognize. Technically, SA is defined as: SA = {w ∈ {0, 1}* | w = e(T) for some TM T, and T accepts w} You can think of it this way: SA is the language of encoded yes-instances for the decision problem called Self-Accepting: Given Turing machine T, does it accept its own encoding e(T)?

What we are looking for is an algorithm that gives the right answer to every instance of the problem. What we need is a universal TM that takes as its input the encoded representation of a TM (which it then simulates within itself) and another encoded representation of the TM (which it takes as the input string for the simulated TM). The UTM then runs the string on the simulated TM. If the simulated TM halts in an accepting state, then the UTM prints out a 1 on its own tape and stops. If the simulated TM crashes, then the UTM writes out a 0 on its tape and stops.

Suppose now that L(T) is an NRREL (a non-recursive, recursively enumerable language) and that e(T) is not a string in L(T). Then, maybe T will crash to reject e(T), or maybe T will go into an infinite loop – we don’t know. And what happens to the UTM simulating all of this?

If T goes into an infinite loop, the UTM will wait forever for an answer. It will never print a 0 on its tape. (But it won’t print a 1, either.)

In order to give an answer to a decision problem, the UTM has to stop to accept and stop to reject. But, given some TMs, the UTM might not stop. That means that simulation by a UTM can’t solve this decision problem, and a formal proof (by diagonalization) shows that no other TM can solve it either. The SA problem is unsolvable.

Another example of an unsolvable problem is the Halting Problem: Given a TM T and a string w, is w ∈ L(T)? This is often thought of in this way: Given a TM T and a string w, will T halt while processing w?

Technically: The halting problem is the problem of deciding, for any TM and input, whether the TM halts on that input. Equivalently, it is the problem of writing an algorithm HALT(M, w) that takes as arguments the description of a TM, M, and its input, w, and returns 1 if M halts on w, and 0 otherwise.

This is basically just a variation of the SA problem. Instead of a TM T having to decide if its own encoded representation is a string in L(T), we simply give an arbitrary string w to T, and ask it to decide if the string is in L(T). Again, we construct a UTM, and pass it e(T) and w. The UTM simulates T processing string w. If T halts (accepting w), then the UTM halts and prints a 1. If T crashes (rejecting w), then the UTM stops and prints a 0. But if T loops forever to reject w, then the UTM waits forever and never prints anything on its tape.

This is referred to as the Halting Problem because we can change the problem slightly, like this: Construct a UTM, and pass it e(T) and w. The UTM simulates T processing string w. If T halts (accepting w), then the UTM halts and prints a 1. If T crashes or loops forever (rejecting w), then the UTM stops and prints a 0. But this is impossible: we know that if T loops forever to reject w, then the UTM will wait forever and never print anything on its tape.

Suppose our TM, T, is given a series of strings to process, and the first string is string w. Unfortunately, w ∉ L(T), and T starts off on an infinite loop to reject it. When should we decide that the TM is going to reject the string? After step 100? 1K? 1M? Obviously, we don’t know how long we have to wait. We could be sitting there forever, thinking, “Maybe the TM will recognize this string on the very next move...”

So this TM will never halt while processing w - but the UTM can’t predict that fact in advance. The poor UTM is sitting there, waiting for T to finish processing w, so it can go on to the second string. T is never going to stop, because it is in an infinite loop, but at any given time the UTM can’t distinguish that situation from one in which T is going to accept w on the very next move. All it can do is wait, and wait, and wait. It is never going to print a 1, but it doesn’t know that, so it’s never going to print a 0, either. Obviously, this is another unsolvable problem.

We say that the halting problem is semi-decidable. We can use a universal TM to simulate any TM on any input. This can tell us whether the TM halts on that input, but not whether it doesn’t halt – because how do we know when we’ve waited long enough for it to halt? So the halting problem is semi-decidable (or equivalently, Turing enumerable).
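The "how long do we wait?" difficulty can be seen in a small simulator. The Python sketch below is an assumption-laden illustration, not the textbook's UTM: a machine is a dictionary mapping (state, symbol) to (new state, symbol to write, move), a missing entry counts as a crash, and the names simulate, ha and the example machine are all made up for this example. The simulator gives up after a fixed number of moves, and in that case it can conclude nothing.

```python
def simulate(delta, w, max_steps, start="q0", accept="ha", blank="_"):
    """Run the machine `delta` on input w for at most max_steps moves.

    Returns "accept" if the accepting state is reached, "reject" if the
    machine crashes (no applicable transition), and None if the budget
    runs out, in which case nothing can be concluded."""
    tape = dict(enumerate(w))
    state, head = start, 0
    for _ in range(max_steps):
        if state == accept:
            return "accept"
        key = (state, tape.get(head, blank))
        if key not in delta:
            return "reject"
        state, tape[head], move = delta[key]
        head += 1 if move == "R" else -1
    return "accept" if state == accept else None

# Hypothetical machine: accepts strings that start with a, crashes on the
# empty string, and moves right forever on strings that start with b.
delta = {("q0", "a"): ("ha", "a", "R"), ("q0", "b"): ("loop", "b", "R")}
for symbol in "ab_":
    delta[("loop", symbol)] = ("loop", symbol, "R")

print(simulate(delta, "ab", 100))
print(simulate(delta, "", 100))
print(simulate(delta, "ba", 100))
print(simulate(delta, "ba", 10000))
```

Raising the budget from 100 to 10000 moves does not change the answer for the input ba: the simulator still cannot tell "loops forever" from "has not finished yet".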

The halting problem is semi-decidable, but it is not decidable. The proof that the halting problem is not decidable is by contradiction, using a diagonal argument.

First, we assume it is decidable. This means there must be a TM that takes as input any TM M and string w and answers yes or no, depending on whether M halts when given w as input. Let’s call the function computed by this TM, HALT(M,w), where the parameter M denotes an arbitrary TM and the parameter w denotes an arbitrary input to M. (These are the same parameters as for the universal TM.)

Next, we see that if there is a TM that computes HALT(M,w), it can in particular compute HALT(M,M), where M is given its own description as input. So, we can use it to construct a TM that computes the function DIAG(M), defined as follows:
DIAG(M) = “yes” if HALT(M,M) returns “no”
DIAG(M) = no output (loops forever) if HALT(M,M) returns “yes”

Now let M_D denote the TM that computes DIAG. Now run this TM on itself. What is the output of DIAG(M_D)? It halts (with output “yes”) only if it doesn’t halt, and it doesn’t halt only if it halts. This contradiction disproves the assumption that the halting problem is decidable.
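The construction of DIAG from HALT can be mimicked in Python, with functions standing in for TMs. This is only a sketch: make_diag and claims_nothing_halts are hypothetical names, and the "decider" used here is a deliberately naive candidate, since a correct one cannot exist.

```python
def make_diag(halts):
    """Build DIAG from a claimed halting decider halts(program, argument)."""
    def diag(program):
        if halts(program, program):
            while True:          # the decider says "halts", so loop forever
                pass
        return "yes"             # the decider says "doesn't halt", so halt
    return diag

def claims_nothing_halts(program, argument):
    """A candidate decider that always answers "no" (it is wrong, as shown below)."""
    return False

diag = make_diag(claims_nothing_halts)
result = diag(diag)              # diag halts on its own description...
print(result, claims_nothing_halts(diag, diag))   # ...but the decider said it would not
```

A candidate that answered "yes" for diag on itself would be refuted the other way: diag would then loop forever, so that case is not executed here. Whatever the candidate answers about diag run on itself, diag does the opposite.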

        e(T1)    e(T2)    e(T3)    e(T4)   ...
T1      accept   don’t    accept   don’t
T2      don’t    accept   accept   don’t
T3      accept   don’t    don’t    accept
...
Each row corresponds to a TM. Each column corresponds to a possible input – a TM encoded as an integer e(Ti). Each cell indicates whether the TM accepts this input. Consider the language that differs from each language in this list along the diagonal. This language corresponds to the halting problem. It is not accepted by any TM in the list of all TMs.
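The diagonal construction can be checked mechanically on the visible corner of the table. The Python sketch below copies the three rows shown (True for accept, False for don’t) and flips the diagonal entries; the variable names are illustrative.

```python
# Rows T1..T3, columns e(T1)..e(T4), as in the table above.
table = {
    "T1": [True, False, True, False],
    "T2": [False, True, True, False],
    "T3": [True, False, False, True],
}

rows = list(table.values())
# The diagonal language does the opposite of T_i on input e(T_i).
diagonal_flip = [not rows[i][i] for i in range(len(rows))]
# It therefore disagrees with every row in at least one column.
differs = [diagonal_flip[i] != rows[i][i] for i in range(len(rows))]

print(diagonal_flip)
print(all(differs))
```

The same flip works no matter how many rows the list has, which is why the resulting language cannot appear anywhere in the list.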

Let’s go over that again: The proof that the halting problem is undecidable is by contradiction. We assume there is a TM that can decide, for any TM, M, and input, w, whether M halts on input w. We use it to construct another TM, M’, that does the following: its input x is a TM encoded as a string (or by a number); if the TM encoded by x halts on input x, then M’ loops forever; otherwise M’ halts. What if the TM encoded by x is M’? The contradiction is that M’ halts on input x if and only if M’ doesn’t halt on input x. This disproves the assumption that there is a TM that solves the halting problem.

While the halting problem is not decidable but is semi-decidable, there are problems that are not even semi-decidable. The complement of the halting problem is the problem of determining whether an arbitrary TM enters an infinite loop when given an arbitrary input. This problem is not even semi-decidable.

In order for a language to be semi-decidable, some TM must halt after a finite time for every string in the language. For the complement of the halting problem, that TM would have to halt and display a 1 whenever the simulated TM has entered an infinite loop. In general, you cannot detect that a TM has entered an infinite loop unless you wait forever: only after an infinite wait can you verify that the loop was indeed infinite. So no TM can halt after a finite time for every string in the language.
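A more rigorous way to reach the same conclusion is the standard interleaving argument: if a language and its complement were both semi-decidable, running the two recognizers one step at a time in alternation would decide the language. Since the halting problem is semi-decidable but not decidable, its complement cannot be semi-decidable. The Python sketch below shows only the interleaving step, using a toy pair of languages (perfect squares and non-squares) that is of course decidable anyway; every name in it is an assumption made for illustration.

```python
from itertools import count

def is_square(n):
    """Recognizer: yields True once some k has k*k == n; otherwise searches forever."""
    for k in count():
        yield k * k == n

def is_not_square(n):
    """Co-recognizer: yields True once some k has k*k < n < (k+1)*(k+1)."""
    for k in count():
        yield k * k < n < (k + 1) * (k + 1)

def decide(x, recognizer, co_recognizer):
    """Alternate single steps of both recognizers until one of them accepts."""
    yes_run, no_run = recognizer(x), co_recognizer(x)
    while True:
        if next(yes_run):
            return True
        if next(no_run):
            return False

print(decide(16, is_square, is_not_square))
print(decide(17, is_square, is_not_square))
```

Each recognizer alone may run forever on inputs outside its language, but exactly one of the two must eventually accept, so decide always returns.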

Other undecidable problems Once we have shown that the halting problem is undecidable, we can show that a large class of other problems about the input/output behavior of programs are undecidable. In fact, we can show that any nontrivial property of the input/output behavior of programs is undecidable.

Examples of undecidable problems About Turing machines:
– Is the language accepted by a TM empty, finite, regular, or context-free?
– Does a TM meet its “specification,” that is, is it free of “bugs”?
– and many others

Unsolvable problems involving CFGs You might think that context-free grammars are too simple to have unsolvable problems associated with them, but there are some. The Membership problem for CFLs is solvable, but: The question “Are two context-free grammars equivalent?” is unsolvable. The problem IsAmbiguous is unsolvable. The problem CFGNonemptyIntersection is unsolvable. The problem CFGGeneratesAll is unsolvable.

Not so surprising Although this result is sweeping in scope, maybe it is not too surprising. If a simple question such as whether a program halts or not is undecidable, why should we expect that any other property of the input/output behavior of programs is decidable? Rice’s theorem (page 311) makes it clear that failure to decide halting implies failure to decide any other interesting question about the input/output behavior of programs.

Problem reduction Before we consider Rice’s theorem, we need to understand the concept of problem reduction on which its proof is based. Reducing problem B to problem A means finding a way to convert problem B to problem A, so that a solution to problem A can be used to solve problem B. Why is this important? A reduction of problem B to problem A shows that problem A is at least as difficult to solve as problem B.

Using problem reduction to prove undecidability To show that a problem A is undecidable, we show that another problem B that we already know is undecidable can be reduced to A. Having proved that the halting problem is undecidable, we use problem reduction to show that other problems are undecidable.

Example: Totality problem: Decide whether an arbitrary TM halts on all inputs. (If it does, it computes a “total function.”) This is equivalent to the problem of whether a program can ever enter an infinite loop on some input. It differs from the halting problem, which asks whether it enters an infinite loop for a particular input.

Proof that the totality problem is undecidable We prove that the halting problem is reducible to the totality problem. That is, if an algorithm can solve the totality problem, it can be used to solve the halting problem. Since no algorithm can solve the halting problem, the totality problem must also be unsolvable.

Proof that the totality problem is undecidable The reduction is as follows. For any TM, M, and input, w, we create another TM, M’, that takes an arbitrary input, ignores it, and runs M on w. Note that M’ halts on all inputs if and only if M halts on input w. Therefore, an algorithm that tells us whether M’ halts on all inputs also tells us whether M halts on input w, which would be a solution to the halting problem.
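The reduction can be written out with Python functions standing in for TMs. In this sketch, make_m_prime builds M’ from M and w, and halts_on shows how a totality decider would answer the halting question. Everything here is illustrative: the Diverges exception is a stand-in for "runs forever" so that the example terminates, and toy_is_total is NOT a real totality decider (none exists); it only checks a few sample inputs.

```python
class Diverges(Exception):
    """Stand-in for 'runs forever' so that this toy example can terminate."""

def make_m_prime(M, w):
    """Return M', which ignores its own input and runs M on the fixed input w."""
    def m_prime(_ignored):
        return M(w)
    return m_prime

def halts_on(M, w, is_total):
    """Answer 'does M halt on w?' by asking whether M' halts on all inputs."""
    return is_total(make_m_prime(M, w))

def toy_M(w):
    """Hypothetical machine: 'loops' on inputs that start with b."""
    if w.startswith("b"):
        raise Diverges
    return "done"

def toy_is_total(program, samples=("", "a", "b", "ab")):
    """Placeholder oracle: tries a few samples only; not a genuine decider."""
    try:
        for s in samples:
            program(s)
    except Diverges:
        return False
    return True

print(halts_on(toy_M, "ab", toy_is_total))
print(halts_on(toy_M, "ba", toy_is_total))
```

Note that M’ behaves the same on every input, which is exactly why "M’ halts on all inputs" and "M halts on w" are the same question.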

Practical implications The fact that the totality problem is undecidable means that we cannot write a program that can find any infinite loop in any program.

Example: Equivalence problem: Decide whether two TMs accept the same language. This is equivalent to the problem of whether two programs compute the same output for every input.

Proof that the equivalence problem is undecidable We prove that the totality problem is reducible to the equivalence problem. That is, if an algorithm can solve the equivalence problem, it can be used to solve the totality problem. Since no algorithm can solve the totality problem, the equivalence problem must also be unsolvable.

Proof that the equivalence problem is undecidable The reduction is as follows: For any TM, M, we can construct a TM, M’, that takes any input, w, runs M on that input, and outputs “yes” if M halts on w. We can also construct a TM M’’ that takes any input and simply outputs “yes.” M’ and M’’ are equivalent exactly when M halts on all inputs. So if an algorithm can tell us whether M’ and M’’ are equivalent, it can also tell us whether M halts on all inputs, which would be a solution to the totality problem.
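The same style of sketch works for this reduction, again with Python functions standing in for TMs. The names make_pair and is_total_via are assumptions for this example, Diverges stands in for "runs forever", and toy_equivalent is only a placeholder that compares behavior on a few samples, since no genuine equivalence decider exists.

```python
class Diverges(Exception):
    """Stand-in for 'runs forever' so that this toy example can terminate."""

def make_pair(M):
    """Build M' (runs M, then says yes) and M'' (always says yes)."""
    def m_prime(w):
        M(w)                      # never gets past this line if M loops on w
        return "yes"
    def m_double_prime(w):
        return "yes"
    return m_prime, m_double_prime

def is_total_via(M, are_equivalent):
    """Answer 'does M halt on all inputs?' using an equivalence decider."""
    return are_equivalent(*make_pair(M))

def behavior(program, s):
    try:
        return program(s)
    except Diverges:
        return None               # None marks 'no output'

def toy_equivalent(p, q, samples=("", "a", "b", "ab")):
    """Placeholder oracle: compares outputs on a few samples only."""
    return all(behavior(p, s) == behavior(q, s) for s in samples)

def total_M(w):
    return len(w)

def partial_M(w):
    if w == "b":
        raise Diverges
    return len(w)

print(is_total_via(total_M, toy_equivalent))
print(is_total_via(partial_M, toy_equivalent))
```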

Practical implications The fact that the equivalence problem is undecidable means that the code optimization phase of a compiler may improve a program, but can never guarantee finding the optimally efficient version of the program. There may be potentially improved versions of the program that it cannot even be sure are equivalent.

A practical application of this unsolvable problem situation: Suppose you have a C++ program, and you want to check to see whether it has any infinite loops in it before delivering it to a customer. If we had a TM that could solve the halting problem, we could pass it the program, and it would halt in a finite number of steps to tell us, “Yes, there is an infinite loop.” Unfortunately, no TM can detect infinite loops in arbitrary C++ code. It is an unsolvable problem.

There are lots of unsolvable problems: The Accepts problem is unsolvable. The Accepts problem is defined as: Given a TM T and a string w, is w ∈ L(T)? For some NRRELs, T is going to reject some string w by looping forever, thus being unable to decide (in a finite number of steps) whether w is a string in L(T) or not. Even the Accepts( ) problem is unsolvable. Some TMs may require an infinite loop to reject.

Others? AcceptsEverything is unsolvable. AcceptsNothing is unsolvable. AcceptsSomething is unsolvable. WritesSomething is unsolvable. Subset is unsolvable. Each of these can be shown to be unsolvable by reducing H, the Halting Problem, to it. For example, we say that H ≤ Accepts( ).

Mathematical Example: Busy Beaver Function Let b stand for the Busy Beaver function, which has the natural numbers as its domain and range. We define b as follows: b(0) is 0; for n > 0, b(n) is obtained by considering TMs having n nonhalting states and tape alphabet {0, 1}. These TMs can be assumed to have states {q_0, q_1, …, q_n}, and so there are only finitely many such machines. Some will halt on input 1^n, and we let b(n) be the maximum number of 1’s that remain on the tape when any of these machines halts. The number of 1’s is a measure of how busy a TM of this type can be before it halts. This function is not computable.

Busy Beaver Function Each TM considered for b(n) receives an input of n 1’s and, if it halts, leaves some number of 1’s on its tape. A TM could generate an infinite number of 1’s if we didn’t require that it halt at some point. But since only halting machines count, and there are only finitely many machines of this size, there must be some limit on the number of 1’s that they can produce. There will be many TMs with n+1 states that can produce strings of 1’s as output. The TM that produces the most 1’s determines the value of the Busy Beaver function. For example, a TM with 4 nonhalting states that receives an input of four 1’s might leave 5000 1’s on its tape when it halts. If no other TM with 4 nonhalting states leaves as many 1’s before halting, then b(4) = 5000 - the 4th Busy Beaver number.

Busy Beaver Function The more 1’s there are in the input string, the more the TM will have to “work with”, so to speak. So b(5) may be much larger than b(4), etc. Unfortunately, a TM that is a candidate for the Busy Beaver record may not halt. So, for example, if there is a TM with 4 states which seems about to break the record for b(4), we can’t tell whether it is in an infinite loop or just taking a very long time to write the next 1 on the tape. It can be shown via proof by contradiction that this function is not computable.
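The difficulty can be made concrete with a brute-force search. The Python sketch below enumerates every machine with n nonhalting states over {0, 1}, runs each on input 1^n for a limited number of moves, and records the most 1’s left by any machine that halted in time. The machine model is an assumption made for this example (a two-way infinite tape, the head starting on the leftmost 1, state n acting as the halting state), so its conventions may differ from the textbook’s. Because machines that exceed the step budget are simply ignored, the result is only a lower bound on b(n), never a guaranteed value.

```python
from itertools import product

def ones_left(machine, n, max_steps):
    """Run one machine on input 1^n. Return the number of 1's on the tape
    if it halts within max_steps moves, otherwise None (unknown)."""
    tape = {i: 1 for i in range(n)}
    state, head = 0, 0
    for _ in range(max_steps):
        write, move, next_state = machine[(state, tape.get(head, 0))]
        tape[head] = write
        head += move
        state = next_state
        if state == n:                     # state n is the halting state
            return sum(tape.values())
    return None

def busy_beaver_lower_bound(n, max_steps=50):
    """Best score among n-state machines (n >= 1) that halt within the budget."""
    keys = [(q, s) for q in range(n) for s in (0, 1)]
    actions = list(product((0, 1), (-1, 1), range(n + 1)))
    best = 0
    for choice in product(actions, repeat=len(keys)):
        score = ones_left(dict(zip(keys, choice)), n, max_steps)
        if score is not None:
            best = max(best, score)
    return best

print(busy_beaver_lower_bound(1))
print(busy_beaver_lower_bound(2, max_steps=20))
```

Increasing max_steps can only raise the bound, but no finite budget tells us that the remaining machines will never halt, which is the obstacle described above.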

Properties of programs We now describe a more general way of showing that a problem is undecidable, a result called Rice’s theorem. First we introduce some definitions. A property of a program (TM) can be viewed as the set of programs that have that property. A functional property of a program (TM) is a non-trivial property of its input/output behavior, that is, one that some programs have and some don’t.

Rice’s Theorem If P is a property of languages that is satisfied by some but not all recursively enumerable languages, then the decision problem: Given a TM T, does L(T) have the property P? is unsolvable. (See your textbook for a formal proof.)

Rice’s theorem “Any functional property of programs is undecidable.” A functional property is:
– a property of the input/output behavior of the program; that is, it describes the mathematical function the program computes
– nontrivial, in the sense that it is a property of some programs but not all programs

Examples of functional properties
– The language accepted by a TM contains at least two strings.
– The language accepted by a TM is empty (contains no strings).
– The language accepted by a TM contains two different strings of the same length.

Rice’s theorem continued The proof generalizes the reasoning involved in reducing the halting problem to other problems. Rice’s theorem can be used to show that the questions of whether the language accepted by a Turing machine is context-free, regular, or even finite are all undecidable.

Not all properties of programs are functional Some properties of programs are decidable because they are not about the function the program computes, but instead, are about some details of the program itself. Examples:
– the program contains the transition ((q,0),(p,1))
– starting on the empty tape, the program P reaches state q in at most five steps
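Both examples can be decided by procedures that always terminate: the first by inspecting the transition table, the second by simulating a bounded number of moves. The Python sketch below assumes a machine is given as a dictionary from (state, symbol) to (new state, symbol to write, move); this representation and the names has_transition and reaches_state_within are assumptions for illustration, not the textbook’s notation.

```python
def has_transition(delta, key, value):
    """Decidable by inspection: is key -> value an entry of the table?"""
    return delta.get(key) == value

def reaches_state_within(delta, target, max_steps=5, start="q0", blank="_"):
    """Decidable by bounded simulation: starting on the empty tape, is
    `target` entered after at most max_steps moves?"""
    tape, state, head = {}, start, 0
    for _ in range(max_steps + 1):         # check after 0, 1, ..., max_steps moves
        if state == target:
            return True
        key = (state, tape.get(head, blank))
        if key not in delta:
            return False                   # the machine crashed before reaching target
        state, tape[head], move = delta[key]
        head += 1 if move == "R" else -1
    return False

# Hypothetical machine that writes two 1's and then moves right forever.
delta = {
    ("q0", "_"): ("q1", "1", "R"),
    ("q1", "_"): ("q2", "1", "R"),
    ("q2", "_"): ("q2", "_", "R"),
}

print(has_transition(delta, ("q0", "_"), ("q1", "1", "R")))
print(reaches_state_within(delta, "q2"))
print(reaches_state_within(delta, "q9"))
```

The example machine never halts, yet both questions about it are answered in a fixed, finite number of steps, because neither question depends on the machine’s full input/output behavior.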

Conclusion There are definite limits on algorithmic computation.
