NP-complete and NP-hard problems. Decision problems vs. optimization problems The problems we are trying to solve are basically of two kinds. In decision.

NP-complete and NP-hard problems

Decision problems vs. optimization problems The problems we are trying to solve are basically of two kinds. In decision problems we are trying to decide whether a statement is true or false. In optimization problems we are trying to find the solution with the best possible score according to some scoring scheme. Optimization problems can be either maximization problems, where we are trying to maximize a certain score, or minimization problems, where we are trying to minimize a cost function.

Example 1: Hamiltonian cycles Given a directed graph, we want to decide whether or not there is a Hamiltonian cycle in this graph. This is a decision problem.

Example 2: TSP - The Traveling Salesman Problem Given a complete graph and an assignment of weights to the edges, find a Hamiltonian cycle of minimum weight. This is the optimization version of the problem. In the decision version, we are given a weighted complete graph and a real number c, and we want to know whether or not there exists a Hamiltonian cycle whose combined weight of edges does not exceed c.

Example 3: Pairwise sequence alignment, optimization version Given two sequences and a scoring scheme find the highest-scoring (global or local) alignment of these sequences. This is an optimization problem. Similarly for multiple sequence alignment.

Example 3: Pairwise sequence alignment, decision version Given two sequences, a scoring scheme, and a cutoff score c, decide whether there exists an alignment with a score that is higher than c. Similarly for multiple sequence alignment.

Important Observation: Each optimization problem has a corresponding decision problem.

Inputs Each of the problems discussed above has its characteristic input. For example, for the optimization version of the TSP, the input consists of a weighted complete graph; for the decision version of the TSP, the input consists of a weighted complete graph and a real number. The input data of a problem are often called the instance of the problem. Each instance has a characteristic size; which is the amount of computer memory needed to describe the instance. If the instance is a graph of n vertices, then the size of this instance would typically be about n(n-1)/2.

The class P A decision problem D is solvable in polynomial time or in the class P, if there exists an algorithm A such that zA takes instances of D as inputs. zA always outputs the correct answer “Yes” or “No”. zThere exists a polynomial p such that the execution of A on inputs of size n always terminates in p(n) or fewer steps.

The class P EXAMPLE: The local and global pairwise sequence alignment problems are in the class P for all scoring schemes used in practice. The class P is often considered as synonymous with the class of computationally feasible problems, although in practice this is somewhat unrealistic.

Witnesses for decision problems Note that if the answer to a decision problem is “yes”, then there is usually some “witness” to this. For example, in the Hamiltonian cycle problem, any permutation v 1, v 2, …,v n of the vertices of the input graph is a potential witness. This potential witness is a true witness if v 1 is adjacent to v 2, … and v n is adjacent to v 1.

Witnesses for decision problems In the TSP, a potential witness would be a Hamiltonian cycle. This potential witness is a true witness if its cost is c or less.

Example4: factorization Given an integer N and an integer M with 1 ≤ M ≤ N, does N have a factor d with 1 < d < M? In this problem, a witness for a “yes” answer is a factor as above, and “no” answer can easily be computed from a complete factorization, which can also be considered as a kind of witness.

The class NP A decision problem is nondeterministically polynomial-time solvable or in the class NP if there exists an algorithm A such that zA takes as inputs potential witnesses for “yes” answers to problem D. zA correctly distinguishes true witnesses from false witnesses. zThere exists a polynomial p such that for each potential witnesses of each instance of size n of D, the execution of the algorithm A takes at most p(n) steps.

The class NP Note that if a problem is in the class NP, then we are able to verify “yes”-answers in polynomial time, provided that we are lucky enough to guess true witnesses. Homework 1: Show that both the factorization problem of a previous slide and its negation are in the class NP.

The P=NP Problem Are the classes P and NP identical? This is an open problem; it may well be the biggest open problem of mathematics at the beginning of the 21st century. It is not hard to show that every problem in P is also in NP, but it is unclear whether every problem in NP is also in P.

The P=NP Problem The best we can say is that thousands of computer scientists have been unsuccessful for decades to design polynomial-time algorithms for some problems in the class NP. This constitutes overwhelming empirical evidence that the classes P and NP are indeed distinct, but no formal mathematical proof of this fact is known.

Polynomial-time reducibility Let E and D be two decision problems. We say that D is polynomial-time reducible to E if there exists an algorithm A such that zA takes instances of D as inputs and always outputs the correct answer “Yes” or “No” for each instance of D. zA uses as a subroutine a hypothetical algorithm B for solving E. zThere exists a polynomial p such that for every instance of D of size n the algorithm A terminates in at most p(n) steps if each call of the subroutine B is counted as only m steps, where m is the size of the actual input of B.

Polynomial-time reducibility The notion of polynomial-time reducibility can be extended to the case where only E is a decision problem. Homework 2: Give the corresponding definition and show that finding the factorization of a given positive integer is polynomial-time reducible to the decision problem about factorization from a previous slide. What does this imply for prime factorization if the latter decision problem is in fact in the class P?

An example of polynomial-time reducibility Theorem: The Hamiltonian cycle problem is polynomial- time reducible to the decision version of TSP. Proof: Given an instance G with vertices v 1, …, v n of the Hamiltonian cycle problem, let H be the weighted complete graph on v 1, …, v n such that the weight of an edge {v i, v j } in H is 1 if {v i, v j } is an edge in G, and is 2 otherwise. Now the correct answer for the instance G of the Hamiltonian cycle problem can be obtained by running an algorithm on the instance (H,n+1) of the TSP.

NP-complete problems A decision problem E is NP-complete if every problem in the class NP is polynomial-time reducible to E. The Hamiltonian cycle problem, the decision versions of the TSP and the graph coloring problem, as well as literally hundreds of other problems are known to be NP-complete. It is not hard to show that if a problem D is polynomial- time reducible to a problem E and E is in the class P, then D is also in the class P. It follows that if there exists a polynomial-time algorithm for the solution of any of the NP-complete problems, then there exist polynomial-time algorithms for all of them, and P= NP.

NP-hard problems Optimization problems whose decision versions are NP- complete are called NP-hard. Theorem: If there exists a polynomial-time algorithm for finding the optimum in any NP-hard problem, then P = NP.

NP-hard problems Proof: Let E be an NP-hard optimization (let us say minimization) problem, and let A be a polynomial-time algorithm for solving it. Now an instance J of the corresponding decision problem D is of the form (I,c), where I is an instance of E, and c is a number. Then the answer to D for instance J can be obtained by running A on I and checking whether the cost of the optimal solution exceeds c. Thus there exists a polynomial-time algorithm for D, and NP-completeness of the latter implies P= NP.

Multiple Sequence Alignment Problem Theorem (Wang and Jiang, 1994): There exists a scoring scheme for which the Multiple Sequence Alignment problem is NP- hard. Unfortunately, the scoring scheme used in this result is biologically unrealistic.

Multiple Sequence Alignment Problem Theorem (Just, 2001): The Multiple Sequence Alignment problem is NP- hard for every biologically realistic scoring matrix with SP-score and affine gap penalties. This has been subsequently extended by another author to some scoring schemes different from SP-score.

Consequences for bioinformatics In view of the overwhelming empirical evidence against the equality P= NP it seems that no NP-hard optimization problem is solvable by an algorithm that is guaranteed to: zrun in polynomial time and zalways produce a solution with optimal score/cost. Unfortunately, many, perhaps most, of the important optimization problems in bioinformatics are NP-hard. To make matters worse, the instances of interest in bioinformatics are typically of large size. What can we do about these problems?

Towards alternative performance measures So far we have been talking about algorithms that zrun in polynomial time on all instances zalways find the solution with the best score/cost. As we have seen, such algorithms may be too much to ask for. We will now briefly discuss how one can meaningfully relax the above requirements.

Worst case vs. average performance So far, we have been insisting that there exists a polynomial p such the running time of an algorithm is bounded by p(n) for all instances of size n. However, for many practical purposes, it may be sufficient to have an algorithm whose average running time for instances of size n is bounded by a polynomial. Such an algorithm may still be unacceptably slow for some particularly bad instances, but such bad instances will necessarily be very rare and may be of little practical relevance.

Approximation algorithms While optimal solutions to optimization problems are clearly best, “reasonably” good solutions are also of value. Let us say that an algorithm for a minimization problem D has a performance guarantee of 1 + ε if for each instance I of the problem it finds a solution whose cost is at most (1 + ε) times the cost of the optimal solution for instance I. While D may be NP-hard, it may still be possible to find, for some ε > 0, polynomial-time algorithms for D with performance guarantee 1 + ε. Such algorithms are called approximation algorithms. For maximization problems, the notion of an approximation algorithm is defined similarly.

Polynomial-time approximation schemes We say that a minimization problem D has a polynomial- time approximation scheme (PTAS) if for every ε > 0 there exists a polynomial-time algorithm for D with performance guarantee 1 + ε. While D may be NP-hard, it may still be have a PTAS.

PTAS for Multiple Sequence Alignment Theorem (Just and Della Vedova 2004) Some versions of the Multiple Sequence Alignment Problem have a PTAS.

Heuristic algorithms Quite often, bioinformaticians rely on heuristic algorithms for solving NP-hard optimization problems. These are algorithms that appear to run reasonably fast on the average instance, appear to find, most of the time, solutions within (1 + ε) of optimum for reasonably small ε. However, it is not always easy or possible to mathematically analyze the performance of a heuristic algorithm.

NP-complete and NP-hard problems. Decision problems vs. optimization problems The problems we are trying to solve are basically of two kinds. In decision.

Similar presentations

Presentation on theme: "NP-complete and NP-hard problems. Decision problems vs. optimization problems The problems we are trying to solve are basically of two kinds. In decision."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

NP-complete and NP-hard problems. Decision problems vs. optimization problems The problems we are trying to solve are basically of two kinds. In decision.

Similar presentations

Presentation on theme: "NP-complete and NP-hard problems. Decision problems vs. optimization problems The problems we are trying to solve are basically of two kinds. In decision."— Presentation transcript:

Similar presentations

About project

Feedback