1 242-535 ADA: 17. P and NP
Algorithm Design and Analysis (ADA) 242-535, Semester 1 2014-2015
17. P and NP, and Beyond
Objective
o look at the complexity classes P, NP, NP-Complete, NP-Hard, and Undecidable problems

2 Overview
1. A Decision Problem
2. Polynomial Time Solvable (P)
3. Nondeterministic Polynomial (NP)
4. Nondeterminism
5. P = NP : the BIG Question
6. NP-Complete
7. The Circuit-SAT Problem
8. NPC and Reducibility
9. NP-Hard
10. Approximation Algorithms
11. Decidable Problems
12. Undecidable Problems
13. Computational Difficulty
14. Non-Technical P and NP

3 1. A Decision Problem
The problems in the P, NP, and NP-Complete complexity classes (sets) are all defined as decision problems.
A decision problem is one where the solution is written as a yes or no answer.
o Example: Is 29 a prime number?

4 Converting to a Decision Problem
0/1 Knapsack is an optimization problem whose goal is to maximize the total value of the items placed in a knapsack without exceeding its capacity; the result is numerical.
The decision problem version of 0/1 Knapsack asks whether a total value of at least some target is possible without exceeding the knapsack's capacity; the result is yes or no.
Usually, the translation from an optimization problem into a decision problem is easy since the required result is simpler.

5 2. Polynomial Time Solvable (P)
The P set contains problems that are solvable in polynomial time
o the algorithm has running time O(n^k) for some constant k, where n is the size of the input
o Polynomial times: O(n^2), O(n^3), O(1), O(n log n)
o Not polynomial times: O(2^n), O(n^n), O(n!)
e.g. Testing if a number is prime can be done in time polynomial in the number of its digits (the AKS algorithm). Simple trial division takes O(n) steps for the value n itself, which is exponential in the number of digits.
P problems are called tractable because of their polynomial running time.

6 P Examples

7 3. Nondeterministic Polynomial (NP)
NP is the class of problems whose solutions
o can be checked by a polynomial running time algorithm
o but might not be solvable in polynomial time
Every problem in P is also in NP. Most people believe that the hardest NP problems are not polynomial time solvable (written as P ≠ NP)
o this means "cross out the might"
The best known algorithms for solving the hardest NP problems require exponential time, or greater
o they are called intractable

8 P and NP as Sets
P: easily solvable problems
o e.g. multiplication, sorting, prime testing
NP: easily checkable problems
o e.g. prime factorization
P = NP or P ≠ NP

9 Computational Difficulty Line
[Figure: P and NP placed along a line of increasing computational difficulty (or hardness); P =? NP]

10 3.1. Factorization is in NP
Prime factorization is the decomposition of a composite number into prime divisors.
24 = 2 x 2 x 2 x 3
91 = 7 x 13
This problem is in NP, and is not known to be in P
o no polynomial time algorithm is known for finding (solving) a prime factorization of a number
o but given a number and a list of prime numbers, we can check if those numbers are a factorization by multiplication, which is a polynomial time operation
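A minimal Python sketch of the solve/check asymmetry described on this slide. The function names are illustrative and not from the slides, and the primality check uses simple trial division only to keep the example short.

```python
from math import prod

def is_prime(p):
    """Trial division; adequate for the small factors used in this sketch."""
    if p < 2:
        return False
    d = 2
    while d * d <= p:
        if p % d == 0:
            return False
        d += 1
    return True

def check_factorization(n, factors):
    """Checking: every factor is prime and their product equals n."""
    return all(is_prime(f) for f in factors) and prod(factors) == n

def factorize(n):
    """Solving by trial division: the loop count grows exponentially
    with the number of digits of n."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors
```

For the slide's examples, `check_factorization(24, [2, 2, 2, 3])` and `check_factorization(91, [7, 13])` both succeed after a single multiplication pass.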

11 A Bigger Example
[Figure: "? x ? =" a large number, followed by the answer]

12 Factoring in the Real World
Over 2 years, finishing in 2009, several researchers factored a 232-digit number using hundreds of machines.

13 3.2. HC: Another NP Example
Is there a Hamiltonian cycle in a graph G?
A Hamiltonian cycle of an undirected graph is a simple cycle that contains every vertex exactly once.
[Figure: an example graph; find a HC]

14 Finding a HC
How would an algorithm find a Hamiltonian cycle in a graph G?
o One solution is to list all the permutations of G's vertices and check each permutation to see if it is a Hamiltonian cycle.
What is the running time of this algorithm?
o there are |V|! possible permutations, which grows even faster than an exponential
So this brute-force algorithm is not polynomial time, and no polynomial time algorithm for the Hamiltonian cycle problem is known.

15 Checking a HC
But what about checking a possible solution?
o You are given a sequence of vertices that is meant to form a Hamiltonian cycle
o Check whether the sequence is a permutation of the graph's vertices and whether each of the consecutive edges along the cycle (including the edge from the last vertex back to the first) exists in the graph
Checking needs polynomial running time.
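The check and the brute-force search from the last two slides can be sketched in Python as follows. This is an illustration under assumptions: the graph is represented as a dict mapping each vertex to the set of its neighbours, and the function names are not from the slides.

```python
from itertools import permutations

def is_hamiltonian_cycle(graph, order):
    """Polynomial-time check of a proposed cycle.
    graph: dict vertex -> set of neighbours (undirected).
    order: proposed sequence of vertices."""
    n = len(graph)
    if n < 3 or len(order) != n or set(order) != set(graph):
        return False  # not a permutation of the vertices
    # every consecutive pair, wrapping around to the start, must be an edge
    return all(order[(i + 1) % n] in graph[order[i]] for i in range(n))

def find_hamiltonian_cycle(graph):
    """Brute force: tries up to |V|! permutations."""
    for order in permutations(graph):
        if is_hamiltonian_cycle(graph, order):
            return list(order)
    return None
```

The checker does a linear amount of work per proposed sequence, while the search may call it |V|! times.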

16 Deciding whether a graph G has a Hamiltonian cycle is not known to be polynomial time solvable, but is polynomial time checkable, so is in NP.
We'll see later that it is also in the NP-Complete (NPC) set.

17 4. Nondeterminism
A deterministic algorithm behaves predictably.
o When it runs on a particular input, it always produces the same output, and passes through the same sequence of states.
A nondeterministic algorithm has two stages:
1. Guessing (in nondeterministic polynomial time, by generating guesses that are always correct)
   The (magic) process that always makes the right guess is called an Oracle.
2. Verification/checking (in deterministic polynomial time)

18 Example (Searching)
Is x in the array A: (3, 11, 2, 5, 8, 16, …, 200)?
Deterministic algorithm
    for i = 1 to n
        if (A[i] == x) then
            print i; return true
    return false
Nondeterministic algorithm
    j = choice(1:n)      // choice is always correct
    if (A[j] == x) then
        print j; return true
    return false         // since j must be correct if A[] has x

19 Oracle as Path Finder
Another way of thinking of an Oracle is in terms of navigating through a search graph of choices
o at each choice point the Oracle will always choose the correct branch, which will lead eventually to the correct solution

20 P and NP and Determinism
P is the set of decision problems that can be solved by a deterministic algorithm in polynomial time.
NP is the set of decision problems that can be solved by a nondeterministic algorithm in polynomial time.
Deterministic algorithms are a 'simple' case of nondeterministic algorithms, and so P ⊆ NP
o simple because no oracle is needed

21 No one knows how to run a nondeterministic polynomial algorithm efficiently on a real machine, since it depends on a magical oracle always making correct guesses.
So in the real world, the best known algorithms for solving the hardest NP problems have exponential running time, not polynomial.
But if someone discovered a polynomial time replacement for the oracle, then every NP problem would become polynomial, and P = NP
o this is the biggest unsolved question in computing

22 5. P = NP : the BIG Question
On balance it is probably false (i.e. P ≠ NP)
Why?
o you can't engineer perfect luck, and
o generating a solution (along with evidence) is harder than checking a possible solution

23 One Possible Oracle
One possible way to write an oracle would be to use a parallel computer to create an infinite number of processes/threads
o have one thread work on each possible guess at the same time
o the "infinite" part is the problem
Perhaps solvable by using quantum computing or DNA computing (?)
o (but probably not)

24 Clay Millennium Prize Problems
Each one is worth US $1,000,000.
o P = NP Problem
o Riemann Hypothesis
o Birch and Swinnerton-Dyer Conjecture
o Navier-Stokes Problem
o Poincaré Conjecture (solved by Grigori Perelman; declined the award in 2010)
o Hodge Conjecture
o Yang-Mills Theory

25 Short Descriptions
P = NP Problem
o no need, I hope :)
Riemann Hypothesis
o involves the zeta function and the distribution of the primes
Birch and Swinnerton-Dyer Conjecture
o concerns elliptic (~cubic) curves, their rational coordinates, and the zeta function; wide uses (e.g. in data comms)
Navier-Stokes Problem
o solve the equation for calculating fluid flow

26 Poincaré Conjecture
o distinguish between 3D topological shapes (manifolds) in 4D space; involves shrinking loops around a shape to a point
Hodge Conjecture
o reshape topological shapes (spaces) made with algebra (polynomial equations) and calculus (integration) techniques so only algebra is needed to define them
o a solution would strongly link topology, analysis and algebraic geometry

27 Yang-Mills Theory (and the Mass Gap Hypothesis)
o the Yang-Mills equations allow the accurate calculation of many fundamental physical constants in electromagnetism and the strong and weak nuclear forces; very useful, but there's no solid mathematical theory for it
o also need to prove the Mass Gap hypothesis, which explains why the strong nuclear force doesn't reach far outside the nucleus

28 Two Clay Institute videos
o Riemann hypothesis, Birch and Swinnerton-Dyer Conjecture, P vs NP
o Poincaré conjecture, Hodge conjecture, Quantum Yang-Mills problem, Navier-Stokes problem
A popular maths book about the problems:
o The Millennium Problems, Keith J. Devlin, Basic Books, 2003
o (17 min video):

29 6. NP-Complete
A problem is in the NP-Complete (NPC) set if it is in NP and is at least as hard as every other problem in NP (same hardness or harder).
Some NPC problems:
o Hamiltonian cycle
o Path-Finding (Traveling salesman)
o Cliques
o Map (graph, vertex) coloring
o Subset sum
o Knapsack problem
o Scheduling
o Many, many more...

30 Computational Difficulty
[Figure: the computational difficulty line with P, NP, and NP-complete marked (Hamiltonian Cycle, TSP, Knapsack, Tetris, ...); P =? NP]

31 Drawing P, NP, NPC as Sets
[Figure: two set diagrams, one for P ≠ NP and one for P = NP (= NPC), where all problems are together in one P set]

32 6.1. Travelling Salesman (TSP)
The salesperson must make a minimum cost circuit, visiting each city exactly once.
[Figure: a graph of cities with weighted edges]

33 A solution (distance = 23):
[Figure: the tour highlighted on the weighted graph]

34 TSPs (and variants) have enormous practical importance.
Lots of research into good approximation algorithms.
Recently made famous as a DNA computing problem.

35 6.2. 5-Clique Set of vertices such that they all connect to each other.

36 K-Cliques
A K-clique is a set of K nodes connected by all the possible K(K-1)/2 edges between them.
[Figure: an example graph] This graph contains a 3-clique and a 4-clique.

37 K-Cliques
Given: (G, k)
Decision problem: Does G contain a k-clique?
Brute Force
o Try out all (n choose k) possible locations for the k-clique
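The brute-force approach on this slide can be sketched in Python. The graph representation (a dict mapping each vertex to its set of neighbours) and the function name are assumptions made for the illustration.

```python
from itertools import combinations

def has_k_clique(graph, k):
    """Try all (n choose k) vertex subsets; a subset is a k-clique
    if every pair of its vertices is joined by an edge."""
    for subset in combinations(graph, k):
        if all(v in graph[u] for u, v in combinations(subset, 2)):
            return True
    return False
```

Each subset needs k(k-1)/2 edge lookups, so the total work is dominated by the (n choose k) subsets.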

38 6.3. Map Coloring Can a graph be colored with k colors such that no adjacent vertices are the same color?

39 6.4. Subset-Sum
Given a set of integers, does there exist a subset that adds up to some target T?
Example: given the set {−7, −3, −2, 5, 8}, is there a non-empty subset that adds up to 0?
Yes: {−3, −2, 5}
Can also be thought of as a special case of the knapsack problem.
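A brute-force solver for the example above, as a minimal Python sketch (the function name is illustrative). It tries every non-empty subset, so it examines up to 2^n - 1 candidates, while checking any one candidate is a single sum.

```python
from itertools import combinations

def subset_sum(numbers, target):
    """Return a non-empty subset (as a tuple) that sums to target, or None."""
    for size in range(1, len(numbers) + 1):
        for subset in combinations(numbers, size):
            if sum(subset) == target:
                return subset
    return None

print(subset_sum([-7, -3, -2, 5, 8], 0))  # (-3, -2, 5)
```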

40 6.5. The Knapsack Problem
The 0-1 knapsack problem:
o A thief must choose among n items, where the ith item is worth v_i dollars and weighs w_i kg
o Carry at most W kg, but make the value maximum
Example: 5 items: x1 – x5, values = {4, 2, 2, 1, 10}, weights = {12, 2, 1, 1, 4}, W = 15, maximize total value

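41 Pseudocode
from itertools import combinations

def allPossibleSubsets(items):
    # yields all 2^n subsets of the n items
    for size in range(len(items) + 1):
        yield from combinations(items, size)

def knapsack(items, maxweight):
    # items: objects with .value and .weight attributes
    best = ()
    bestvalue = 0
    for s in allPossibleSubsets(items):    # 2^n subsets
        value = 0
        weight = 0
        for item in s:                     # O(n) for each one
            value += item.value
            weight += item.weight
        if weight <= maxweight:
            if value > bestvalue:
                best = s
                bestvalue = value
    return best

Running time is O(n · 2^n) for n items.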

42 The bounded knapsack problem removes the restriction that there is only one of each item, but restricts the number of copies of each kind of item.
The fractional knapsack problem:
o the thief can take fractions of items
o e.g. the thief is presented with buckets of gold dust, sugar, spices, flour, etc.
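The fractional variant is easy: taking items in decreasing order of value per unit weight is optimal. A minimal Python sketch, assuming items are given as (value, weight) pairs with positive weights; the function name is illustrative and the data is the example from slide 40.

```python
def fractional_knapsack(items, capacity):
    """items: list of (value, weight) pairs, weight > 0.
    Returns the maximum total value when fractions may be taken."""
    total = 0.0
    # best value-per-weight first
    for value, weight in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
        if capacity <= 0:
            break
        take = min(weight, capacity)      # whole item, or the part that fits
        total += value * take / weight
        capacity -= take
    return total

items = [(4, 12), (2, 2), (2, 1), (1, 1), (10, 4)]
best = fractional_knapsack(items, 15)
```

On this data the fractional answer is 15 + 4·(7/12), slightly above the 0-1 optimum of 15, because the remaining capacity is filled with part of x1.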

43 6.6. Class Scheduling Problem
We have N teachers with certain time restrictions, and M classes to be scheduled. Can we:
o Schedule all the classes?
o Make sure that no two teachers teach the same class at the same time?
o Make sure that no teacher is scheduled to teach two classes at once?

44 6.7. Why study NPC?
Many interesting problems are NPC.
If you can show that a problem is NPC then:
o You can spend your time developing an approximation algorithm rather than searching for a fast algorithm that solves the problem exactly (see later)
o or you can look for a special case of the problem, which is polynomial

45 Impact of NPC
Main intellectual export of CS to other disciplines.
6,000 citations per year (title, abstract, keywords).
o more than "compiler", "operating system", "database"
Broad applicability and classification power.
"Captures vast domains of computational, scientific, mathematical endeavors, and seems to roughly delimit what mathematicians and scientists had been aspiring to compute feasibly." [Papadimitriou 1995]

46 More NPC Problems
There are over 3000 known NPC problems. Many of the major ones are listed at:
The list is divided into several categories:
o Graph theory, Network design, Sets and partitions, Storage and retrieval, Sequencing and scheduling, Mathematical programming, Algebra and number theory, Games and puzzles, and Logic

47 Another way of classifying NPC problems:
o Packing problems: SET-PACKING, INDEPENDENT SET
o Covering problems: SET-COVER, VERTEX-COVER
o Constraint satisfaction problems: SAT, 3-SAT
o Sequencing problems: HAMILTONIAN-CYCLE, TSP
o Partitioning problems: 3D-MATCHING, 3-COLOR
o Numerical problems: SUBSET-SUM, KNAPSACK

48 Hard vs Easy
There's often not much difference in appearance between easy problems and hard ones:
Hard Problems (NP-Complete) | Easy Problems (in P)
SAT, 3SAT | 2SAT
Traveling Salesman Problem | Minimum Spanning Tree
3D Matching | Bipartite Matching
Knapsack | Fractional Knapsack

49 7. The Circuit-SAT Problem
This was (almost) the first problem to be proved NPC
o Cook 1971, Levin 1973
o actually Cook proved SAT was in NPC, but Circuit-SAT is almost the same, and is the example used in CLRS
What did Cook do?
Remember: a problem is in NPC if it is in NP and is at least as hard as every other problem in NP.

50 What is Circuit-SAT?
The Circuit-Satisfiability Problem (Circuit-SAT) asks whether a given combinational circuit, composed of AND, OR, and NOT gates, is satisfiable.
A circuit is satisfiable if there exists a set of boolean input values that makes the output of the circuit equal to 1.

51 A Circuit
[Figure: the logic gates NOT, AND, and OR, and an example circuit annotated with 0/1 wire values]

52 Satisfiability Examples
(a) Satisfiable: some input assignment makes the output of circuit (a) equal to 1.
(b) Unsatisfiable: no output of 1 is possible.

53 Part 1. Circuit-SAT is in NP
If there are n inputs, then there are 2^n possible assignments that a brute-force search may need to try in order to find the solution
o brute-force solving is O(2^n), exponential running time
Testing a solution is done by evaluating the circuit's logical expression, which can be done in time linear in the size of the circuit
o testing is polynomial running time
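A minimal Python sketch of the two costs on this slide. The circuit is modelled as an ordinary Python function of boolean inputs, which is an assumption of the illustration, as are the function name and the two tiny example circuits.

```python
from itertools import product

def brute_force_circuit_sat(circuit, n_inputs):
    """Try all 2^n input assignments; return the first satisfying one, or None.
    Each call to circuit(...) is the cheap 'testing' step."""
    for bits in product([False, True], repeat=n_inputs):
        if circuit(*bits):
            return bits
    return None

# Two hypothetical circuits, not the ones drawn on the slides:
satisfiable = lambda a, b, c: (a or b) and not c
unsatisfiable = lambda a: a and not a
```

Here `brute_force_circuit_sat(satisfiable, 3)` returns an assignment, while the second circuit exhausts all assignments and yields `None`.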

54 Part 2. Circuit-SAT is NPC
Cook's Theorem
o it proves (Circuit-) SAT is NPC from first principles, based on its implementation of the Turing machine
All other problems have been shown to be NPC by reducing (transforming) Circuit-SAT into those problems (either directly or indirectly).

55 What's a Turing Machine?
A Turing machine is a theoretical device composed of an infinitely long tape with printed symbols representing instructions. The tape can move backwards and forwards in the machine, which can read the instructions and write results back onto the tape.
[Figure labels: machine with finite number of states; symbols on tape; moving tape; sensor to read, write, or erase]

56 Invented in 1936 by Alan Turing.
Despite its simplicity, a Turing machine can simulate the logic of any computing machine or algorithm.
o i.e. a function is algorithmically computable if and only if it is computable by a Turing machine
o called the Church–Turing thesis (hypothesis)

57 Algorithms & Turing Machines
Any algorithm can be implemented as a Turing machine
o this is how "computable" is formally defined
Every problem in NP can be transformed into a program running on a (non-deterministic) Turing machine. (Church-Turing)
o "non-deterministic" means that the finite state machine used by the Turing machine is non-deterministic

58 Back to Cook's Theorem
Cook's Theorem argues that every feature of the Turing machine, including how instructions are executed, can be written as logical formulas in a satisfiability (SAT) problem. (cook 1)
So any program executing on a Turing machine can be transformed into a logical formula of satisfiability. (cook 2)

59 Combining Church-Turing and cook 2 means that every problem in NP can be transformed into a logical formula of satisfiability. (cook 3)
This means that the satisfiability problem is at least as hard as any problem in NP. (cook 4)
By definition (see slide 49), a problem is in NPC if it is in NP and is at least as hard as every other problem in NP
o so (Circuit-) SAT is in NPC

60 Cook's Theorem as Pictures
[Figure: an algorithm A checks the solution data y against problem x in polynomial time and outputs YES/NO. A polynomial time translation turns it into a Turing Machine TM that executes A on x and y in polynomial time, with A, x and y on the TM's tape.]
TM ≈ simulation of a simple computer

61 [Figure: the Turing Machine TM for executing A on x and solution data y in polynomial time is turned, by a polynomial time translation, into a Circuit-SAT circuit C with binary inputs (...101011000...) representing A, x and y. An output of 1 means A has checked y against x.]
TM ≈ simulation of a simple computer
C ≈ a simple real computer

62 Informal Explanation
The Turing Machine runs the program A (run-time software) against the input x (program) and the solution data y (program's input).
This Turing machine can be implemented as a Circuit-SAT 'computer' made up of AND, OR, and NOT gates.
Since Circuit-SAT can implement any problem in NP (i.e. the A, x, and y), it is at least as hard (complex) as any other problem in NP, and so is NP-Complete.

63 The Turing Machine in Action
[Figure: the machine stepping through its configurations]
M is the Turing Machine, or its equivalent logical formula in Circuit-SAT.

64 The checking software (A) executes through a series of states (configurations) until it outputs the satisfiability of y (solution data) against the problem x
o this is reported as a 1 (YES) or 0 (NO) written to the working storage area
Everything is polynomial, O(n^k), where n is the size of the input:
o the tape length
o A's running time
o the number of configurations

65 8. NPC and Reducibility
A formal definition of NPC requires the reducibility of one problem into another.
If we can reduce (transform) a problem S into a problem L in polynomial time, then S is at most a polynomial factor harder than L
o this is written as: S ≤p L
Harder means "bigger running time".

66 Is a Problem 'L' in NPC?
First show that the problem L is in NP
o show that a proposed solution can be checked in polynomial time
o (it can then be solved in exponential time by trying every possible solution)
Show how to reduce (transform) an existing NPC problem S into L (in polynomial time)
o S is NPC
o S ≤p L
o Then L is NPC
[Figure: a known NPC problem S is reduced to a problem L to be proved NPC]

67 This reduction means that the existing NPC problem (S) is no harder than the new problem L
o harder means "bigger running time"
This hardness 'link' between NPC problems means that all NPC problems have the same hardness (up to a polynomial factor), and that no NP problem is harder.
If a researcher proves that any NPC problem can be solved in polynomial time, then all problems in NPC and NP must also have polynomial running time algorithms.
o in other words P = NP

68 Proving Problems are NPC
use reductions

69 8.1. The Satisfiability Problem (SAT)
Given a Boolean expression on n variables, can we assign values such that the expression is TRUE?
Example:
o ((x1 → x2) ∨ ¬((¬x1 ↔ x3) ∨ x4)) ∧ ¬x2
A satisfying truth assignment:
o for example x1 = 0, x2 = 0, x3 = 1, x4 = 1

70 Reduction Graphically
[Figure: a circuit with wires x1 to x10]
φ = x10 ∧ (x4 ↔ ¬x3) ∧ (x5 ↔ (x1 ∨ x2)) ∧ (x6 ↔ ¬x4) ∧ (x7 ↔ (x1 ∧ x2 ∧ x4)) ∧ (x8 ↔ (x5 ∨ x6)) ∧ (x9 ↔ (x6 ∨ x7)) ∧ (x10 ↔ (x7 ∧ x8 ∧ x9))
with additional 'gate' variables

71 Is SAT a NPC Problem?
SAT is in NP.
Prove that Circuit-SAT ≤p SAT
o remember the left-hand side must be an existing NPC problem
For each wire xi in circuit C, the formula φ has a variable xi.
The operation of a gate is expressed as a formula with associated variables, e.g., x10 ↔ (x7 ∧ x8 ∧ x9).

72 φ = AND of the circuit-output variable with the conjunction (∧) of clauses describing the operation of each gate, e.g.,
o φ = x10 ∧ (x4 ↔ ¬x3) ∧ (x5 ↔ (x1 ∨ x2)) ∧ (x6 ↔ ¬x4) ∧ (x7 ↔ (x1 ∧ x2 ∧ x4)) ∧ (x8 ↔ (x5 ∨ x6)) ∧ (x9 ↔ (x6 ∨ x7)) ∧ (x10 ↔ (x7 ∧ x8 ∧ x9))
Circuit C is satisfiable ⇔ formula φ is satisfiable.
Given a circuit C, it takes polynomial time to construct φ.
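The formula φ above can be written down mechanically from a gate list. The following Python sketch encodes the gates read off the formula (the dict layout and function names are assumptions of the illustration) and enumerates all assignments to the 10 wire variables to find those that satisfy φ.

```python
from itertools import product

# wire -> (operator, input wires), read off the formula for phi
GATES = {
    4: ("not", [3]),
    5: ("or", [1, 2]),
    6: ("not", [4]),
    7: ("and", [1, 2, 4]),
    8: ("or", [5, 6]),
    9: ("or", [6, 7]),
    10: ("and", [7, 8, 9]),
}
OUTPUT = 10

def gate_value(op, inputs, x):
    """Value a gate produces from the wire values in x."""
    if op == "not":
        return not x[inputs[0]]
    if op == "and":
        return all(x[i] for i in inputs)
    return any(x[i] for i in inputs)

def phi(x):
    """Output wire AND (each gate wire <-> its gate applied to its inputs)."""
    return x[OUTPUT] and all(
        x[w] == gate_value(op, ins, x) for w, (op, ins) in GATES.items()
    )

def satisfying_assignments(n_wires=10):
    wires = range(1, n_wires + 1)
    candidates = (dict(zip(wires, bits))
                  for bits in product([False, True], repeat=n_wires))
    return [x for x in candidates if phi(x)]
```

Because every gate variable is forced by the inputs, φ is satisfied exactly when the inputs satisfy the circuit; here that happens only for x1 = 1, x2 = 1, x3 = 0.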

73 8.2. The 3-SAT Problem
3SAT: the satisfiability of boolean formulas in 3-conjunctive normal form (3-CNF).
3SAT is in NP.
Show that 3SAT is NPC by proving SAT ≤p 3SAT.
Four stages to the transformation.
1. Construct a binary "parse" tree for input formula φ and introduce a variable yi for the output of each internal node.

74 Conjunctive Normal Form (CNF)
A Boolean formula is in conjunctive normal form if it is an AND of clauses, each of which is an OR of literals
o e.g. (x1 ∨ ¬x2) ∧ (¬x1 ∨ x3 ∨ x4) ∧ (¬x5)
3-CNF: each clause has exactly 3 distinct literals
o e.g. (x1 ∨ ¬x2 ∨ ¬x3) ∧ (¬x1 ∨ x3 ∨ x4) ∧ (¬x5 ∨ x3 ∨ x4)
o Note: the formula is true if at least one literal in each clause is true

75 3-SAT Parse Tree
φ = ((x1 → x2) ∨ ¬((¬x1 ↔ x3) ∨ x4)) ∧ ¬x2
[Figure: the parse tree of φ] The y variables are new.

76 Stages 2 and 3
2. Rewrite φ as the AND of the root variable and a conjunction of clauses describing the operation of each node (each clause has the form parent node ↔ children):
φ' = y1 ∧ (y1 ↔ (y2 ∧ ¬x2)) ∧ (y2 ↔ (y3 ∨ y4)) ∧ (y3 ↔ (x1 → x2)) ∧ (y4 ↔ ¬y5) ∧ (y5 ↔ (y6 ∨ x4)) ∧ (y6 ↔ (¬x1 ↔ x3))

77 3. Convert each ith sub-clause in φ' into conjunctive normal form. Several sub-steps are required for each φ'i:
o create a truth table
o use the rows of the truth table where φ'i is 0 to create a disjunctive normal form (DNF) formula for ¬φ'i
o use DeMorgan's laws to change the DNF to conjunctive normal form (CNF), calling the result φ''i

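78 Sub-Steps Example
φ'1 = (y1 ↔ (y2 ∧ ¬x2))
The truth table for this clause (the rows with result 0 are used by the DNF):
y1 | y2 | x2 | y1 ↔ (y2 ∧ ¬x2)
1 | 1 | 1 | 0
1 | 1 | 0 | 1
1 | 0 | 1 | 0
1 | 0 | 0 | 0
0 | 1 | 1 | 1
0 | 1 | 0 | 0
0 | 0 | 1 | 1
0 | 0 | 0 | 1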

79 Create the disjunctive normal form (DNF) ¬φ'1 using the 0 result rows of the truth table:
¬φ'1 = (y1 ∧ y2 ∧ x2) ∨ (y1 ∧ ¬y2 ∧ x2) ∨ (y1 ∧ ¬y2 ∧ ¬x2) ∨ (¬y1 ∧ y2 ∧ ¬x2)
Now use DeMorgan's laws to make ¬φ'1 positive (now called φ''1), and change the DNF right-hand side to CNF:
φ''1 = ¬(¬φ'1) = (¬y1 ∨ ¬y2 ∨ ¬x2) ∧ (¬y1 ∨ y2 ∨ ¬x2) ∧ (¬y1 ∨ y2 ∨ x2) ∧ (y1 ∨ ¬y2 ∨ x2)
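The truth-table route from a clause to its CNF can be sketched in a few lines of Python. Literals are written as strings with "~" for negation; that encoding and the function name are assumptions of this illustration.

```python
from itertools import product

def cnf_from_truth_table(variables, f):
    """For every row where f is 0, the DNF of (not f) has the AND-term that
    matches that row; negating the term with DeMorgan's laws flips each
    literal and turns the AND into an OR, giving one CNF clause per 0-row."""
    clauses = []
    for row in product([1, 0], repeat=len(variables)):
        if not f(*row):
            clauses.append(["~" + v if bit else v
                            for v, bit in zip(variables, row)])
    return clauses

# phi'_1 = (y1 <-> (y2 AND NOT x2))
phi1 = lambda y1, y2, x2: bool(y1) == bool(y2 and not x2)
clauses = cnf_from_truth_table(["y1", "y2", "x2"], phi1)
```

For φ'1 this produces the four clauses of φ''1 shown above, in the same order.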

80 DNF and DeMorgan's
DNF: A Boolean formula is in disjunctive normal form if it is an OR of clauses, each of which is an AND of literals
o e.g. (x1 ∧ ¬x2) ∨ (¬x1 ∧ x3 ∧ x4) ∨ (¬x5)
DeMorgan's Laws:
o ¬(a ∧ b) = ¬a ∨ ¬b
o ¬(a ∨ b) = ¬a ∧ ¬b

81 Stage 4
4. Pad out each clause Ci so it has 3 distinct literals. The result is φ'''.
o Ci has 3 distinct literals: no padding required
o Ci has 2 distinct literals; add 1 dummy: Ci = (l1 ∨ l2) = (l1 ∨ l2 ∨ p) ∧ (l1 ∨ l2 ∨ ¬p)
o Ci has 1 literal; add 2 dummies: Ci = l = (l ∨ p ∨ q) ∧ (l ∨ ¬p ∨ q) ∧ (l ∨ p ∨ ¬q) ∧ (l ∨ ¬p ∨ ¬q)
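The padding rule can be sketched in Python as follows. Literals are strings with "~" for negation, and `fresh` is an iterator supplying unused dummy variable names; both are assumptions of the illustration.

```python
def pad_clause(clause, fresh):
    """Return a list of 3-literal clauses equivalent to the given clause.
    clause: list of 1 to 3 distinct literals.
    fresh: iterator yielding dummy variable names not used elsewhere."""
    if len(clause) == 3:
        return [list(clause)]
    if len(clause) == 2:
        p = next(fresh)
        return [clause + [p], clause + ["~" + p]]
    if len(clause) == 1:
        p, q = next(fresh), next(fresh)
        # all four sign combinations of the two dummies
        return [clause + [a, b] for a in (p, "~" + p) for b in (q, "~" + q)]
    raise ValueError("clause must have 1 to 3 literals")
```

Whatever values the dummies take, at least one padded clause has all its dummy literals false, so the original literal(s) must still carry the clause.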

82 Claim: The 3-CNF formula φ''' is satisfiable ⇔ φ is satisfiable.
All transformations can be done in polynomial time
o O(m*n) (m = no. of clauses, n = no. of literals)

83 From SAT to 3-SAT reduction The Y literals are the added dummies to pad out the clauses.

84 Why Bother with 3-SAT?
The 3-SAT problem is relatively easy to reduce to other problems.
Thus by proving 3-SAT is NPC we can prove that many other problems are NPC
o see the reduction tree on slide 68

85 8.3. The Clique Problem
A clique in G = (V, E) is a complete subgraph of G.
For a positive integer k ≤ |V|, is there a clique V' ⊆ V of size ≥ k?
Clique is in NP.
Show that Clique is NPC by using 3SAT ≤p Clique

86 Example Transformation
A 3-CNF formula φ will be transformed into the graph shown on the next slide.
Each 3-element sub-clause becomes a triple of nodes.
Nodes are linked by edges:
o if they are not in the same triple, and
o if they are not opposites (inconsistent)

87 Graph, G
[Figure: the graph G built from φ]
G can be computed from φ in polynomial time.

88 Are φ and G the Same?
1. φ is satisfiable ⇒ G has a clique of size k (k = the number of clauses)
A 3SAT formula φ is satisfiable if there is an assignment that makes at least 1 literal in each clause true.
For example: x2 = 0, x3 = 1 and x1 can be 0 or 1
This corresponds to picking a ¬x2 or x3 node in each graph triple, which makes a 3-clique.

89 2. G has a clique of size k ⇒ φ is satisfiable
Selecting a k-clique means that one node is selected in each graph triple.
This is the same as making one literal in each clause of φ equal to true.
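The graph construction can be sketched in Python. Literals are strings with "~" for negation, nodes are (clause index, literal) pairs, and the three-clause formula below is an illustrative choice made here, picked to be consistent with the assignment x2 = 0, x3 = 1 mentioned above. All names are assumptions of the sketch.

```python
from itertools import combinations

def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def clique_graph(clauses):
    """One node per literal occurrence; an edge joins two nodes when they
    come from different clauses and are not opposite literals."""
    nodes = [(i, lit) for i, clause in enumerate(clauses) for lit in clause]
    edges = {frozenset((a, b)) for a, b in combinations(nodes, 2)
             if a[0] != b[0] and a[1] != negate(b[1])}
    return nodes, edges

def has_clique(nodes, edges, k):
    """Brute-force k-clique test (exponential; for small examples only)."""
    return any(all(frozenset(pair) in edges for pair in combinations(group, 2))
               for group in combinations(nodes, k))

# (x1 v ~x2 v ~x3) ^ (~x1 v x2 v x3) ^ (x1 v x2 v x3)
formula = [["x1", "~x2", "~x3"], ["~x1", "x2", "x3"], ["x1", "x2", "x3"]]
nodes, edges = clique_graph(formula)
```

The formula has 3 clauses, so it is satisfiable exactly when the graph has a 3-clique; the nodes (0, "~x2"), (1, "x3"), (2, "x3") form one.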

90 9. NP-Hard
A NP-Hard problem is at least as hard as the hardest problems in NP (i.e. NPC problems).
A NP-Hard problem need not have a polynomial time checking algorithm (it does not have to be in NP)
o this makes most NP-hard problems harder to solve than NP problems

91 For example, the optimization version of Traveling Salesman, where we need to find an actual tour, is at least as hard as the decision version of Traveling Salesman, where we just need to determine whether a tour with length <= k exists or not.
The same is true for the optimization version of the Knapsack problem vs its decision problem version.

92 Computational Difficulty
[Figure: the computational difficulty line with P, NP, NP-complete (Tetris, etc.), and NP-hard marked; chess and the halting problem are NP-hard; P =? NP]

93 Drawing P, NP, etc. as Sets
Even if P = NP, NP-Hard can be harder.

94 NP-Hard: a Bad Name
NP-hard problems are not all in NP, despite having NP in the name.
There are problems that are NP-hard but not NP-complete
o e.g. chess, the halting problem

95 Coping with NPC/NP-Hard Problems
Approximation algorithms:
o guaranteed to be within a fixed percentage of the optimum
Pseudo-polynomial time algorithms:
o e.g., dynamic programming for the 0-1 Knapsack problem
Probabilistic algorithms:
o assume some probabilistic distribution of the data
Randomized algorithms:
o use randomness to get a faster running time, and allow the algorithm to fail with some small probability
o e.g. Monte Carlo method, Genetic algorithms
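As an illustration of the pseudo-polynomial approach mentioned above, here is a minimal dynamic programming sketch for 0-1 Knapsack in Python. It assumes positive integer weights; the function name is illustrative. The running time is O(n·W), which is polynomial in the numeric value W but exponential in the number of bits needed to write W.

```python
def knapsack_dp(values, weights, capacity):
    """best[c] = maximum value achievable with total weight at most c."""
    best = [0] * (capacity + 1)
    for v, w in zip(values, weights):
        # go downwards so each item is used at most once
        for c in range(capacity, w - 1, -1):
            best[c] = max(best[c], best[c - w] + v)
    return best[capacity]

# the example from slide 40
print(knapsack_dp([4, 2, 2, 1, 10], [12, 2, 1, 1, 4], 15))  # 15
```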

96 Restriction: work on special cases of the original problem which run better
Exponential algorithms / Branch and Bound / Exhaustive search:
o feasible only when the problem size is small
Local search:
o hill climbing, simulated annealing
Heuristics
o use "rules of thumb" that have no formal guarantee of performance

97 Exhaustive Search vs. Branch and Bound

98 Simulated Annealing
Use probabilities to jump around in a large search space looking for a good enough solution, called a local optimum.
By contrast, the actual best solution is the global optimum.

99 10. Approximation Algorithms
An approximation algorithm is one that returns a near-optimal solution.
The quality of the approximation can be measured by comparing the 'cost' of the optimal solution against the approximate one using ratios
o called the approximation ratio, ρ(n)
o n is the input size
Cost can be any aspect of the algorithm
o e.g. number of nodes, edge distance, running time, space

100 Calculating ρ(n)
C = cost of the solution produced by the approximation algorithm
C* = cost of an optimal solution
In some applications, we wish to maximize the 'cost', in others we want to minimize it.
o In maximization problems, we use C*/C to get a measure of how much bigger C* is than C
o In minimization problems, we use C/C* to get a measure of how much smaller C* is than C

101 Approximation (Cost) Ratio ρ(n):
max(C/C*, C*/C) ≤ ρ(n)
ρ(n) is always ≥ 1
Bigger values for ρ(n) mean worse approximations.
An approximation algorithm that achieves a ratio of ρ(n) is called a ρ(n)-approximation algorithm.

102 10.1. Approx. Algorithm for TSP
Approx-TSP-Tour(G)
1. select a vertex r ∈ V[G] to be a "root" vertex
2. grow a minimum spanning tree (MST) for G from root r using Prim()
3. create a list L of vertices visited in a preorder tree walk over the MST
4. return the Hamiltonian cycle using L for its order
Time complexity: dominated by Prim(), which is O(E log V) with a binary heap, or O(V^2) with a simple array on a complete graph.
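A runnable sketch of Approx-TSP-Tour in Python, under stated assumptions: cities are points in the plane with Euclidean distances (so the triangle inequality holds), vertex 0 is the root, and Prim's algorithm is the simple O(V^2) array version. The function names are illustrative.

```python
from math import dist

def approx_tsp_tour(points):
    """points: list of (x, y) tuples. Returns vertex indices in tour order."""
    n = len(points)
    in_tree = [False] * n
    parent = [None] * n
    cost = [float("inf")] * n     # cheapest known edge into the tree
    cost[0] = 0.0                 # vertex 0 is the root
    children = {i: [] for i in range(n)}
    # Prim's algorithm on the complete graph
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=cost.__getitem__)
        in_tree[u] = True
        if parent[u] is not None:
            children[parent[u]].append(u)
        for v in range(n):
            d = dist(points[u], points[v])
            if not in_tree[v] and d < cost[v]:
                cost[v], parent[v] = d, u
    # preorder walk over the MST, listing each vertex when first visited
    tour, stack = [], [0]
    while stack:
        u = stack.pop()
        tour.append(u)
        stack.extend(reversed(children[u]))
    return tour

def tour_cost(points, tour):
    """Length of the closed tour, including the edge back to the start."""
    return sum(dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))
```

For example, `tour_cost(pts, approx_tsp_tour(pts))` gives the length of the returned tour for a list of points `pts`.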

103 Approx-TSP Example
[Figure: (a) an undirected graph drawn on an integer unit grid; (b) the MST created with Prim()]

104 [Figure: (c) preorder walk over the MST; (d) tour using the preorder MST]
Preorder walk: a, b, c, b, h, b, a, d, e, f, e, g, e, d, a
A preorder walk that lists a vertex only when it is first encountered, as indicated by the dot next to each vertex: a, b, c, h, d, e, f, g
Cost of tour: ~19.074 (by measuring squares)

105 What are We Comparing?
We want to compare an optimal tour with a tour using a preorder MST. For this example:
o Tour using the preorder MST: cost of tour ~19.074
o Optimal tour: cost of tour ~14.715

106 Calculating the Approximation Ratio
Let H* denote an optimal tour, T the minimum spanning tree, W a full walk over the tree, and H our preorder MST approximation. The aim is to minimize the cost (distance), and the distances are assumed to satisfy the triangle inequality.
Deleting any edge from the optimal tour leaves a spanning tree, so the cost of the minimum spanning tree is a lower bound for the cost of the optimal tour:
c(T) ≤ c(H*)   (1)
An example of a full walk W is shown in (c) on the previous slide; it traverses every tree edge twice. This means:
c(W) = 2*c(T)   (2)

107 Combining equations (1) and (2):
c(W) ≤ 2*c(H*)   (3)
The preorder MST tour, H, is obtained from the full walk by skipping repeated vertices, so by the triangle inequality it is no longer than the full walk (see (d) on slide 104):
c(H) ≤ c(W)   (4)
Combining equations (3) and (4):
c(H) ≤ 2*c(H*)

108 This means that the cost of H (the preorder MST tour) is at most a factor of 2 greater than the cost of H* (the optimal tour), and so ρ(n) = 2
o the minimization ratio is C/C* = c(H)/c(H*) ≤ 2
We can say that the preorder MST version of TSP is a 2-approximation algorithm
o which indicates that it is a good approximation (for the example, 19.074/14.715 ≈ 1.30)

109 10.2. Greedy Algorithms
Greedy algorithms progress by making locally optimal decisions.
Greedy algorithms are not always guaranteed to find the best (globally optimal) answer, but they often return good approximations. (Some, such as Prim's algorithm for minimum spanning trees, are optimal.)

110 242-535 ADA: 17. P and NP110 Greedy Knapsack Algorithm
def knapsack_greedy(items, maxweight):
    result = []
    weight = 0
    remaining = list(items)  # copy, so each item is used at most once (0/1 knapsack)
    while True:
        weightleft = maxweight - weight
        bestitem = None
        for item in remaining:  # grab the most valuable item that still fits
            if item.weight <= weightleft and (bestitem is None or item.value > bestitem.value):
                bestitem = item
        if bestitem is None:
            break
        result.append(bestitem)
        remaining.remove(bestitem)
        weight += bestitem.weight
    return result
Running time is O(n^2)

111 242-535 ADA: 17. P and NP111 No (but close). Proof by counterexample: Consider the input items = {, } maxweight = 3 Greedy algorithm picks { } o value = 110, but {, “silver”>} has weight <= 3 and value = 180 Is Greedy Knapsack Optimal?
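A counterexample of this kind can be run directly. The sketch below is illustrative only: the item list on the slide did not survive, so the three items here (names, weights and values) are hypothetical, chosen so that greedy reaches a value of 110 while a better choice reaches 180 with maxweight = 3. The brute-force function is also an addition for comparison and is exponential in the number of items.

```python
from collections import namedtuple
from itertools import combinations

# Hypothetical item type: the slides only use .weight and .value.
Item = namedtuple("Item", ["name", "weight", "value"])


def knapsack_greedy(items, maxweight):
    """Repeatedly take the most valuable remaining item that still fits."""
    result, weight = [], 0
    remaining = list(items)
    while True:
        fitting = [it for it in remaining if it.weight <= maxweight - weight]
        if not fitting:
            break
        best = max(fitting, key=lambda it: it.value)
        result.append(best)
        remaining.remove(best)
        weight += best.weight
    return result


def knapsack_brute_force(items, maxweight):
    """Try every subset and keep the most valuable one that fits."""
    best = ()
    for r in range(len(items) + 1):
        for subset in combinations(items, r):
            fits = sum(it.weight for it in subset) <= maxweight
            if fits and sum(it.value for it in subset) > sum(it.value for it in best):
                best = subset
    return list(best)


# Hypothetical input (an assumption, not the slide's original data).
items = [Item("gold", 3, 110), Item("silver", 2, 100), Item("bronze", 1, 80)]
greedy = knapsack_greedy(items, 3)
optimal = knapsack_brute_force(items, 3)
print("greedy: ", [it.name for it in greedy], sum(it.value for it in greedy))
print("optimal:", [it.name for it in optimal], sum(it.value for it in optimal))
```

Greedy takes the single most valuable item, which fills the knapsack, whereas the two cheaper items together are worth more.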

112 242-535 ADA: 17. P and NP112 Decidable problems are often split into three categories: o Tractable problems (polynomial) o NPC problems o Intractable problems (exponential) sometimes grouped in the set EXP All of the above have algorithmic solutions, even if they have impractical running times. o All of these are placed in a group called R o R = { set of problems solvable in finite time } 11. Decidable Problems

113 242-535 ADA: 17. P and NP113 An undecidable problem is a decision problem for which it is impossible to construct a single algorithm that always leads to a correct yes-or-no answer. Any candidate algorithm fails on some inputs: o the answer may be wrong, or o no answer may appear, no matter how long you wait No correct algorithm can be written for an undecidable problem. 12. Undecidable Problems

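114 242-535 ADA: 17. P and NP114 Computational Difficulty [Figure: a line of increasing computational difficulty, divided into decidable and undecidable regions. Classes shown: P, NP (P =? NP), EXP, R, NP-hard, NP-complete, EXP-hard, EXP-complete. Example problems marked: Tetris, etc.; chess; the halting problem.]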

115 242-535 ADA: 17. P and NP115 Decide whether a given program with some input finishes, or continues to run forever. The halting problem is undecidable. o this means that a general algorithm to solve the halting problem for all possible program-input pairs cannot exist 12.1. The Halting Problem

116 242-535 ADA: 17. P and NP116 No. It's very difficult for a human to decide if a real-size program will actually finish. Here's a very simple bit of code. Does it finish for all inputs?
read n;
while (n != 1) {
    if (even(n)) n = n/2;
    else n = 3*n + 1;
    print n;
}
Isn't Detecting Halting Easy?

117 242-535 ADA: 17. P and NP117 The maths behind this function goes by various names, including the Collatz conjecture. o The conjecture is that no matter what positive number you start with, you will always eventually reach 1. Examples o start with n = 6, the sequence is 6, 3, 10, 5, 16, 8, 4, 2, 1. o n = 11, sequence is 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1 These sequences are referred to as hailstone sequences.

118 242-535 ADA: 17. P and NP118 The sequence for n = 27 loops 111 times, climbing to over 9000 before descending to 1
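The hailstone sequences above can be reproduced with a few lines of Python. This is a small sketch rather than the slides' own code: the function name hailstone is an assumption, and unlike the loop on slide 116 it records the starting value as well as each later value.

```python
def hailstone(n):
    """Return the hailstone sequence from positive integer n down to the first 1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    seq = [n]
    while n != 1:
        # Halve even numbers; map odd numbers to 3n + 1.
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        seq.append(n)
    return seq


for start in (6, 11, 27):
    seq = hailstone(start)
    print(start, "steps:", len(seq) - 1, "peak:", max(seq))
```

Note that the function terminates for every starting value anyone has tried, but nobody has proved that it terminates for all of them, which is exactly the point of the example.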

119 242-535 ADA: 17. P and NP119 The conjecture is a famous unsolved problem in Mathematics. Paul Erdős said "Mathematics is not yet ripe for such problems" but offered $500 for its solution. The Wikipedia page is very informative o

120 242-535 ADA: 17. P and NP120 Let's assume there is a halt(String p, String i) function which solves the halting problem. It takes two inputs: an encoded representation of a program as text, and the program's input data (also as text). halt() returns true or false depending on whether or not p will halt when executed on the input i. [Diagram: p and i go into halt(), which outputs true if p halts on i, and false if p does not halt on i] Why the Halting Problem is Undecidable

121 242-535 ADA: 17. P and NP121 Examples
halt("def sqr(x): return x * x", "sqr(4)") returns true
halt("def fac(n): if (n == 0) return 1 else return n * fac(n-1)", "fac(10)") returns true
halt("def foo(n): while (true) { n = n + 1}", "foo(4)") returns false

122 242-535 ADA: 17. P and NP122 Let's create a 2nd function called trouble(String q ). It creates two copies of its input q, and passes them as arguments to halt(). o trouble(q) calls halt(q, q) This means that q is used as both the program and its data in halt() Now for Trouble

123 242-535 ADA: 17. P and NP123 If halt(q,q) outputs false, then trouble() returns true and finishes. If halt(q,q) outputs true, then trouble() goes into an unending loop (and thus does not finish). [Diagram: q goes into trouble(), which returns true and finishes when halt(q,q) is false, or never finishes when halt(q,q) is true] Implementing trouble()

124 242-535 ADA: 17. P and NP124
boolean trouble(String q) {
    if (!halt(q, q))
        return true;      // halt() says q does not halt on q, so finish at once
    else {                // halt(q, q) is true
        while (true) {    // loop forever
            // do nothing
        }
        // execution never gets here, so no return statement is needed
    }
}

125 242-535 ADA: 17. P and NP125 The input string q to trouble() can be anything, so let's encode the trouble() function itself as a string called t, and use that as input. What is the result of trouble(t) ? There are two possibilities: either trouble() returns true (i.e. halts), or loops forever (it does not halt). Let's look at both cases. Double Trouble

126 242-535 ADA: 17. P and NP126 1. trouble(t) halts which means that halt(t,t) returned false. But halt() is saying that trouble(t) does not halt. Contradiction. 2. trouble(t) does not halt which means that halt(t,t) returned true. But halt() is saying that trouble(t) does halt. Contradiction Two Cases

127 242-535 ADA: 17. P and NP127 In both cases, there's a logical contradiction because of halt(). This means that it is possible to construct a program/input pair for halt() which it processes incorrectly. This means that there is no halt() algorithm which consistently works for all inputs. o the Halting Problem is undecidable Contradictions are Deadly
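The two cases can be acted out in Python against any concrete candidate for halt(). The sketch below makes several assumptions that are not in the slides: programs are passed around as Python functions instead of encoded strings, the candidate halt is deterministic and always returns, and the names make_trouble and refute are invented for this illustration. It does not prove undecidability by itself; it only shows how trouble() defeats whichever candidate it is given.

```python
def make_trouble(halt):
    """Build trouble() for a given candidate decider halt(p, i)."""
    def trouble(q):
        if halt(q, q):
            while True:  # halt says q halts on q, so loop forever
                pass
        return True      # halt says q does not halt on q, so finish at once
    return trouble


def refute(halt):
    """Explain why the candidate halt() is wrong about trouble(t)."""
    trouble = make_trouble(halt)
    prediction = halt(trouble, trouble)
    if prediction:
        # Case 2: actually running trouble(trouble) would never return,
        # so we only reason about it instead of calling it.
        return "halt says trouble(t) halts, but trouble(t) loops forever"
    # Case 1: safe to run, because halt(trouble, trouble) is False
    # and trouble therefore returns immediately.
    assert trouble(trouble) is True
    return "halt says trouble(t) does not halt, but trouble(t) halted"


# Two deliberately naive candidates (hypothetical, for illustration only).
print(refute(lambda p, i: True))
print(refute(lambda p, i: False))
```

Whatever answer a candidate gives for the pair (trouble, trouble), the behaviour of trouble is the opposite, which mirrors the contradiction in the two cases.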

128 242-535 ADA: 17. P and NP128 In 1961, Wang conjectured that if a finite set of tiles can tile the plane, then there exists a periodic tiling o i.e. a sort of wallpaper pattern is possible o if true, this would give an algorithm for deciding whether any given finite set of tiles can tile the plane But in 1966, Robert Berger proved that no such algorithm exists o he used the undecidability of the halting problem to imply the undecidability of this tiling problem 'Practically' this means that there are finite sets of Wang tiles that tile the plane, but only aperiodically o i.e. the pattern never repeats 12.2. Wang Tiling

129 242-535 ADA: 17. P and NP129 Example [Figures: a set of 13 Wang tiles that tile aperiodically; an example of an aperiodic tiling]

130 242-535 ADA: 17. P and NP130 Wang tiles have become a popular tool for procedural synthesis of textures, heightfields, and other large and nonrepeating data sets. o e.g. for decorating 3D gaming scenes A small set of precomputed or hand-made source tiles can be assembled very cheaply without too obvious repetitions and without periodicity. Uses

131 242-535 ADA: 17. P and NP131 A good list can be found at: o The problem categories are mathematical, as you'd expect: o logic, abstract machines, matrices, combinatorial group theory, topology, analysis, others 12.3. Other Undecidable Problems

132 242-535 ADA: 17. P and NP132 13. Computational Difficulty

133 242-535 ADA: 17. P and NP133 Adding memory space and randomization to running time creates new complexity classes o PSPACE is the set of all decision problems that can be solved using a polynomial amount of space o EXPSPACE uses an exponential amount of space o BPP (bounded-error probabilistic polynomial time) uses a probabilistic algorithm in polynomial time, with an error probability of at most 1/3 for all instances o NPSPACE is equivalent to PSPACE, because a deterministic Turing machine can simulate a nondeterministic Turing machine without needing much more space. What!

134 242-535 ADA: 17. P and NP134 co-NP is the class of decision problems whose 'no' instances have counterexamples that can be checked in polynomial time o Equivalently, a problem is in co-NP when its complement (the same question with the yes and no answers swapped) is in NP.

135 242-535 ADA: 17. P and NP135 P, NP, NPC, NP-Hard, EXP, R, EXP-Hard,... PSPACE, NPSPACE, EXPSPACE,... o quite a few complexity classes (sets) o is that all of them? No, the "Complexity Zoo" website currently lists 495 classes o But the "petting zoo" only lists 16. Complexity Classes

136 242-535 ADA: 17. P and NP136 Video talk " Beyond Computation: The P vs NP Problem " Michael Sipser, Dept. of Maths, MIT Oct. 2006 14. Non-Technical P and NP

137 242-535 ADA: 17. P and NP137 The Complexity of Songs Donald Knuth (1977) CACM, 1984, 27 (4) pp. 344–346 o joke article about complexity o analyzes the very low complexity of pop songs o knuth_song_complexity.pdf
