
1 Constrainedness of Search. Toby Walsh, NICTA and UNSW

2 Motivation. Will a problem be satisfiable or unsatisfiable? Will it be hard or easy? How can we develop heuristics for a new problem?

3 Take-home messages. Hard problems are often associated with a phase transition: under-constrained problems are easy, critically constrained problems are hard, and over-constrained problems are easier again.

4 We provide a definition of constrainedness, predict the location of such phase transitions, and show that constrainedness can be measured during search, where we observe a beautiful "knife-edge". We then build heuristics to get off this knife-edge.

5 Let's start with the mother of all NP-complete problems!

6 3-SAT. Where are the hard 3-SAT problems? Sample randomly generated 3-SAT: fix the number of clauses, l, and the number of variables, n; by definition each clause has 3 variables; generate all possible clauses with uniform probability.
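
The instance generator behind such experiments can be sketched in a few lines of Python (a sketch of the standard random 3-SAT model, not the author's actual code; each clause takes 3 distinct variables and negates each with probability 1/2):

    import random

    def random_3sat(n, l, seed=None):
        # l clauses over variables 1..n; each clause is 3 distinct
        # variables, each negated independently with probability 1/2
        rng = random.Random(seed)
        clauses = []
        for _ in range(l):
            vs = rng.sample(range(1, n + 1), 3)
            clauses.append(tuple(v if rng.random() < 0.5 else -v for v in vs))
        return clauses

    # an instance near the hard region l/n = 4.3
    instance = random_3sat(n=100, l=430, seed=0)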

7 Random 3-SAT. Which are the hard instances? Around l/n = 4.3. What happens with larger problems? Why are some dots red and others blue? This is a so-called "phase transition".

8 Random 3-SAT. Varying problem size, n, the complexity peak appears to be largely invariant of the algorithm: complete algorithms like Davis-Putnam, and incomplete methods like local search. What's so special about 4.3?

9 Random 3-SAT. The complexity peak coincides with the satisfiability transition: for l/n < 4.3 problems are under-constrained and SAT; for l/n > 4.3 problems are over-constrained and UNSAT; at l/n = 4.3, problems sit on a "knife-edge" between SAT and UNSAT.

10 Where did this all start? At least as far back as the 60s, with Erdős & Rényi's thresholds in random graphs. Late 80s: pioneering work by Karp, Purdom, Kirkpatrick, Huberman, Hogg, ... The flood gates burst with Cheeseman, Kanefsky & Taylor's IJCAI-91 paper.

11 What do we know about this phase transition? Its shape: a step function in the limit [Friedgut 98]. Its location: theory bounds it below by l/n = 3.42, while experiment puts it at l/n = 4.2.

12 3-SAT phase transition. Lower bounds (hard): analyse an algorithm that almost always solves the problem. Backtracking is hard to reason about, so analyses typically consider algorithms without backtracking; complex branching heuristics are then needed to ensure success, but these are themselves complex to reason about.

13 3-SAT phase transition. Upper bounds (easier): typically by estimating the count of solutions.

14 3-SAT phase transition. Upper bounds (easier): typically by estimating the count of solutions, e.g. with the Markov (or 1st moment) method: for any statistic X, prob(X >= 1) <= E[X].

15 3-SAT phase transition. Upper bounds (easier): typically by estimating the count of solutions, e.g. with the Markov (or 1st moment) method: for any statistic X, prob(X >= 1) <= E[X]. No assumptions about the distribution of X are needed, except that it is non-negative!

16 3-SAT phase transition. Upper bounds (easier): typically by estimating the count of solutions, e.g. with the Markov (or 1st moment) method: for any statistic X, prob(X >= 1) <= E[X]. Let X be the number of satisfying assignments of a 3-SAT problem.

17 3-SAT phase transition. Upper bounds (easier): typically by estimating the count of solutions, e.g. with the Markov (or 1st moment) method: for any statistic X, prob(X >= 1) <= E[X]. Let X be the number of satisfying assignments of a 3-SAT problem. The expected value of X is easily calculated.

18 3-SAT phase transition. Upper bounds (easier): typically by estimating the count of solutions, e.g. with the Markov (or 1st moment) method: for any statistic X, prob(X >= 1) <= E[X]. Let X be the number of satisfying assignments of a 3-SAT problem. E[X] = 2^n * (7/8)^l.

19 3-SAT phase transition. Upper bounds (easier): typically by estimating the count of solutions, e.g. with the Markov (or 1st moment) method: for any statistic X, prob(X >= 1) <= E[X]. Let X be the number of satisfying assignments of a 3-SAT problem. E[X] = 2^n * (7/8)^l. If E[X] < 1, then prob(X >= 1) = prob(SAT) < 1.

20 3-SAT phase transition. Upper bounds (easier): typically by estimating the count of solutions, e.g. with the Markov (or 1st moment) method: for any statistic X, prob(X >= 1) <= E[X]. Let X be the number of satisfying assignments of a 3-SAT problem. E[X] = 2^n * (7/8)^l. If E[X] < 1, then 2^n * (7/8)^l < 1.

21 3-SAT phase transition. Upper bounds (easier): typically by estimating the count of solutions, e.g. with the Markov (or 1st moment) method: for any statistic X, prob(X >= 1) <= E[X]. Let X be the number of satisfying assignments of a 3-SAT problem. E[X] = 2^n * (7/8)^l. If E[X] < 1, then 2^n * (7/8)^l < 1, i.e. n + l log2(7/8) < 0.

22 3-SAT phase transition. Upper bounds (easier): typically by estimating the count of solutions, e.g. with the Markov (or 1st moment) method: for any statistic X, prob(X >= 1) <= E[X]. Let X be the number of satisfying assignments of a 3-SAT problem. E[X] = 2^n * (7/8)^l. If E[X] < 1, then 2^n * (7/8)^l < 1, i.e. n + l log2(7/8) < 0, i.e. l/n > 1/log2(8/7) = 5.19...
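
The arithmetic on these slides is easy to check numerically; a small sketch (the function and variable names are illustrative):

    import math

    def expected_solutions(n, l):
        # E[X] = 2^n * (7/8)^l: each of the 2^n assignments satisfies a
        # random 3-clause with probability 7/8, independently per clause
        return (2 ** n) * (7 / 8) ** l

    print(1 / math.log2(8 / 7))               # 5.19...: the first-moment bound on l/n
    print(expected_solutions(100, 520) < 1)   # True: so prob(SAT) <= E[X] < 1
    print(expected_solutions(100, 430) < 1)   # False: at l/n = 4.3, E[X] is still huge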

23 3-SAT phase transition. Upper bounds (easier): typically by estimating the count of solutions. To get tighter bounds than 5.19, refine the counting argument, e.g. count not all solutions but just those that are maximal under some ordering.

24 Random 2-SAT. 2-SAT is in P (there is a linear-time algorithm), e.g. x1 v x2, -x2 v x3, -x1 v x3, ... Random 2-SAT displays a "classic" phase transition: for l/n < 1, almost surely SAT; for l/n > 1, almost surely UNSAT; complexity peaks around l/n = 1.

25 Phase transitions in P. 2-SAT: l/n = 1. Horn-SAT: transition not "sharp". Arc-consistency: rapid transition in whether the problem can be made arc-consistent, with a peak in the (median) number of checks.

26 Phase transitions above NP. PSpace: QSAT (satisfiability of quantified Boolean formulas), where the variables x1, x2, x3 of a formula such as x1 v x2 & -x1 v x3 are bound by alternating quantifiers.

27 Phase transitions above NP. PSpace-complete: QSAT (SAT of QBF), stochastic SAT, modal SAT. PP-complete: polynomial-time probabilistic Turing machines, counting problems such as #SAT(>= 2^n/2).

28 Exact phase boundaries in NP. Are there any NP phase boundaries known exactly? Random 3-SAT is only known within bounds (its threshold lies above l/n = 3.42). Some exact NP phase boundaries are known, e.g. 1-in-k SAT at l/n = 2/(k(k-1)).

29 Backbone. Variables which take fixed values in all solutions (alias unit prime implicates). Let f_k be the fraction of variables in the backbone. In random 3-SAT: for l/n < 4.3, f_k is vanishing (otherwise adding a clause could make the problem unsat); for l/n > 4.3, f_k > 0. A discontinuity at the phase boundary!

30 Backbone. Search cost is correlated with backbone size: if f_k is non-zero, then it is easy to assign a variable the "wrong" value, and such mistakes are costly if made at the top of the search tree. This is one source of "thrashing" behaviour, which can be tackled with randomization and rapid restarts. Can we adapt algorithms to offer more robust performance guarantees?

31 Backbone. Backbones are observed in structured problems, e.g. quasigroup completion problems (QCP). They are also observed in optimization and approximation problems: coloring, TSP, blocks world planning, ... Can we adapt algorithms to identify and exploit the backbone structure of a problem?

32 2+p-SAT. Morph between 2-SAT and 3-SAT: a fraction p of 3-clauses and a fraction (1-p) of 2-clauses. 2-SAT is polynomial (linear), with a phase boundary at l/n = 1 but no backbone discontinuity there! 2+p-SAT maps from P to NP: for any p > 0, 2+p-SAT is NP-complete.
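
A sketch of a 2+p-SAT generator, in the same style as the random_3sat sketch earlier (here the 3-clause fraction is drawn probabilistically rather than as an exact count, which is an assumption on our part):

    import random

    def random_2p_sat(n, l, p, seed=None):
        # l clauses over variables 1..n: with probability p a clause has
        # length 3, otherwise length 2; literals negated with probability 1/2
        rng = random.Random(seed)
        clauses = []
        for _ in range(l):
            k = 3 if rng.random() < p else 2
            vs = rng.sample(range(1, n + 1), k)
            clauses.append(tuple(v if rng.random() < 0.5 else -v for v in vs))
        return clauses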

33 2+p-SAT phase transition (figure).

34 2+p-SAT phase transition (figure: l/n against p).

35 2+p-SAT phase transition. Lower bound: are the 2-clauses (on their own) UNSAT? N.b. 2-clauses are much more constraining than 3-clauses. For p <= 0.4 the transition occurs at this lower bound: the 3-clauses are not contributing!

36 2+p-SAT backbone. f_k becomes discontinuous for p > 0.4, yet the problem is NP-complete for any p > 0! Search cost shifts from linear to exponential at p = 0.4; similar behaviour is seen with local search algorithms. (Figure: search cost against n.)

37 2+p-SAT trajectories. Input 3-SAT to a SAT solver like Davis-Putnam: REPEAT assign a variable, simplify all unit clauses, leaving a subproblem with a mixture of 2- and 3-clauses. For a number of branching heuristics (e.g. random, ...), assume the subproblems sample uniformly from the 2+p-SAT space; this can be used to estimate runtimes!
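
One way to picture this bookkeeping (an assumed, much-simplified sketch: random branching, unit propagation, no backtracking) is to record, after each assignment, the residual clause-to-variable ratio and the fraction of 3-clauses, i.e. where the subproblem sits in the 2+p-SAT plane:

    import random

    def assign(clauses, lit):
        # drop clauses satisfied by lit, strip the falsified literal -lit
        return [tuple(x for x in c if x != -lit) for c in clauses if lit not in c]

    def trajectory(clauses, n, seed=0):
        rng = random.Random(seed)
        free, points = set(range(1, n + 1)), []
        while clauses and free:
            units = [c[0] for c in clauses if len(c) == 1]
            lit = units[0] if units else rng.choice(sorted(free)) * rng.choice((1, -1))
            clauses = assign(clauses, lit)
            free.discard(abs(lit))
            if any(len(c) == 0 for c in clauses):
                break  # conflict: a real solver would backtrack here
            threes = sum(len(c) == 3 for c in clauses)
            twos = sum(len(c) == 2 for c in clauses)
            if free and (twos or threes):
                points.append((len(clauses) / len(free), threes / (twos + threes)))
        return points  # list of (l/n, p) points down one branch

Fed a random 3-SAT instance near l/n = 4.3 (e.g. from the random_3sat sketch earlier), the recorded points trace a single trajectory of the kind plotted on the next slide.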

38 2+p-SAT trajectories (figure: trajectories through the UNSAT and SAT regions).

39 Beyond 2+p-SAT. Optimization: MAX-SAT. Other decision problems: 2-COL to 3-COL, Horn-SAT to 3-SAT, XOR-SAT to 3-SAT, 1-in-2-SAT to 1-in-3-SAT, NAE-2-SAT to NAE-3-SAT, ...

40 COL. Graph colouring: can we colour a graph so that neighbouring nodes have different colours? In k-COL, only k colours are allowed: 3-COL is NP-complete, 2-COL is in P.
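
That 2-COL is in P is easy to see: a graph is 2-colourable exactly when it is bipartite, which breadth-first search decides in linear time. A sketch:

    from collections import deque

    def two_colourable(n, edges):
        # nodes are 0..n-1; returns True iff the graph is bipartite
        adj = [[] for _ in range(n)]
        for u, v in edges:
            adj[u].append(v)
            adj[v].append(u)
        colour = [None] * n
        for s in range(n):
            if colour[s] is not None:
                continue
            colour[s] = 0
            queue = deque([s])
            while queue:
                u = queue.popleft()
                for v in adj[u]:
                    if colour[v] is None:
                        colour[v] = 1 - colour[u]
                        queue.append(v)
                    elif colour[v] == colour[u]:
                        return False  # odd cycle: not 2-colourable
        return True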

41 Random COL. Sample graphs uniformly with n nodes and e edges. We observe a colourability phase transition: random 3-COL is "sharp", at e/n approx 2.3; BUT random 2-COL is not "sharp". As n -> oo, prob(2-colourable) = 1 at e/n = 0, approx 0.5 at e/n = 0.45, and 0 at e/n = 1.

42 2+p-COL. Morph from 2-COL to 3-COL: a fraction p of nodes may take 3 colours, a fraction (1-p) only 2 colours. Like 2+p-SAT, it maps from P to NP and is NP-complete for any fixed p > 0. Unlike 2+p-SAT, it maps from a coarse to a sharp transition.

43 2+p-COL (figure).

44 2+p-COL sharpness (figure: p = 0.8).

45 2+p-COL search cost (figure).

46 2+p-COL. Sharp transition for p > 0.8; the transition has both coarse and sharp regions for 0 < p < 0.8.

47 Location of the phase boundary. For sharp transitions, like 2+p-SAT: as n -> oo, if l/n = c + epsilon then UNSAT, and if l/n = c - epsilon then SAT. For transitions like 2+p-COL that may be coarse, we identify the start and finish: delta_{2+p} = sup{e/n | prob(2+p-colourable) = 1} and gamma_{2+p} = inf{e/n | prob(2+p-colourable) = 0}.

48 Basic properties. Monotonicity: delta <= gamma. The transition is sharp iff delta = gamma. Simple bounds: delta_{2+p} = 0 for all p < 1, and gamma_2 <= gamma_{2+p} <= min(gamma_3, gamma_2/(1-p)).

49 2+p-COL phase boundary (figure).

50 XOR-SAT. Replace or by xor; XOR k-SAT is in P for all k. Phase transition: XOR 3-SAT has a sharp transition, with bounds on its location given by [Creignou et al 2001]; statistical mechanics puts it at l/n approx 0.92 [Franz et al 2001].

51 XOR-SAT to SAT. Morph from XOR-SAT to SAT: a fraction (1-p) of XOR clauses and a fraction p of OR clauses. NP-complete for all p > 0. The phase transition occurs at 0.92 <= l/n <= min(0.92/(1-p), 4.3). The upper bound appears loose for all p > 0: the polynomial subproblem does not dominate, and the 3-SAT part contributes (cf. 2+p-SAT, 2+p-COL).

52 Other morphs between P and NP. NAE 2+p-SAT: NAE = not all equal; NAE 2-SAT is in P, NAE 3-SAT is NP-complete. 1-in-2+p-SAT: 1-in-k SAT requires exactly one of the k literals to be true; 1-in-2 SAT is in P, 1-in-3 SAT is NP-complete. ...

53 NAE to SAT. Morph between two NP-complete problems: a fraction (1-p) of NAE 3-SAT clauses and a fraction p of 3-SAT clauses. Each NAE 3-SAT clause is equivalent to two 3-SAT clauses: NAE(a,b,c) = or(a,b,c) & or(-a,-b,-c). The NAE 3-SAT phase transition occurs around l/n = 2.1, tantalisingly close to half of 4.2. Can we ignore many of the correlations that this encoding of NAE SAT into SAT introduces?

54 NAE to SAT. Compute an "effective" clause size: consider (1-p)l NAE 3-SAT clauses and pl 3-SAT clauses; these behave like 2(1-p)l 3-SAT clauses and pl 3-SAT clauses, that is, (2-p)l 3-SAT clauses. Hence the effective clause-to-variable ratio is (2-p)l/n. Plot prob(satisfiable) and search cost against (2-p)l/n.

55 NAE to SAT (figure).

56 The real world isn't random? Very true! Can we identify structural features common in real-world problems? Consider graphs met in real-world situations: social networks, electricity grids, neural networks, ...

57 Real versus random. Real graphs tend to be sparse; dense random graphs contain lots of (rare?) structure. Real graphs tend to have short path lengths, as do random graphs. Real graphs tend to be clustered, unlike sparse random graphs. L is the average path length; C is the clustering coefficient (the fraction of neighbours connected to each other, a cliqueness measure); mu, the proximity ratio, is C/L normalised by that of a random graph of the same size and density.
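
These three statistics are easy to measure; a sketch using the networkx library (the proximity-ratio helper is our own, following the definition on this slide, and assumes the graphs involved are connected):

    import networkx as nx

    def proximity_ratio(G, samples=10):
        # mu = (C/L) normalised by the average C/L of random graphs
        # with the same number of nodes and edges
        C = nx.average_clustering(G)
        L = nx.average_shortest_path_length(G)
        rand = []
        for seed in range(samples):
            R = nx.gnm_random_graph(G.number_of_nodes(), G.number_of_edges(), seed=seed)
            if nx.is_connected(R):  # path length is undefined otherwise
                rand.append(nx.average_clustering(R) / nx.average_shortest_path_length(R))
        return (C / L) / (sum(rand) / len(rand))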

58 Small world graphs. Sparse, clustered, short path lengths. Six degrees of separation: Stanley Milgram's famous 1967 postal experiment, recently revived by Watts & Strogatz, and shown to apply to the actors database, the US electricity grid, the neural net of a worm, ...

59 An example. The 1994 exam timetable at Edinburgh University: 59 nodes and 594 edges, so relatively sparse, but it contains a 10-clique (less than a 10^-10 chance in a random graph of the same size and density). The clique totally dominated the cost of solving the problem.

60 Small world graphs. To construct an ensemble of small world graphs, morph between a regular graph (like a ring lattice) and a random graph: with probability p include an edge from the ring lattice, with probability 1-p from the random graph. Real problems often contain similarly structured and stochastic components?
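
A sketch of one reading of this recipe (a hypothetical construction, close in spirit to Watts & Strogatz's rewiring; networkx's watts_strogatz_graph gives a similar ensemble):

    import random
    import networkx as nx

    def small_world_morph(n, k, p, seed=None):
        # start from a ring lattice joining each node to its k nearest
        # neighbours; keep each lattice edge with probability p, otherwise
        # replace it with a uniformly random edge
        rng = random.Random(seed)
        G = nx.empty_graph(n)
        for i in range(n):
            for j in range(1, k // 2 + 1):
                if rng.random() < p:
                    G.add_edge(i, (i + j) % n)
                else:
                    while True:
                        a, b = rng.sample(range(n), 2)
                        if not G.has_edge(a, b):
                            G.add_edge(a, b)
                            break
        return G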

61 Small world graphs. The ring lattice is clustered but has long paths; random edges provide shortcuts without destroying the clustering.

62 Small world graphs (figure).

63 Small world graphs (figure).

64 Colouring small world graphs (figure).

65 Small world graphs. Other bad news: disease spreads more rapidly in a small world. Good news: cooperation breaks out quicker in the iterated Prisoner's Dilemma.

66 Other structural features. It's not just small world graphs that have been studied. High-degree graphs: Barabási et al's power-law model. Ultrametric graphs: Hogg's tree-based model. Numbers following Benford's law: 1 is much more common than 9 as a leading digit! prob(leading digit = i) = log10(1 + 1/i); such clustering makes number partitioning much easier.
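
The Benford distribution is a one-liner to tabulate (note the log is base 10):

    import math

    # probability that the leading digit is i, for i = 1..9
    benford = [math.log10(1 + 1 / i) for i in range(1, 10)]
    print([round(p, 3) for p in benford])
    # [0.301, 0.176, 0.125, 0.097, 0.079, 0.067, 0.058, 0.051, 0.046]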

67 High degree graphs. Degree = number of edges connected to a node. Directed graph: edges have a direction, e.g. web pages = nodes, links = directed edges. In-degree and out-degree: in-degree = links pointing to a page, out-degree = links pointing out of a page.

68 In-degree of the World Wide Web. Power-law distribution: Pr(in-degree = k) = a k^-2.1. Some nodes have very high in-degree, e.g. google.com, ...

69 Out-degree of the World Wide Web. Power-law distribution: Pr(out-degree = k) = a k^-2.7. Some nodes have very high out-degree, e.g. people in SAT.

70 High degree graphs. The World Wide Web, the electricity grid, the citation graph (633,391 out of 783,339 papers have < 10 citations; 64 have > 1000 citations; 1 has 8907 citations), the actors graph (Robert Wagner, Donald Sutherland, ...).

71 High degree graphs. Power law in the degree distribution: Pr(degree = k) = a k^-b, where b is typically around 3. Compare this to random graphs: in the Gnm model (n nodes, m edges chosen uniformly at random) and the Gnp model (n nodes, each edge included with probability p), Pr(degree = k) is a Poisson distribution tightly clustered around the mean.

72 Random versus high degree graphs (figure).

73 Generating high-degree graphs. Grow the graph, preferentially attaching new nodes to old nodes according to their degree: Prob(attach to node j) is proportional to the degree of node j. This gives Prob(degree = k) = a k^-3.
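
A sketch of the growth process (Barabási-Albert style; the number of edges added per new node, m, is an assumed parameter, not something fixed by the slide):

    import random

    def preferential_attachment(n, m, seed=None):
        # grow a graph on nodes 0..n-1: start from a small clique, then join
        # each new node to m existing nodes chosen proportionally to degree
        rng = random.Random(seed)
        edges = [(i, j) for i in range(m + 1) for j in range(i)]
        endpoints = [v for e in edges for v in e]  # each node appears once per edge end
        for new in range(m + 1, n):
            targets = set()
            while len(targets) < m:
                targets.add(rng.choice(endpoints))  # degree-proportional sampling
            for v in targets:
                edges.append((new, v))
                endpoints.extend((new, v))
        return edges

Sampling from the list of edge endpoints is what makes the attachment probability proportional to current degree, which produces the k^-3 tail quoted above.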

74 High-degree = small world? Preferential attachment model: n=16, mu=1; n=64, mu=1.35; n=256, mu=2.12; ... Thus a small world topology for large n!

75 Search on high degree graphs. Random: uniformly hard. Small world: a few long runs. High degree: more uniform, and easier than random.

76 What about numbers? So far we've looked at structural features of graphs. Many problems contain numbers: do we see phase transitions here too?

77 Number partitioning. What's the problem? Dividing a bag of numbers into two so that their sums are as balanced as possible. What problem instances? n numbers, each uniformly chosen from (0, l]; other distributions work too (Poisson, ...).

78 Number partitioning. Identify a measure of constrainedness: more numbers => less constrained; larger numbers => more constrained. We could try some measures out at random (l/n, log(l)/n, log(l)/sqrt(n), ...). Better still, use kappa: an (approximate) theory of constrainedness, based on some simplifying assumptions, e.g. it ignores structural features that cluster solutions together.

79 Theory of constrainedness. Consider the state space searched (see the 10-d hypercube opposite, of the 2^10 possible partitions of 10 numbers into 2 bags). Compute the expected number of solutions, <Sol>; independence assumptions are often useful and harmless!

80 Theory of constrainedness. Constrainedness is given by kappa = 1 - log2(<Sol>)/n, where n is the dimension of the state space. kappa lies in the range [0, infty): kappa = 0 means <Sol> = 2^n, under-constrained; kappa = infty means <Sol> = 0, over-constrained; kappa = 1 means <Sol> = 1, critically constrained, the phase boundary.

81 Phase boundary. Markov inequality: prob(Sol) <= <Sol>. Now, kappa > 1 implies <Sol> < 1. Hence kappa > 1 implies prob(Sol) < 1. The phase boundary typically occurs at values of kappa slightly smaller than kappa = 1, due to skew in the distribution of solutions (e.g. 3-SAT) and non-independence.

82 Examples of kappa. 3-SAT: kappa = l/5.2n, phase boundary at kappa approx 0.83 (i.e. l/n approx 4.3). 3-COL: kappa = e/2.7n, phase boundary at kappa = 0.84. Number partitioning: kappa = log2(l)/n, phase boundary at kappa = 0.96.
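
These formulas drop out of kappa = 1 - log2(<Sol>)/n; a sketch of the calculation (the constants 5.2 and 2.7 follow from the expected solution counts, and the number partitioning line uses the rough <Sol> approximation from the theory):

    import math

    def kappa_3sat(l, n):
        # <Sol> = 2^n (7/8)^l over a state space of dimension n
        return -(l / n) * math.log2(7 / 8)                   # = l / (5.19... n)

    def kappa_3col(e, n):
        # <Sol> = 3^n (2/3)^e over a state space of dimension n log2(3)
        return -(e / n) * math.log2(2 / 3) / math.log2(3)    # = e / (2.7... n)

    def kappa_partition(l, n):
        # perfect partitions: <Sol> is roughly 2^n / l, giving log2(l)/n
        return math.log2(l) / n

    print(kappa_3sat(430, 100))           # about 0.83, near the 3-SAT boundary
    print(kappa_3col(230, 100))           # about 0.85, near the quoted 3-COL boundary of 0.84
    print(kappa_partition(2 ** 96, 100))  # 0.96, the number partitioning boundary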

83 Number partitioning phase transition (figure: prob(perfect partition) against kappa).

84 Finite-size scaling. A simple "trick" from statistical physics: around the critical point, problems are indistinguishable except for a change of scale given by a simple power law. Define the rescaled parameter gamma = (kappa - kappa_c)/kappa_c * n^(1/v), and estimate kappa_c and v empirically; e.g. for number partitioning, kappa_c = 0.96 and v = 1.
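
Rescaling is just a change of variable; a sketch, using the fitted values quoted for number partitioning as defaults:

    def rescaled(kappa, n, kappa_c=0.96, v=1.0):
        # gamma = (kappa - kappa_c) / kappa_c * n^(1/v)
        return (kappa - kappa_c) / kappa_c * n ** (1 / v)

    # plotted against gamma, curves for different n should collapse onto one
    print(rescaled(1.0, 100), rescaled(1.0, 400))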

85 Rescaled phase transition (figure: prob(perfect partition) against gamma).

86 Rescaled search cost (figure: optimization cost against gamma).

87 Easy-Hard-Easy? Is the search cost only easy-hard here? This is optimization, not decision, search cost! It is easy if there are a (large number of) perfect partitions; otherwise there is little pruning (search scales as 2^0.85n). Phase transition behaviour is less well understood for optimization than for decision: sometimes optimization = a sequence of decision problems (e.g. branch & bound), BUT there are lots of subtle issues lurking?

88 Looking inside search (figure: clauses/variables down the search branch).

89 Looking inside search (figure: clauses/variables down the search branch).

90 Looking inside search (figure: clause length down the search branch).

91 Constrainedness knife-edge (figure: kappa down the search branch).

92 Constrainedness knife-edge (figure: kappa down the search branch).

93 Constrainedness knife-edge (figure: a real-world register allocation graph colouring problem).

94 Constrainedness knife-edge (figure: optimisation problems too, e.g. number partitioning).

95 Exploiting the knife-edge. Get off the knife-edge asap, a.k.a. minimise constrainedness. Many existing heuristics can be viewed in this light, e.g. the fail-first heuristic in CSPs, the KK heuristic for number partitioning, ...

96 Exploiting the knife-edge. Get off the knife-edge asap, a.k.a. minimise constrainedness. Many existing heuristics can be viewed in this light, e.g. the fail-first heuristic in CSPs, the KK heuristic for number partitioning, ... This is also a good way to design new heuristics: branch into the subproblem with minimal kappa. The challenge is to compute this efficiently!
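
As an example of a heuristic that greedily reduces constrainedness, here is a sketch of the KK (Karmarkar-Karp) differencing heuristic for number partitioning mentioned above: it repeatedly commits the two largest numbers to opposite bags, leaving only their difference to be placed later.

    import heapq

    def kk_difference(numbers):
        # returns the partition difference found by Karmarkar-Karp differencing
        heap = [-x for x in numbers]   # max-heap via negated values
        heapq.heapify(heap)
        while len(heap) > 1:
            a = -heapq.heappop(heap)        # largest
            b = -heapq.heappop(heap)        # second largest
            heapq.heappush(heap, -(a - b))  # place them in opposite bags
        return -heap[0] if heap else 0

    print(kk_difference([8, 7, 6, 5, 4]))  # 2 (a heuristic: the optimal difference here is 0)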

97 The future? What open questions remain? Where to next?

98 Open questions. Prove that the random 3-SAT transition occurs at l/n = 4.3 (random 2-SAT has been proved to be at l/n = 1, while the random 3-SAT transition is only proved to lie above l/n = 3.42). 2+p-COL: prove that the problem changes around p = 0.8; what happens to the colouring backbone?

99 Open questions. Does phase transition behaviour give insights to help answer P = NP? It certainly identifies hard problems! Problems like 2+p-SAT and ideas like the backbone also show promise. But problems away from the phase boundary can be hard to solve: the over-constrained 3-SAT region has exponential resolution proofs, and the under-constrained 3-SAT region can throw up occasional hard problems (early mistakes?).

100 Summary. That's nearly all from me!

101 Conclusions. Phase transition behaviour is ubiquitous: decision/optimization/..., NP/PSpace/P/..., random/real. Phase transition behaviour and constrainedness give insight into problem hardness: they suggest new branching heuristics, and ideas like the backbone help us understand branching mistakes.

102 Conclusions. Is AI becoming more of an experimental science? Theory and experiment complement each other well, with increasing use of approximate/heuristic theories to keep theory in touch with rapid experimentation. Phase transition behaviour is FUN: lots of nice graphs as promised, and it is teaching us lots about complexity and algorithms!

103 Very partial bibliography
Cheeseman, Kanefsky & Taylor, Where the Really Hard Problems Are, Proc. of IJCAI-91.
Gent et al, The Constrainedness of Search, Proc. of AAAI-96.
Gent & Walsh, The TSP Phase Transition, Artificial Intelligence, 88, 1996.
Gent & Walsh, Analysis of Heuristics for Number Partitioning, Computational Intelligence, 14 (3), 1998.
Gent & Walsh, Beyond NP: The QSAT Phase Transition, Proc. of AAAI-99.
Gent et al, Morphing: Combining Structure and Randomness, Proc. of AAAI-99.
Hogg & Williams (eds), special issue of Artificial Intelligence, 88 (1-2), 1996.
Mitchell, Selman & Levesque, Hard and Easy Distributions of SAT Problems, Proc. of AAAI-92.
Monasson et al, Determining Computational Complexity from Characteristic 'Phase Transitions', Nature, 400, 1999.
Walsh, Search in a Small World, Proc. of IJCAI-99.
Walsh, Search on High Degree Graphs, Proc. of IJCAI.
Walsh, From P to NP: COL, XOR, NAE, 1-in-k, and Horn SAT, Proc. of AAAI.
Watts & Strogatz, Collective Dynamics of Small World Networks, Nature, 393, 1998.

104 Some blatant adverts! 2nd International Optimisation Summer School, Jan 12th to 18th, Kioloa, NSW; it will also cover local search. NICTA Optimisation Research Group: we love having visitors stop by to give talks, or for longer (a week, a month, or sabbaticals!).

