
1 Master Class on Experimental Study of Algorithms Scientific Use of Experimentation Carla P. Gomes Cornell University CPAIOR Bologna, Italy 2010

2 Big Picture of Topics Covered in this Talk
Part I – Understanding computational complexity beyond worst-case complexity: benchmarks and the role of random distributions (Random SAT); typical-case analysis vs. worst-case complexity analysis – phase transition phenomena.
Part II – Understanding runtime distributions of complete search methods: heavy- and fat-tailed phenomena in combinatorial search and restart strategies. Understanding tractable sub-structure: backdoors and tractable sub-structure; formal models of heavy-tails and backdoors; performance of current state-of-the-art solvers on real-world structured problems exploiting backdoors.

3 II - Understanding runtime distributions of complete search methods

4 Outline Complete randomized backtrack search methods Runtime distributions of complete randomized backtrack search methods

5 Complete Randomized Backtrack search methods

6 Exact / Complete Backtrack Methods. Main underlying (search) mechanisms in mathematical programming (MP), constraint programming (CP) and satisfiability: backtrack search; branch & bound; branch & cut; branch & price; the Davis-Putnam-Logemann-Loveland procedure (DPLL) …

7 Branch and Bound – 0-1 knapsack. Maximize 16x1 + 22x2 + 12x3 + 8x4 + 11x5 + 19x6 subject to 5x1 + 7x2 + 4x3 + 3x4 + 4x5 + 6x6 ≤ 14, xj binary for j = 1 to 6. [Figure: branch-and-bound tree branching on x1 = 0 / x1 = 1 and then x2; LP relaxation bounds include 44 and 44 3/7, with fractional solutions such as (1, 3/7, 0, 0, 0, 1) and (0, 1, ¼, 0, 0, 1). No further searching from node 4 (bound 42) because there cannot be a better integer solution.]

8 [Figure: the full branch-and-bound tree for the same knapsack instance – maximize 16x1 + 22x2 + 12x3 + 8x4 + 11x5 + 19x6 subject to 5x1 + 7x2 + 4x3 + 3x4 + 4x5 + 6x6 ≤ 14, xj binary – branching on x1, x2 and x3; node bounds include 44, 43, 42 and 38.]

9 Backtrack Search – Satisfiability. State-of-the-art complete solvers are based on backtrack search procedures (typically with unit propagation, learning, randomization, restarts). Example: (a OR NOT b OR NOT c) AND (b OR NOT c) AND (a OR c)

10 Randomization in Complete Backtrack Search Methods

11 Motivation: Randomization in Local Search. The use of randomization has been very successful in the area of local search / metaheuristics: simulated annealing, genetic algorithms, tabu search, GSAT, WalkSAT and variants. Limitation: the inherent incomplete nature of local search methods – they cannot prove optimality or inconsistency.

12 Randomized Backtrack Search. Goal: explore the addition of a stochastic element to a systematic search procedure without losing completeness. What if we introduce an element of randomness into a complete backtrack search method?

13 Randomized Backtrack Search. Several ways of introducing randomness into a backtrack search method: a simple way – randomly breaking ties in variable and/or value selection (compare with standard lexicographic tie-breaking; see the sketch below); a general framework – imposing a probability distribution over variable/value selection or other search parameters. Note: with simple book-keeping we can maintain the completeness of the backtrack search method.
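A minimal sketch of the "simple way" above – smallest-domain-first with random tie-breaking instead of lexicographic tie-breaking. The representation (a dict mapping variables to their current domains) and the helper name `select_variable` are illustrative, not from the talk:

```python
import random

def select_variable(domains, rng):
    """Smallest-domain-first with random tie-breaking: draw uniformly
    among all unassigned variables whose current domain is smallest,
    instead of always taking the lexicographically first one."""
    smallest = min(len(d) for d in domains.values())
    ties = [v for v, d in domains.items() if len(d) == smallest]
    return rng.choice(ties)

rng = random.Random(42)  # saving the seed keeps runs reproducible
domains = {"x1": {0, 1}, "x2": {0, 1, 2}, "x3": {0, 1}}
print(select_variable(domains, rng))  # x1 or x3, chosen at random
```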

14 Notes on Randomizing Backtrack Search. Lots of opportunities to introduce randomization, basically at the different decision points of backtrack search: variable/value selection; look-ahead / look-back procedures (e.g., when and how to perform domain reduction/propagation, which cuts to add); target backtrack points; restarts. Not necessarily tie-breaking only – more generally, we can define a probability distribution over the set of possible choices at a given decision point.

15 Notes on Randomizing Backtrack Search (cont.). Can we replay a "randomized" run? Yes, since we use pseudo-random numbers: if we save the "seed", we can repeat the run with the same seed. "Deterministic randomization" (Wolfram 2002) – the behavior of some very complex deterministic systems is so unpredictable that it actually appears to be random (e.g., adding nogoods or cutting constraints between restarts, as used in the satisfiability community). What if we cannot randomize the code? Randomize the input: randomly rename the variables (Motwani and Raghavan 95). Walsh (99) applied this technique to study the runtime distributions of graph-coloring using a deterministic algorithm based on DSATUR implemented by Trick. (A toy sketch follows.)
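A toy sketch of randomizing the input rather than the code: randomly renaming the variables of a CNF formula. The encoding (clauses as lists of signed integers) is an assumption for illustration:

```python
import random

def rename_cnf(clauses, rng):
    """Apply a random permutation to the variable names of a CNF formula
    (lists of signed ints), preserving satisfiability."""
    variables = sorted({abs(l) for c in clauses for l in c})
    shuffled = variables[:]
    rng.shuffle(shuffled)
    perm = dict(zip(variables, shuffled))
    return [[perm[abs(l)] * (1 if l > 0 else -1) for l in c] for c in clauses]

rng = random.Random(7)  # a saved seed makes the "randomized" run replayable
print(rename_cnf([[1, -2, -3], [2, -3], [1, 3]], rng))
```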

16 Runtime Distributions of Complete Randomized Backtrack search methods

17 Backtrack Search Two Different Executions ( a OR NOT b OR NOT c ) AND ( b OR NOT c) AND ( a OR c)

18 Size of Search Trees in Backtrack Search. The size of the search tree varies dramatically depending on the order in which we pick the variables to branch on – it is important to choose good heuristics for variable/value selection.

19 Runtime distributions of complete randomized backtrack search methods. When solving instances of a combinatorial problem such as satisfiability or an integer program with a complete randomized search method such as backtrack search or branch and bound, the runtime of the method on a single instance (i.e., across several runs of the same complete randomized procedure on the same instance) exhibits very high variance.

20 Randomized Backtrack Search – Latin Square (Order 4). [Table: runtimes of individual runs on the same instance; (*) marks runs that reached the cutoff of 2000 without finding a solution.]

21 Erratic Behavior of Sample Mean. [Figure: sample mean vs. number of runs (500 to 2000); the median is 1, yet the sample mean hovers around 3500.]

22 Heavy-Tailed Distributions: … infinite variance … infinite mean. Introduced by Pareto in the 1920's as a "probabilistic curiosity." Mandelbrot established the use of heavy-tailed distributions to model real-world fractal phenomena. Examples: the stock market, earthquakes, weather, …

23 The Pervasiveness of Heavy-Tailed Phenomena in Economics, Science, Engineering, and Computation. Examples: the 2004 tsunami; the blackout of August 14th, 2003 (> 50 million people affected); financial markets with huge crashes (… there are a few billionaires); backtrack search.

24 [Figure: decay of distributions – a standard distribution (finite mean & variance) with exponential decay vs. power-law decay.]

25 Decay of Heavy-tailed Distributions. Standard – exponential decay, e.g. Normal: P[X > x] ≈ C·e^(−x²/2). Heavy-tailed – power-law decay, e.g. Pareto-Levy: P[X > x] ≈ C·x^(−α), 0 < α < 2.

26 Normal, Cauchy, and Levy. [Figure: densities of the Normal (exponential decay), Cauchy (power-law decay), and Levy (power-law decay) distributions.]

27 Tail Probabilities (Standard Normal, Cauchy, Levy). [Table/figure comparing the tail probabilities of the three distributions.]

28 Fat-tailed distributions. Kurtosis = (fourth central moment) / (second central moment)², i.e. μ4 / σ⁴. The Normal distribution has kurtosis 3; a distribution is fat-tailed when its kurtosis is > 3 (e.g., exponential, lognormal). A quick numeric check follows.
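A quick numeric check of this classification; `kurtosis` is a hypothetical helper implementing the ratio above (NumPy assumed):

```python
import numpy as np

def kurtosis(samples):
    """Kurtosis = fourth central moment / (second central moment)**2;
    ~3 for Normal data, > 3 for fat-tailed samples."""
    x = np.asarray(samples, dtype=float)
    m2 = np.mean((x - x.mean()) ** 2)   # variance (2nd central moment)
    m4 = np.mean((x - x.mean()) ** 4)   # 4th central moment
    return m4 / m2 ** 2

rng = np.random.default_rng(0)
print(kurtosis(rng.normal(size=100_000)))       # close to 3
print(kurtosis(rng.exponential(size=100_000)))  # close to 9: fat-tailed
```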

29 Fat- and heavy-tailed distributions. Standard distributions, e.g. Normal, Lognormal, Exponential, have exponentially decaying tails. Heavy-tailed distributions have power-law decay, e.g. Pareto-Levy: P[X > x] ≈ C·x^(−α), 0 < α < 2.

30 Pareto Distribution. Density function: f(x) = P[X = x] = α / x^(α+1) for x ≥ 1. Distribution function: F(x) = P[X ≤ x] = 1 − 1/x^α for x ≥ 1. Survival function (tail probability): S(x) = 1 − F(x) = P[X > x] = 1/x^α for x ≥ 1, where α > 0 is a shape parameter.
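Since F(x) = 1 − x^(−α) has a closed-form inverse, Pareto samples can be drawn by inverse-transform sampling; a small sketch (NumPy assumed, names illustrative):

```python
import numpy as np

def pareto_sample(alpha, size, rng):
    """Inverse-transform sampling: F(x) = 1 - x**(-alpha) for x >= 1,
    so x = (1 - u)**(-1/alpha) for u uniform on [0, 1)."""
    u = rng.uniform(size=size)
    return (1.0 - u) ** (-1.0 / alpha)

rng = np.random.default_rng(1)
x = pareto_sample(alpha=1.0, size=1_000_000, rng=rng)
print(np.median(x))  # ~2: the median is 2**(1/alpha), as stated on slide 32
print(np.mean(x))    # erratic sample mean: the true mean is infinite
```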

31 Pareto Distribution Moments. E(X^n) = α / (α − n) if n < α; E(X^n) = ∞ if n ≥ α. Mean: E(X) = α / (α − 1) if α > 1; E(X) = ∞ if α ≤ 1. Variance: var(X) = α / [(α − 1)²(α − 2)] if α > 2; var(X) = ∞ if α ≤ 2.

32 Pareto, α = 1. [Figure: density f(x) and distribution function F(x) = P[X ≤ x].] Infinite mean and infinite variance; median = 2.

33 Pareto, α = 2. [Figure: density f(x) and distribution function F(x).] Finite mean and infinite variance; median = 1.414.

34 Pareto, α = 3: finite mean and finite variance, but infinite 3rd and higher moments.

35 How to Visually Check for Heavy-Tailed Behavior: the log-log plot of the tail of the distribution exhibits linear behavior.

36 How to Check for "Heavy Tails"? The log-log plot of the tail of the distribution should be approximately linear. The slope gives the value of α: α ≤ 1 implies infinite mean and infinite variance; 1 < α ≤ 2 implies finite mean but infinite variance.
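One way to turn the visual check into a number: regress log S(x) on log x over the tail of the empirical survival function. The slope estimator and the 5% tail cutoff below are illustrative choices, not from the talk:

```python
import numpy as np

def tail_slope(samples, tail_fraction=0.05):
    """Fit a line to the log-log plot of the empirical survival function
    S(x) = 1 - F(x) over the tail; if S(x) ~ x**(-alpha), the slope of
    log S(x) against log x is -alpha."""
    x = np.sort(np.asarray(samples, dtype=float))
    n = len(x)
    survival = 1.0 - np.arange(1, n + 1) / (n + 1.0)  # empirical 1 - F(x)
    k = int(n * (1.0 - tail_fraction))                # keep only the tail
    slope, _ = np.polyfit(np.log(x[k:]), np.log(survival[k:]), 1)
    return -slope

rng = np.random.default_rng(2)
u = rng.uniform(size=200_000)
print(tail_slope((1.0 - u) ** -1.0))  # Pareto(alpha=1): estimate near 1
```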

37 Pareto(1) vs. Lognormal(1,1). [Figure: densities f(x) of Pareto with α = 1 (infinite mean and infinite variance) and Lognormal(1,1).]

38 Survival Function: Pareto and Lognormal

39 Example of a Heavy-Tailed Model – Random Walk: start at position 0; toss a fair coin: with each head take a step up (+1), with each tail take a step down (−1). X – the number of steps the random walk takes to return to position 0.
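A short simulation of X for the fair-coin walk described above; the cap and seed are illustrative. The return-time tail decays like x^(−1/2), so the sample median stays tiny while a visible fraction of walks is still out at any cutoff:

```python
import random

def return_time(rng, cap=10_000):
    """Steps a fair +1/-1 walk takes to first return to 0, capped at
    `cap` (None marks walks that have not yet returned)."""
    position = 0
    for steps in range(1, cap + 1):
        position += 1 if rng.random() < 0.5 else -1
        if position == 0:
            return steps
    return None

rng = random.Random(3)
times = [return_time(rng) for _ in range(10_000)]
done = sorted(t for t in times if t is not None)
print(done[len(done) // 2])                        # median return time: 2
print(sum(t is None for t in times) / len(times))  # ~1% censored: P[X > x] ~ x**(-1/2)
```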

40 The record of 10,000 tosses of an ideal coin (Feller). [Figure: the walk's trajectory, marking zero crossings and long periods without a zero crossing.]

41 Random Walk: Heavy-tails vs. Non-Heavy-Tails. [Figure: log-log plot of the unsolved fraction 1 − F(x) against X, the number of steps the walk takes to return to zero (log scale). The walk's return time is compared with Normal(2,1) and Normal(2,1000000); the median is 2 in all cases, and even for the normal with variance 10^6 only 0.1% of the mass lies above 200,000, while the heavy-tailed walk has a far longer tail.]

42 Heavy-Tailed Behavior in the Quasigroup Completion Problem Domain. [Figure: log-log plot of the unsolved fraction 1 − F(x) vs. the number of backtracks; the approximately linear tail implies an infinite mean. Curves range from 18% unsolved to 0.002% unsolved.]

43 To Be or Not To Be Heavy-Tailed Gomes, Fernandez, Selman, Bessiere – CP 04

44 Heavy-Tailed Behavior in the Quasigroup Completion Problem Domain. [Figure (repeated): log-log plot of the unsolved fraction 1 − F(x) vs. the number of backtracks; the approximately linear tail implies an infinite mean. Curves range from 18% unsolved to 0.002% unsolved.]

45 Research Questions (concrete CSP models, complete randomized backtrack search): 1. Can we provide a characterization of heavy-tailed behavior: when it occurs and when it does not? 2. Can we identify different tail regimes across different constrainedness regions? 3. Can we get further insights into the tail regime by analyzing the concrete search trees produced by the backtrack search method?

46 Scope of Study Random Binary CSP Models Encodings of CSP Models Randomized Backtrack Search Algorithms Search Trees Statistical Tail Regimes Across Constrainedness Regions –Empirical Results –Theoretical Model

47 Binary Constraint Networks. A finite binary constraint network P = (X, D, C): a set of n variables X = {x1, x2, …, xn}; for each variable, a finite domain: D = {D(x1), D(x2), …, D(xn)}; a set C of binary constraints between pairs of variables, where a constraint Cij on the ordered pair of variables (xi, xj) is a subset of the Cartesian product D(xi) × D(xj) that specifies the allowed combinations of values for xi and xj. A solution to the constraint network is an instantiation of the variables such that all constraints are satisfied.

48 Random Binary CSP Models. Model B (Gent et al 1996): N – number of variables; D – size of the domains; c – number of constrained pairs of variables, c = p1·N(N−1)/2, where p1 is the proportion of binary constraints included in the network; t – tightness of constraints, t = p2·D², where p2 is the proportion of forbidden tuples. Model E (Achlioptas et al 2000): N – number of variables; D – size of the domains; p – proportion of forbidden pairs (out of D²·N(N−1)/2). N ranges from 15 to 50 (Xu and Li 2000).

49 Typical Case Analysis: Beyond NP-Completeness. [Figure: computational cost (mean) and % of solvable instances as a function of constrainedness; the phase transition phenomenon discriminates "easy" vs. "hard" instances (Hogg et al 96).]

50 Encodings: direct CSP binary encoding; satisfiability encoding (direct encoding).

51 Backtrack Search Algorithms. Look-ahead performed: no look-ahead (simple backtracking, BT); removal of values directly inconsistent with the last instantiation performed (forward checking, FC); arc consistency and propagation (maintaining arc consistency, MAC). Different heuristics for variable selection (the next variable to instantiate): random (random); variables pre-ordered by decreasing degree in the constraint graph (deg); smallest domain first, ties broken by decreasing degree (dom+deg). Different heuristics for value selection: random; lexicographic. For the SAT encodings we used the simplified Davis-Putnam-Logemann-Loveland procedure, with static and random variable/value selection.

52 Inconsistent Subtrees

53 Distributions. Runtime distributions of the backtrack search algorithms; distribution of the depth of the inconsistent subtrees found during the search. All runs were performed without censorship.

54 Main Results: 1 – runtime distributions; 2 – inconsistent sub-tree depth distributions. Dramatically different statistical regimes across the constrainedness regions of the CSP models.

55 Runtime distributions

56 Distribution of Depth of Inconsistent Subtrees

57 Depth of Inconsistent Search Tree vs. Runtime Distributions

58 Other Models and More Sophisticated Consistency Techniques. [Figure: BT vs. MAC on Model B.] There are heavy-tailed and non-heavy-tailed regions. As the "sophistication" of the algorithm increases, the heavy-tailed region extends to the right, getting closer to the phase transition.

59 SAT encoding: DPLL

60 To Be or Not To Be Heavy-tailed: Summary of Results. 1. As constrainedness increases, there is a change from a heavy-tailed to a non-heavy-tailed regime, for both models (B and E), CSP and SAT encodings, and the different backtrack search strategies.

61 To Be or Not To Be Heavy-tailed: Summary of Results. 2. The threshold from the heavy-tailed to the non-heavy-tailed regime is dependent on the particular search procedure; as the efficiency of the search method increases, the extension of the heavy-tailed region increases: the heavy-tailed threshold gets closer to the phase transition.

62 To Be or Not To Be Heavy-tailed: Summary of Results. 3. Distribution of the depth of inconsistent search sub-trees: an exponentially distributed inconsistent sub-tree depth (ISTD), combined with exponential growth of the search space as the tree depth increases, implies heavy-tailed runtime distributions. As the ISTD distributions move away from the exponential distribution, the runtime distributions become non-heavy-tailed. The theoretical model fits the data nicely!

63 Theoretical Model

64 Depth of Inconsistent Search Tree vs. Runtime Distributions: Theoretical Model. X – search cost (runtime); ISTD – depth of an inconsistent sub-tree; P_istd[ISTD = N] – probability of finding an inconsistent sub-tree of depth N during search; P[X > x | ISTD = N] – probability of the search cost being larger than x, given an inconsistent tree of depth N.

65 Depth of Inconsistent Search Tree vs. Runtime Distributions: Theoretical Model. See the paper for proof details.

66 Regressions for B1, B2, k. [Figures: regression for B1 and B2; regression for k.]

67 Validation: Theoretical Model vs. Runtime Data. α = 0.26 using the model; α = 0.27 using the runtime data.

68 To Be or Not To Be Heavy-tailed: Summary of Results. 1. As constrainedness increases, there is a change from a heavy-tailed to a non-heavy-tailed regime, for both models (B and E), CSP and SAT encodings, and the different backtrack search strategies.

69 To Be or Not To Be Heavy-tailed: Summary of Results. 2. The threshold from the heavy-tailed to the non-heavy-tailed regime is dependent on the particular search procedure; as the efficiency of the search method increases, the extension of the heavy-tailed region increases: the heavy-tailed threshold gets closer to the phase transition.

70 To Be or Not To Be Heavy-tailed: Summary of Results. 3. Distribution of the depth of inconsistent search sub-trees: an exponentially distributed inconsistent sub-tree depth (ISTD), combined with exponential growth of the search space as the tree depth increases, implies heavy-tailed runtime distributions. As the ISTD distributions move away from the exponential distribution, the runtime distributions become non-heavy-tailed.

71 Exploiting Heavy-Tailed Behavior: Restarts

72 Fat- and heavy-tailed behavior has been observed in several domains: quasigroup completion problems, graph coloring, planning, scheduling, circuit synthesis, decoding, etc.

73 How to avoid the long runs? Use restarts or parallel / interleaved runs to exploit the extreme variance in performance. Restarts provably eliminate heavy-tailed behavior (Gomes et al. 97, 98, 2000).

74 Super-linear Speedups. [Figure: 10 runs; one run is solved after 1 second while the others are still going.] Sequential: 50 + 1 = 51 seconds. Parallel (10 machines): 1 second – a 51x speedup. Interleaved (1 machine): 10 × 1 = 10 seconds – a 5x speedup.

75 Restarts. [Figure: log-log plot of the unsolved fraction 1 − F(x) vs. the number of backtracks, with and without restarts. With no restarts, 70% unsolved; restarting every 4 backtracks solves the instance within 250 backtracks (62 restarts), with 0.001% unsolved.]

76 Example of Rapid Restart Speedup (planning). [Figure: total number of backtracks (log scale) vs. restart cutoff (log scale); marked points include a cutoff of 20 with ~100 restarts and a cutoff of 2000 with ~10 restarts, with costs ranging up to 100,000 backtracks.]

77 Sketch of proof of elimination of heavy tails. Let's truncate the search procedure after m backtracks. Probability of solving the problem with the truncated version: p_m = P[X ≤ m] > 0. Run the truncated procedure and restart it repeatedly.

78 Y – the total cost of the restarted procedure – does not have heavy tails: the number of restarts until success is geometrically distributed (P[more than i restarts] = (1 − p_m)^i), so the tail of Y decays exponentially.
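A minimal simulation of this argument, using a Pareto(α = 1) run length as a stand-in for a heavy-tailed solver; the cutoff value, sampler, and names are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)

def run_length():
    """Stand-in for one run of a randomized solver: Pareto(alpha = 1)
    backtracks, so the mean of a single un-restarted run is infinite."""
    return (1.0 - rng.uniform()) ** -1.0

def solve_with_restarts(cutoff):
    """Restart after `cutoff` backtracks until a run finishes in time.
    The number of restarts is geometric, so the total cost has an
    exponentially decaying (not heavy) tail."""
    total = 0.0
    while True:
        t = run_length()
        if t <= cutoff:
            return total + t
        total += cutoff

costs = [solve_with_restarts(cutoff=10.0) for _ in range(100_000)]
print(np.mean(costs))  # finite and stable, unlike the un-restarted mean
```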

79 Restart Strategies. Restart with increasing cutoff – e.g. used by the satisfiability community; the cutoff increases linearly. Randomized backtracking (Lynce et al 2001) – randomizes the target decision points when backtracking (several variants). Random jumping (Zhang 2002) – the solver randomly jumps to unexplored portions of the search space; jumping decisions are based on analyzing the ratio between the space searched and the remaining search space; it solved several open problems in combinatorics. Geometric restarts (Walsh 99) – the cutoff is increased geometrically. Learning restart strategies (Kautz et al 2001; Ruan et al 2002) – results on optimal policies for restarts under particular scenarios; a huge area for further research. Universal restart strategies (Luby et al 93) – seminal paper on optimal restart strategies for Las Vegas algorithms (theoretical paper); a sketch of the Luby cutoff sequence follows. Current state-of-the-art SAT solvers use restarts!
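For reference, the cutoff sequence from Luby et al 93 is 1, 1, 2, 1, 1, 2, 4, 1, 1, 2, … ; the recursive formulation below is a standard way to compute it:

```python
def luby(i):
    """i-th cutoff (1-indexed) of the Luby sequence 1, 1, 2, 1, 1, 2, 4, ...
    If i = 2**k - 1 the term is 2**(k-1); otherwise recurse into the
    earlier block the index falls into."""
    k = 1
    while (1 << k) - 1 < i:              # smallest k with i <= 2**k - 1
        k += 1
    if i == (1 << k) - 1:
        return 1 << (k - 1)
    return luby(i - (1 << (k - 1)) + 1)

print([luby(i) for i in range(1, 16)])
# [1, 1, 2, 1, 1, 2, 4, 1, 1, 2, 1, 1, 2, 4, 8]
```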

80 III - Understanding Tractable Sub-Structure in Combinatorial Problems

81 Backdoors

82 Defying NP-Completeness. Current state-of-the-art complete (exact) solvers can handle very large instances of hard combinatorial problems: we are dealing with formidable search spaces of exponential size – to prove optimality we have to implicitly search the entire search space – yet the problems we are able to solve are much larger than worst-case complexity would predict, given that such problems are in general NP-complete or harder. Example – a random unsat 3-SAT formula in the phase transition region with over 1,000 variables cannot be solved, while real-world sat and unsat instances with over 100,000 variables are solved in a few minutes.

83 A “real world” example

84 Bounded Model Checking instance: i.e., ((not x1) or x7) and ((not x1) or x6) and … etc.

85 10 pages later: … (x177 or x169 or x161 or x153 … or x17 or x9 or x1 or (not x185)) – the clauses / constraints are getting more interesting…

86 4000 pages later: … a 59-literal clause… !!!

87 Finally, 15,000 pages later: … The Chaff SAT solver (Princeton) solves this instance in less than one minute. What makes this possible?

88 Inference and Search. Inference at each node of the search tree: MIP uses LP relaxations and cutting planes; CP and SAT use domain reduction, constraint propagation and no-good learning. Search: different search enhancements in terms of variable and value selection strategies, probing, randomization, etc., while guaranteeing the completeness of the search procedure.

89 Tractable Problem Sub-structure. Real-world problems are characterized by hidden tractable substructure. Can we make this more precise? We consider particular structures we call backdoors.

90 Backdoors

91 BACKDOORS: a subset of "critical" variables such that, once assigned values, the instance simplifies to a tractable class. Real-world problems are characterized by hidden tractable substructure. Backdoors, intuitively, explain how a solver can get "lucky" and solve very large instances.

92 Backdoors to tractability. Informally: a backdoor to a given problem is a subset of its variables such that, once assigned values, the remaining instance simplifies to a tractable class (not necessarily syntactically defined). Formally: we define the notion of a "sub-solver" (which handles tractable substructure of a problem instance), and then backdoors and strong backdoors.

93 Backdoors: General Idea. The notion of backdoor provides a tool for analyzing the performance of current solvers. Initial goal: explain why current state-of-the-art backtrack search solvers with restarts – in particular SAT solvers – perform so well on many real-world instances. What is it about real-world instances that makes them amenable to tractable behavior by state-of-the-art solvers? (Note: our definition uses the notion of a polytime sub-solver, encompassing CSP, SAT and MIP solvers.)

94 Defining a sub-solver. A sub-solver A is an algorithm that, given a problem instance C, satisfies: (i) trichotomy – A either rejects C or determines it correctly (returning a solution or concluding inconsistency); (ii) efficiency – A runs in polynomial time; (iii) trivial solvability – A can determine whether C is trivially satisfiable or trivially unsatisfiable; (iv) self-reducibility – if A determines C, then A also determines C with any one variable set to a value.

95 Note on the Definition of Sub-solver. The definition is general enough to encompass any polynomial-time propagation method used by state-of-the-art solvers: unit propagation, arc consistency, ALLDIFF, linear programming, …, any polynomial-time solver. The definition is also general enough to include polytime solvers for which there is no clean syntactic characterization of the tractable subclass. It applies to CSP, SAT, MIP, etc.

96 Defining backdoors. Given a combinatorial problem C and a polytime sub-solver A: Backdoors (for satisfiable instances) – a subset B of the variables is a backdoor if, for some assignment to the variables in B, the sub-solver A returns a satisfying assignment of the simplified instance. Strong backdoors (apply to satisfiable or inconsistent instances) – B is a strong backdoor if, for every assignment to the variables in B, the sub-solver A either returns a satisfying assignment or concludes the inconsistency of the simplified instance.
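A brute-force check of the strong-backdoor definition against a toy sub-solver (unit propagation); the CNF encoding and helper names are illustrative, and the example formula is the one from slide 9:

```python
from itertools import product

def unit_propagate(clauses):
    """Toy sub-solver: unit propagation over CNF clauses (lists of signed
    ints). Returns 'sat', 'unsat', or None when it cannot decide."""
    clauses = [list(c) for c in clauses]
    while True:
        if any(len(c) == 0 for c in clauses):
            return "unsat"               # an empty clause: contradiction
        if not clauses:
            return "sat"                 # all clauses satisfied
        units = [c[0] for c in clauses if len(c) == 1]
        if not units:
            return None                  # stuck: reject the instance
        lit = units[0]
        clauses = [[l for l in c if l != -lit]
                   for c in clauses if lit not in c]

def assign(clauses, assignment):
    """Simplify CNF under a partial assignment {var: True/False}."""
    out = []
    for c in clauses:
        if any((l > 0) == assignment[abs(l)] for l in c if abs(l) in assignment):
            continue                      # clause already satisfied
        out.append([l for l in c if abs(l) not in assignment])
    return out

def is_strong_backdoor(cnf, candidate):
    """`candidate` is a strong backdoor iff EVERY assignment to it lets
    the sub-solver decide the simplified formula (sat or unsat)."""
    return all(
        unit_propagate(assign(cnf, dict(zip(candidate, bits)))) is not None
        for bits in product([False, True], repeat=len(candidate)))

# (a OR NOT b OR NOT c) AND (b OR NOT c) AND (a OR c), with 1=a, 2=b, 3=c
cnf = [[1, -2, -3], [2, -3], [1, 3]]
print(is_strong_backdoor(cnf, [3]))  # True: fixing c always decides it
print(is_strong_backdoor(cnf, [2]))  # False: propagation can get stuck
```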

97 Backdoors vs. Cutsets and Related Concepts

98 Cutsets, Induced Width, and Related Concepts. A rich body of work provides elegant sufficient conditions for backtrack-free search algorithms, based on the topology of the underlying constraint graph and the level of consistency performed. Tractability by restricted structure – identification of tractable classes of constraint satisfaction problems based solely on the structure of the constraint graph, independently of the semantics of the constraints. Key graph concept: induced width (Dechter 2003).

99 Example: Cycle-cutset. Given an undirected graph, a cycle cutset is a subset of nodes whose removal results in a graph without cycles. Once the cycle-cutset variables are instantiated, the remaining problem is a tree, solvable in polynomial time using arc consistency. A CSP whose constraint graph has a cycle-cutset of size c can be solved in time O((n − c)·k^(c+2)). Important: verifying that a set of nodes is a cutset (or a w-cutset) can be done in polynomial time in the number of nodes (Dechter 93); see the sketch below.
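A sketch of that polynomial-time verification: remove the candidate set and check that no cycle survives, via DFS back-edge detection. The edge-list representation is an assumption for illustration:

```python
def is_cycle_cutset(edges, cutset):
    """Polynomial-time check that removing `cutset` leaves an acyclic
    graph (a forest), via DFS back-edge detection on the remaining nodes."""
    cut = set(cutset)
    adj = {}
    for u, v in edges:
        if u in cut or v in cut:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    seen = set()
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, None)]
        while stack:
            node, parent = stack.pop()
            for nxt in adj[node]:
                if nxt == parent:
                    continue
                if nxt in seen:
                    return False          # a back edge: a cycle survives
                seen.add(nxt)
                stack.append((nxt, node))
    return True

# Triangle a-b-c plus pendant edge c-d: any triangle node is a cutset
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
print(is_cycle_cutset(edges, {"a"}))  # True
print(is_cycle_cutset(edges, set()))  # False: the triangle remains
```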

100 [Figure: a constraint graph with a cutset variable B highlighted. A clique of size k has a cutset of size k − 2.]

101 Backdoors can be viewed as a generalization of cutsets. Backdoors use a general notion of tractability based on a polytime sub-solver – they do not require a syntactic characterization of tractability. Backdoors factor in the semantics of the constraints w.r.t. the sub-solver and the values of the variables. Backdoors apply to different representations, including different semantics for graphs, e.g. network flows – CSP, SAT, MIP, etc. Note: cutsets and w-cutsets (Dechter 93) base tractability solely on the structure of the constraint graph, independently of the semantics of the constraints.

102 Backdoors --- “seeing is believing” Logistics_b.cnf planning formula. 843 vars, 7,301 clauses, approx min backdoor 16

103 Logistics.b.cnf after setting 5 backdoor vars (result after propagation; large cutsets);

104 After setting just 12 (out of 800+) backdoor vars – problem almost solved.

105 Inductive inference problem --- ii16a1.cnf. 1650 vars, 19,368 clauses. Backdoor size 40.

106 After setting 6 backdoor vars.

107 After setting 38 (out of 1600+) backdoor vars (some other intermediate stages shown): so, real-world structure is hidden in the network. Related to small-world networks etc.

108 Backdoors: Generalization of cutsets and w-cutsets (binary CSPs). A cutset is a backdoor set (with respect to arc consistency). Is a backdoor set always a cutset? No – backdoor sets can be significantly smaller than cutsets. [Figure, Ex. 1: a cutset variable vs. a backdoor set which is not a cutset.] Key aspect – backdoors factor in the propagation triggered by the polytime sub-solver.

109 Backdoors: Generalization of cutsets and w-cutsets (binary CSPs). Example 2: a Horn theory with n variables has a backdoor of size 0 (unit propagation), while its cutset can be of size n − 2. Key aspect – the notion of backdoor factors in the values of the variables and the semantics of the constraints (via the propagation triggered by the polytime sub-solver).

110 Backdoors: How the concept came about. The notion came about from an abstract formal model built to explain the high variance in performance of current state-of-the-art solvers – in particular heavy-tailed behavior – and from our quest to understand the behavior of real solvers (propagation mechanisms, "sub-solvers", are key). The emphasis is not so much on proving that a set of variables is a backdoor (or that it is easy to find), but rather on the fact that if we have a (small) set of variables that is a backdoor set, then, once those variables are assigned values, the polytime sub-solver will solve the resulting formula in polytime. Surprisingly, real solvers are very good at finding small backdoors!

111 Backdoors: Quick detection of inconsistencies. In logical reasoning, the ability to detect global inconsistency based on local information is very important, in particular in backtrack search (a global solution). Tractable substructure helps in recognizing global inconsistency quickly – backdoors exploit the existence of sub-structures that are sufficient to prove global inconsistency. How does this help in solving SAT instances? By combining it with backtrack search: as we start setting variables, the sub-solver quickly recognizes inconsistencies and backtracks.

112 Formal Models: On the connections between backdoors and heavy-tailedness

113 Fat- and heavy-tailed distributions explain very long runs of complete solvers; but they also imply the existence of a wide range of solution times, often from very short runs to very long ones. How to explain the short runs? Backdoors.

114 Formal Model Yielding Heavy-Tailed Behavior (Gomes 01; Chen, Gomes, and Selman 01; inspired by work in information theory, Berlekamp et al. 1972). T – the number of leaf nodes visited up to and including the successful node; b – branching factor, here b = 2; p – the probability of not finding the backdoor at a branching decision; there is 1 backdoor. Trade-off: exponential decay in the probability of making i wrong branching decisions vs. exponential growth in the cost of the mistakes: P[T = b^i] = (1 − p)·p^i.

115 With p the probability of not finding the backdoor (b = 2): Expected run time E[T] = Σi b^i·(1 − p)·p^i, which is infinite when p ≥ 1/b (infinite expected time). Variance: infinite when p ≥ 1/b². Tail: P[T > x] ~ C·x^(−α) with α = log_b(1/p) (heavy-tailed).
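A simulation sketch of this model with b = 2, assuming the branching law P[T = b^i] = (1 − p)·p^i stated above; the sampler name and sample sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)

def model_cost(p, b=2.0):
    """Sample T from the formal model: i wrong branching decisions occur
    with probability (1 - p) * p**i and each mistake multiplies the cost
    by b, so P[T = b**i] = (1 - p) * p**i and P[T >= b**i] = p**i."""
    i = rng.geometric(1.0 - p) - 1      # geometric number of wrong branches
    return b ** i

for p in (0.3, 0.5, 0.7):               # tail exponent: alpha = log2(1/p)
    t = [model_cost(p) for _ in range(200_000)]
    print(p, np.median(t), np.mean(t))  # the mean blows up once p*b >= 1
```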

116 More than 1 backdoor (Williams, Gomes, Selman 03)

117 Backdoors provide a detailed formal model for heavy-tailed search behavior. We can formally relate the size of the backdoor and the strength of the heuristic (captured by its failure probability in identifying backdoor variables) to the occurrence of heavy tails in backtrack search.

118 Backdoors in real-world problem instances

119 Backdoors can be surprisingly small. Backdoors explain how a solver can get "lucky" on certain runs: the backdoors are identified early on in the search.

120 Synthetic Planning Domains (Hoffmann, Gomes, Selman 2005). Synthetic domains are carefully crafted families of formulas: as simple as possible, enabling a full rigorous analysis, yet rich enough to provide insights into real-world domains. Research questions: the relationship between problem structure, the semantics of backdoors, backdoor size, and problem hardness.

121 Synthetic Planning Domains (Hoffmann, Gomes, Selman 2005). Three synthetic domains: Structured Pigeon Hole (SPH_n^k) – backdoor set O(n); Synthetic Logistics Map Domain (MAP_n^k) – backdoor set O(log n); Synthetic Blocks World (SBW_n^k) – backdoor set O(log n). Each family is characterized by a size parameter (n) and a structure parameter (k). Focus: DPLL with unit propagation; strong backdoors (for proving unsatisfiability).

122 Synthetic Logistics Map Domain (MAP_n^k). b and t – the goal locations in the bottom case and the top case respectively; k ∈ {1, 3, …, 2n−3}, with k increasing in steps of 2: the goal in branch 1 increases by two steps, and one of the other goals is skipped.

123 [Figure: constraint graphs for the bottom case (locations L1^0, L1^1, L2^1, …, Ln^1) and the top case (MAP_8^13, locations L1^0, L1^1, …, L1^13); one case has a backdoor set of size O(log n), the other of size O(n²). Cutset: Θ(n²); number of variables: O(n²).] Note: the topology of the constraint graphs is identical for both cases, and the size of the cutset is of the same order for both cases.

124 Synthetic Logistics Map Domain: Bottom Case (MAP_n^1). [Figure: locations L1^0, L1^1, L2^1, …, Ln^1.] (Hoffmann, Gomes, Selman 2005)

125 Synthetic Logistics Map Domain, Top Case (MAP_8^13). [Figure: chain of locations L1^0, L1^1, L1^2, …, L1^13.] (Hoffmann, Gomes, Selman 2005)

126 Semantics of Backdoors. Consider G, the set of goals in the planning problem, and define AsymRatio ∈ (0,1], a measure of how asymmetric the sub-goals' resource requirements are. Intuition – if there is a sub-goal that requires more resources than the other sub-goals, it is the main reason for unsatisfiability, and the larger the ratio, the easier it is to detect inconsistency. (Hoffmann, Gomes, Selman 2005)

127 [Figure (repeated): constraint graphs for the bottom case (locations L1^0, L1^1, L2^1, …, Ln^1) and the top case (MAP_8^13); one case has a backdoor set of size O(log n), the other of size O(n²). Cutset: Θ(n²); number of variables: O(n²).] Note: the topology of the constraint graphs is identical for both cases, and the size of the cutset is of the same order for both cases.

128 AsymRatio – "Rovers" Domain (a simplified version of a NASA space application). As AsymRatio increases, the hardness decreases (conjecture: smaller backdoors). Similar results for other domains: Depots, Driverlog, Freecell, Zenotravel.

129 MAP-6-7.cnf infeasible planning instance. Strong backdoor of size 3. 392 vars, 2,578 clauses.

130 After setting 2 (out of 392) backdoor vars. --- complexity almost gone…

131 Map 5 Top: running without backdoor

132 Map 5 Top: running with “a” backdoor (size 9 – not minimum)

133 Map 5 Top: running with the minimum backdoor (size 3). [Figure sequence: initial graph; graph after setting 1 backdoor variable; graph after setting 2 backdoor variables – in this graph a single variable is enough to prove inconsistency (with unit propagation); after setting three backdoor variables.]

134 Map 5 Top: running with the minimum backdoor (size 3). [Figure sequence: initial graph; after setting one backdoor variable; after setting two backdoor variables; after setting three backdoor variables.]

135 Exploiting Backdoors Williams, Gomes, Selman 03/04

136 Algorithms. We cover three kinds of strategies for dealing with backdoors: a complete deterministic algorithm; a complete randomized algorithm, with provably better performance than the deterministic one; and a heuristically guided complete randomized algorithm, which assumes the existence of a good heuristic for choosing variables to branch on – we believe this is close to what happens in practice.

137 Deterministic Generalized Iterative Deepening

138 Generalized Iterative Deepening. [Figure: all possible trees of depth 1 – x1 = 0 / x1 = 1, x2 = 0 / x2 = 1, (…), xn = 0 / xn = 1.]

139 Generalized Iterative Deepening, Level 2. [Figure: all possible trees of depth 2, e.g. x1 = 0 / x1 = 1 followed by x2 = 0 / x2 = 1.]

140 Generalized Iterative Deepening, Level 2 (cont.). [Figure: all possible trees of depth 2, ending with x(n−1) = 0 / x(n−1) = 1 followed by xn = 0 / xn = 1.] Then level 3, level 4, and so on …

141 Randomized Generalized Iterative Deepening. Assumption: there exists a backdoor whose size is bounded by a function of n (call it B(n)). Idea: repeatedly choose random subsets of variables that are slightly larger than B(n), searching these subsets for the backdoor (a sketch follows).
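A minimal sketch of this idea; `decides` is a hypothetical wrapper around the sub-solver that returns a verdict for a simplified instance (or None when it is stuck), and `bound`/`tries` are illustrative parameters:

```python
import random
from itertools import product

def randomized_search(variables, decides, bound, rng, tries=1_000):
    """Repeatedly pick a random subset slightly larger than the assumed
    backdoor bound B(n) and exhaustively try every assignment to it;
    succeed when the sub-solver decides the instance for all of them."""
    size = min(len(variables), bound + 1)   # "slightly larger" than B(n)
    for _ in range(tries):
        subset = rng.sample(variables, size)
        if all(decides(dict(zip(subset, bits))) is not None
               for bits in product([0, 1], repeat=size)):
            return subset                   # a strong backdoor was found
    return None
```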

142 Deterministic Versus Randomized. Suppose variables have 2 possible values (e.g. SAT). [Table: runtimes of the deterministic vs. the randomized strategy as a function of k; for B(n) = n/k the runtime is c^n for a constant c < 2 that depends on k.] The deterministic algorithm outperforms brute-force search for k > 4.2.

143 Complete Randomized Depth-First Search with Heuristic. Assume we have the following: DFS, a generic depth-first randomized backtrack search solver with a (polytime) sub-solver A, and a heuristic H that (randomly) chooses variables to branch on in polynomial time, where H has probability 1/h of choosing a backdoor variable (h is a fixed constant). Call this ensemble (DFS, H, A).

144 Polytime Restart Strategy for (DFS, H, A) Essentially: If there is a small backdoor, then (DFS, H, A) has a restart strategy that runs in polytime.

145 Runtime Table for the Algorithms (DFS, H, A). B(n) = upper bound on the size of a backdoor, given n variables. When the backdoor is a constant fraction of n, there is an exponential improvement of the randomized over the deterministic algorithm.

146 Exploiting Structure using Randomization: Summary. Over the past few years, randomization has become a powerful tool to boost the performance of complete (exact) solvers. It is a very exciting new research area with success stories – e.g., state-of-the-art complete SAT solvers use randomization, which is very effective when combined with no-good learning.

147 Exploiting Randomization in Backtrack Search: Summary. Stochastic search methods (complete and incomplete) have been shown to be very effective. Restart strategies and portfolio approaches can lead to substantial improvements in the expected runtime and variance, especially in the presence of fat- and heavy-tailed phenomena – a way of taking advantage of backdoors and tractable sub-structure. Randomization is therefore a tool to improve algorithmic performance and robustness.

148 Summary. Research question: should we consider dramatically different algorithm design strategies, leading to highly asymmetric distributions with a good chance of short runs (even if that also means a good chance of long runs), which can be effectively exploited with restarts?

149 Summary. The notion of a "backdoor" set of variables captures the combinatorics of a problem instance as dealt with in practice, and provides insight into restart strategies. Backdoors can be surprisingly small in practice, and search heuristics plus randomization can be used to find them, provably efficiently. Research issue: understanding the semantics of backdoors.

150 Scientific Use of Experimentation: Take Home Message. This talk described how scientific experimentation applied to the study of constrained problems has led to the discovery and understanding of interesting computational phenomena, which in turn allowed us to design better algorithms. It is unlikely that we would have discovered such phenomena by pure mathematical thinking / modeling. Take-home message: in order to understand real-world constrained problems and to scale up solutions, principled experimentation plays a role as important as formal models – the empirical study of phenomena is a sine qua non for the advancement of the field.

151 The End ! www.cs.cornell.edu/gomes

