Short XORs for Model Counting: From Theory to Practice Carla P. Gomes, Joerg Hoffmann, Ashish Sabharwal, Bart Selman Cornell University & Univ. of Innsbruck.

Presentation transcript:

Short XORs for Model Counting: From Theory to Practice Carla P. Gomes, Joerg Hoffmann, Ashish Sabharwal, Bart Selman Cornell University & Univ. of Innsbruck SAT Conference, 2007 Lisbon, Portugal

May 28, 2007 SAT
Introduction. Problems of interest: 1. Model counting (#SAT). 2. Near-uniform sampling of solutions. #P-hard problems, much harder than SAT. DPLL / local search / conversion to normal forms available but don't scale very well. Applications to probabilistic reasoning, etc. A promising new solution approach: XOR-streamlining. MBound for model counting [Gomes-Sabharwal-Selman AAAI'06]. XorSample for sampling [Gomes-Sabharwal-Selman NIPS'06].

XOR-Based Counting / Sampling. A relatively simple algorithm: Step 1. Add s uniform random "xor"/parity constraints to F. Step 2. Solve with any off-the-shelf SAT solver. Step 3. Deduce bounds on the model count of F (MBound), or output a solution sample of F (XorSample). Can boost results further by using exact model counters. Surprisingly good results! Counting: can solve several challenging combinatorial problems previously out of reach. Sampling: much more uniform samples, fast.
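Steps 1 and 2 can be sketched on a toy formula. This is a minimal illustration, not the MBound or XorSample implementation: brute-force enumeration stands in for the off-the-shelf SAT solver, the clause list is an invented example, and all function names (`models`, `uniform_random_xor`, `residual_models`) are hypothetical. Step 3 (turning repeated runs into a bound with a correctness guarantee) is deliberately left out because the slides do not state the bound.

```python
import itertools
import random

def models(n_vars, clauses):
    """Enumerate all satisfying assignments of a CNF by brute force.

    Clauses use DIMACS-style signed integers: 3 means x3, -3 means NOT x3.
    This stands in for a real SAT solver and only works for tiny formulas.
    """
    result = []
    for bits in itertools.product([False, True], repeat=n_vars):
        if all(any(bits[abs(l) - 1] == (l > 0) for l in clause) for clause in clauses):
            result.append(bits)
    return result

def uniform_random_xor(n_vars, rng):
    """Include each variable with probability 1/2 and pick the parity uniformly."""
    chosen = [v for v in range(n_vars) if rng.random() < 0.5]
    return chosen, rng.randrange(2)  # parity 1 = odd, 0 = even

def satisfies(bits, xor):
    """True iff the number of chosen variables set to True has the required parity."""
    chosen, parity = xor
    return sum(bits[v] for v in chosen) % 2 == parity

def residual_models(sols, n_vars, s, rng):
    """Step 1 + Step 2: add s random XORs and keep the solutions that survive."""
    xors = [uniform_random_xor(n_vars, rng) for _ in range(s)]
    return [b for b in sols if all(satisfies(b, x) for x in xors)]

# Invented toy formula over 4 variables: (x1 or x2) and (not x1 or x3).
rng = random.Random(2007)
sols = models(4, [[1, 2], [-1, 3]])
for s in range(4):
    survivors = residual_models(sols, 4, s, rng)
    print(f"s={s}: {len(survivors)} of {len(sols)} solutions survive")
```

Each added XOR removes about half of the solutions on average, which is what lets the satisfiability of the streamlined formula carry information about the model count.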

XOR-Based Counting / Sampling: Key Features. Quick estimates with provable correctness guarantees. SAT solvers without modification for counting/sampling. XOR / parity constraints: e.g. $a \oplus b \oplus d \oplus g = \text{odd}$ is satisfied iff an odd number of a, b, d, g are True. Random XOR constraints of length k for formula F: choose k variables of F uniformly at random; choose even/odd parity uniformly at random. Focus of this work: What effect does length k have?
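The construction of a random length-k XOR and its satisfaction test can be written down directly. The sketch below assumes variables are indexed 0..n-1 and represents an XOR as a pair (variable list, odd flag); both the representation and the function names are illustrative choices, not taken from the slides.

```python
import random

def random_xor_of_length(n_vars, k, rng):
    """Choose k distinct variables uniformly at random, then a uniform parity."""
    variables = sorted(rng.sample(range(n_vars), k))
    odd = rng.random() < 0.5
    return variables, odd

def xor_holds(assignment, xor):
    """An 'odd' XOR holds iff an odd number of its variables are True."""
    variables, odd = xor
    true_count = sum(1 for v in variables if assignment[v])
    return (true_count % 2 == 1) == odd

# The slide's example a XOR b XOR d XOR g = odd, with a..g mapped to indices 0..6.
example = ([0, 1, 3, 6], True)
only_a_true = [True, False, False, False, False, False, False]
print(xor_holds(only_a_true, example))

print(random_xor_of_length(7, 4, random.Random(0)))
```

Because the parity is chosen uniformly, any fixed assignment satisfies a random XOR with probability 1/2 regardless of k; the length k only affects how the survival of different solutions is correlated.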

Long vs. Short XORs: The Theory. Full-length XORs (half the number of vars of F): provide provably accurate counts and near-uniform samples (both lower bounds and upper bounds for model counting); limited use: often do not propagate well in SAT solvers. Short XORs (length ~ 8-20): consistently much faster than full-length XORs; provide provably correct lower bounds but no upper bounds / good samples; quality issue: in principle, can yield very poor lower bounds. Nevertheless, can short XORs provide good results in practice? [AAAI'06, NIPS'06]

Main Results. Empirical study demonstrating that short XORs are often surprisingly good on structured problem instances. Evidence based on the fundamental factor determining quality: the variance of the random process. Variance drops drastically in XOR length range ~ 1-5. Initial results: required XOR length related to (local and global) "backbone" size.

What Makes Short XORs Different? Key difference: variance of the residual model count. Consider formula F with $2^{s^*}$ solutions. Add s XORs. Let X = residual model count. Same expectation in both cases: $E[X] = 2^{s^*-s}$. Variance for full XORs: provably low ($\mathrm{Var}[X] \le E[X]$). Reason: pairwise independence of full-length XORs. Variance for short XORs: can be quite high. No pairwise independence. Is variance truly very high in structured formulas in practice?
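The expectation and the variance bound follow from a short standard calculation, sketched here. The indicator notation $Y_\sigma$ and the solution set $S$ are introduced for this sketch and do not appear in the slides. For each solution $\sigma \in S$, with $|S| = 2^{s^*}$, let $Y_\sigma = 1$ if $\sigma$ satisfies all $s$ random XORs and $0$ otherwise. Since each parity is chosen uniformly, $\Pr[Y_\sigma = 1] = 2^{-s}$ for XORs of any length, so

$$
E[X] = \sum_{\sigma \in S} E[Y_\sigma] = 2^{s^*} \cdot 2^{-s} = 2^{s^*-s}.
$$

For the variance,

$$
\mathrm{Var}[X] = \sum_{\sigma \in S} \mathrm{Var}[Y_\sigma] + \sum_{\sigma \ne \tau} \mathrm{Cov}(Y_\sigma, Y_\tau).
$$

With full-length XORs the $Y_\sigma$ are pairwise independent, every covariance term vanishes, and

$$
\mathrm{Var}[X] = 2^{s^*} \cdot 2^{-s}\,(1 - 2^{-s}) \le 2^{s^*-s} = E[X].
$$

With short XORs the covariance terms need not vanish and can be positive, which is where the extra variance comes from.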

Experimental Setup. Goal: evaluate how well short XORs behave for model counting and sampling. Comparison with the ideal case: full-length XORs. Short XORs clearly favorable w.r.t. time. Our comparison: w.r.t. quality of counts and samples. Evaluation object: variance of the residual count. Directly determines the quality. Compared with the ideal variance (the "ideal curve") computed analytically. 1, ,000 instances per data point.

The Quantity Measured. Let X = residual model count after adding s XORs. Must normalize X and s appropriately to compare across formulas F with different #vars and #solns! Two tricks to make comparison meaningful: 1. Plot variance of the normalized count $X' = X / 2^{s^*-s}$, so that $E[X'] = 1$ for every F. 2. Use $s = s^* - c$ for some constant c, so that Var[X'] approaches the same ideal value for every F. c: constant number of remaining XORs.

Experiments: The Ideal Curve. When variance of X' is plotted with c remaining XORs, can prove analytically that ideal-Var = $2^{-c}$ for full-length XORs. We plot sample standard deviation rather than variance: ideal-s.s.d. = $\sqrt{2^{-c}}$. For what XOR length does s.s.d.[X'] approach ideal-s.s.d.?
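The measurement itself can be mimicked on a toy solution set. The sketch below is not the experimental code behind the slides: the solution set is an arbitrary random list of 64 assignments over 10 variables (so s* = 6), brute-force counting replaces the solver, and the names `normalized_ssd` and `ideal_ssd` are hypothetical. The normalized count is taken as X' = X / E[X] with E[X] = 2^(s* - s).

```python
import math
import random
import statistics

def random_xor(n_vars, k, rng):
    """Length-k XOR: k distinct variables plus a uniformly random parity bit."""
    return rng.sample(range(n_vars), k), rng.randrange(2)

def survives(bits, xors):
    """True iff the assignment satisfies every XOR in the list."""
    return all(sum(bits[v] for v in vs) % 2 == parity for vs, parity in xors)

def normalized_ssd(solutions, n_vars, s, k, trials, rng):
    """Sample standard deviation of X' = X / E[X] over repeated random XOR draws."""
    expected = len(solutions) / 2 ** s  # E[X] = |S| / 2^s
    samples = []
    for _ in range(trials):
        xors = [random_xor(n_vars, k, rng) for _ in range(s)]
        x = sum(survives(bits, xors) for bits in solutions)
        samples.append(x / expected)
    return statistics.stdev(samples)

def ideal_ssd(c):
    """Ideal curve for c remaining XORs: sqrt(2^-c)."""
    return math.sqrt(2.0 ** -c)

rng = random.Random(0)
n_vars, s_star, s = 10, 6, 3          # c = s_star - s = 3 remaining XORs
solutions = [tuple(rng.random() < 0.5 for _ in range(n_vars))
             for _ in range(2 ** s_star)]
print("ideal s.s.d.:", ideal_ssd(s_star - s))
for k in (1, 2, 3, 5):                # 5 = half the variables ("full length")
    print(k, normalized_ssd(solutions, n_vars, s, k, 500, rng))
```

On an unstructured random set like this one the measured values only illustrate the procedure; the slides' findings concern structured formulas, which this toy does not reproduce.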

Latin Square Formulas, Order 6 (quasi-group with holes). Setting: 3 remaining XORs. Sample standard deviation initially decreases rapidly. XOR lengths 5-7: already quite close to the ideal curve. Not much change in s.s.d. after a while: medium-size XORs don't pay off well.

Latin Square Formulas, Order. Setting: 3 remaining XORs. Similar behavior as Latin sq. of order 6. Even lower variance! Instances with more solutions have larger variance.

Logistics Planning Instance. Setting: 352 variables, $2^{19}$ solutions, 9 remaining XORs, ideal XOR length 151. Variance drops sharply till XOR length 25. XOR lengths : very close to ideal behavior.

Circuit Synthesis Problem. Setting: 252 variables, $2^{97}$ solutions, 10 remaining XORs, ideal XOR length 126. Standard deviation quite high initially, but drops dramatically till length 7-8. Quite close to ideal curve at length 10.

Random 3-CNF Formulas. Setting: 100 variables, 7 remaining XORs, ideal XOR length 50. XORs don't behave as well as in structured instances: e.g. formulas at ratio 4.2 need length 40+ (ideal: 50). Surprisingly, short XORs are better at lower ratios! Recall: model counting observed to be harder at lower ratios.

Understanding Short XORs. What is it that makes short XORs work / not work well? Backbone of the solutions provides some insight. Intuitively: Large backbone ⇒ short XOR often involves only backbone variables ⇒ all or no solutions survive ⇒ high variance. Small backbone or split (local) backbones ⇒ XOR involves non-backbone variables ⇒ some solutions survive no matter what ⇒ lower variance.
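The backbone intuition can be made exact in a deliberately simplified setting. Assume (this is a toy model, not the formulas used in the experiments) that the first `backbone` variables are fixed to True, the rest are completely free, and a single XOR of length k is added. If the XOR touches only backbone variables, all or no solutions survive; otherwise exactly half survive. With X' = X / E[X], this gives Var[X'] = C(backbone, k) / C(n, k). The function names are hypothetical, and k is assumed to be at most `n_vars`.

```python
import itertools
import math

def all_backbone_probability(n_vars, backbone, k):
    """Probability that a random length-k XOR involves only backbone variables."""
    return math.comb(backbone, k) / math.comb(n_vars, k)

def single_xor_sd(n_vars, backbone, k):
    """Standard deviation of X' after one length-k XOR in the toy backbone model."""
    return math.sqrt(all_backbone_probability(n_vars, backbone, k))

def brute_force_variance(n_vars, backbone, k):
    """Exact Var[X'] by enumerating every variable choice and both parities."""
    free = n_vars - backbone
    solutions = [(True,) * backbone + rest
                 for rest in itertools.product([False, True], repeat=free)]
    expected = len(solutions) / 2       # one XOR halves the count on average
    total, cases = 0.0, 0
    for chosen in itertools.combinations(range(n_vars), k):
        for parity in (0, 1):
            x = sum(sum(sol[v] for v in chosen) % 2 == parity for sol in solutions)
            total += (x / expected - 1.0) ** 2   # E[X'] = 1
            cases += 1
    return total / cases

# Cross-check the closed form against enumeration on a tiny instance.
print(brute_force_variance(6, 3, 2), all_backbone_probability(6, 3, 2))

# 50 variables: the deviation shrinks as the backbone shrinks or the XOR grows.
for backbone in (40, 25, 10):
    print(backbone, [round(single_xor_sd(50, backbone, k), 3) for k in (1, 2, 4, 8)])
```

In this toy model the deviation is driven entirely by the chance that an XOR falls inside the backbone, which matches the slide's qualitative claim that smaller backbones let shorter XORs behave well.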

Fixed-Backbone Formulas. Setting: 50 variables, 10 remaining XORs, ideal XOR length 25. As backbone size decreases, shorter and shorter XORs begin to perform well.

Interleaved-Backbone Formulas. Setting: 50 variables, 10 remaining XORs, ideal XOR length 25. As the backbone is split into more and more interleaved backbone clusters, shorter and shorter XORs begin to work well.

Summary. Short XORs can perform surprisingly well in practice for model counting and sampling. Variance reduces dramatically at low XOR lengths: increasing XOR length pays off quite well initially but not so much later. Variance relates to solution backbones. Slides available at the SAT-07 poster session on Thursday!