Published by Augustus Harper
Modified about 1 year ago
BIRS Workshop, Banff, Canada, Jan 22, 2014. © 2014 IBM Corporation
Resolution and Parallelizability: Barriers to the Efficient Parallelization of SAT Solvers
George Katsirelos, MIAT, INRA, Toulouse, France
Ashish Sabharwal, IBM Watson, USA
Horst Samulowitz, IBM Watson, USA
Laurent Simon, Univ. Paris-Sud, LRI/CNRS, Orsay, France
[published at AAAI-2013]
Resolution and Parallelizability © 2014 IBM Corporation
Trend Towards Parallelization
Focus Shifting From Single-Thread Performance to Multi-Processor Performance
– 100s and even 1000s of compute cores easily accessible
– Classical Algorithm Parallelization, e.g., parallel sort, shortest path, PRAM model, AC circuits
– Significant Advances in Data Parallelism, e.g., MapReduce, Hadoop, SystemML, R statistics
Challenge: Search and Optimization on 1000s of Processors
– Tremendous advances in the Sequential case of Combinatorial Search, e.g., SAT solvers can tackle instances with ~2M variables and 10M constraints!
– Exponential search appears to be an “obvious” candidate to parallelize!
– In fact, many SAT/CSP/MIP solvers already support multi-core and multi-machine runs
Banff Workshop on SAT, 2014 | Katsirelos, Samulowitz, Sabharwal, Simon
Parallelization of Combinatorial Search
Fact: State-of-the-Art Search Engines Do NOT Parallelize Well
– Brute-force exponential search is, of course, trivial to parallelize
– But sophisticated search engines that adapt (e.g., through clause learning, variable impact aggregation, etc.) have inherently sequential aspects
– Modern SAT/MIP/“adapting”-CP solvers do not parallelize well (supporting data: next slide)
AAAI 2012 Challenge Paper on the topic [Hamadi & Wintersteiger 2012]
– P-completeness of Unit Propagation is a key barrier (solvers spend ~80% of their time on Unit Propagation, and we don’t know how to parallelize P well)
– Our result: barriers exist even if Unit Propagation came for free!
Parallelization of Combinatorial Search: SAT
Rather Disappointing Performance at SAT Competitions, e.g., in 2011:
– Average speedup on 8 cores only ~1.8x, on 32 cores only ~3x
– Top performing parallel solvers were based on little to no communication (CryptoMinisat-MT [Soos 2012], Plingeling [Biere 2012])
– Winners were “simple” Portfolio solvers (ppfolio [Roussel], pfolioUZK [Wotzlaw et al])
Plingeling-ats-587 [Dec 2013]
– Single machine with 128 cores and 128 GB memory
– Benchmark set used in this work, restricted to the 142 instances solved by 1 core in [10,5000) seconds
What makes parallelization of SAT solvers hard?
Can we obtain insights into their behavior beyond eventual wall-clock performance?
Contributions of the Work
A New Systematic Study of Parallelism in the Context of Search through the Lens of Proof Complexity
– Focus on understanding rather than on engineering
– Are there inherent bottlenecks that may hinder parallelization, irrespective of which heuristics are used to share information?
1. A Practical Study: Interesting Properties of Actual Proofs
– Proofs generated by state-of-the-art SAT solvers contain narrow bottlenecks
2. Proof-Based Measures that Capture Best-Case Parallelizability
– Coarse measure: “Depth” of the proof graph
– Refined measure: Makespan of a resource constrained scheduling problem
3. Empirical Findings: Correlations and Parallelization Limits
– Typical sequential proofs are not very parallelizable even in the best case!
– “Schedule speedup” / makespan correlates with observed speedup
Approach: Proof Complexity (applied here to Typically Generated Proofs)
Proof Complexity [Cook & Reckhow, 1979]: Study of the nature (e.g., size, width, space, depth, “shape”, etc.) of Proofs of Unsatisfiability
– Resolution Graph of Conflict-Directed-Clause-Learning (CDCL) SAT Solvers
– $\mathrm{Runtime}(\text{any SAT solver}, F) \ge \min_{\text{proofs}} \mathrm{Size}(\text{Resolution proof of } F)$
– Note: Insights are applicable also to Satisfiable instances! Solvers prove a lot of sub-formulas to be unsatisfiable before hitting the first solution. Formal characterization: [Achlioptas et al, 2001 & 2004]
Study of Proofs has provided strong insights into CDCL SAT solvers (worst case / best case results)
– What does “clause learning” bring?
– What do “restarts” add?
[Beame et al, 2004; Buss et al, 2008, 2012; Hertel et al, 2008; Pipatsrisawat et al, 2011]
Underlying Inference Principle: Resolution
CDCL SAT solvers produce Resolution Derivations
Proof Graph and Depth:
– Each initial and derived constraint is a node, annotated with its proof depth
– $\mathrm{proofdepth}(C) = 0$ for an initial clause $C$
– $\mathrm{proofdepth}(C) = 1 + \max_{C' \in \mathrm{parents}(C)} \mathrm{proofdepth}(C')$ for a derived clause $C$
[Figure: proof graph of a formula F, each node labeled with its constraint ID and depth. Initial clauses C1 to C6: depth 0. Derived clauses: C7: 1, C8: 2, C9: 1, C10: 3, C11: 2, C12: 3, C13: 4.]
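The depth rule above can be sketched in a few lines of Python. The parent edges below are hypothetical: the slide's figure survives only as node labels and depths, so the edges were chosen purely so that the computed depths reproduce those labels (C7: 1, C8: 2, C9: 1, C10: 3, C11: 2, C12: 3, C13: 4). The names `PARENTS` and `proof_depths` are illustrative and do not come from the talk.

```python
# Hypothetical proof graph: clause -> clauses it was resolved from.
# Only the node labels and depths come from the slide; the edges are assumed.
PARENTS = {
    "C1": (), "C2": (), "C3": (), "C4": (), "C5": (), "C6": (),
    "C7": ("C1", "C2"),
    "C8": ("C7", "C3"),
    "C9": ("C4", "C5"),
    "C10": ("C8", "C9"),
    "C11": ("C9", "C6"),
    "C12": ("C11", "C8"),
    "C13": ("C10", "C12"),
}


def proof_depths(parents):
    """Return {clause: proofdepth}: 0 for initial clauses, 1 + max over parents otherwise."""
    depth = {}

    def visit(clause):
        if clause not in depth:
            ps = parents[clause]
            depth[clause] = 1 + max(visit(p) for p in ps) if ps else 0
        return depth[clause]

    for clause in parents:
        visit(clause)
    return depth


if __name__ == "__main__":
    for clause, d in proof_depths(PARENTS).items():
        print(clause, d)
```

The recursion is fine for a toy graph; a proof with millions of clauses would need an iterative traversal in derivation order instead.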
How Parallelizable are Resolution Refutations?
Refutation(F) = Resolution Proof that derives the empty (“false”) clause
Depth of the proof clearly limits the amount of potential parallelization
– Chain of dependencies
– Theorem**: All Resolution Proof Graphs of certain “pebbling” style instances have large depth; this also holds for all Conflict Resolution Graphs (XOR substitution trick)
However, the proofdepth bound on parallelization is very crude
– Does not explain poor performance with small k (e.g., 8, 32, … processors)
What does a typical sequential SAT solver proof look like?
– Setup for Experiments: Sequential Glucose 2.1 extended with proof output; GluSatX10: using SatX10 to run a k-processor version of Sequential Glucose [IBM Teams: X10 and SAT/CSP]
– Working Assumption: Proofs produced by GluSatX10 on k cores look “similar” to proofs produced by Sequential Glucose
** simplified statements; see paper for more formal notions
Proof Graph Example: Very Complex Structure
[Figure: proof graph of an easy sequential case, solved in ~30 seconds]
Bottlenecks in Typical SAT Proofs
Proofs Generated by SAT Solvers Exhibit Surprisingly Narrow “Bottlenecks”, i.e., Depths with Very Few (~1) Clauses!
– Nothing deeper can be derived before the bottleneck clauses: Sequentiality
[Figure: x-axis: Depth in the proof; y-axis: Number of Clauses Derived at that Depth (log-scale)]
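A minimal sketch of how such bottlenecks could be located, assuming the proof depth of every derived clause is already known: count the clauses at each depth and report the depths whose count is at or below a threshold. The depth values, the threshold `max_width`, and the function names are hypothetical illustrations, not data or code from the study.

```python
from collections import Counter


def width_per_depth(depths):
    """Count how many derived clauses (depth > 0) sit at each proof depth."""
    return Counter(d for d in depths if d > 0)


def bottleneck_depths(depths, max_width=1):
    """Depths at which at most max_width clauses were derived."""
    widths = width_per_depth(depths)
    return sorted(d for d, n in widths.items() if n <= max_width)


if __name__ == "__main__":
    # Hypothetical depths of eleven derived clauses (not measured data).
    example = [1, 1, 1, 2, 3, 3, 3, 3, 4, 4, 5]
    print(dict(width_per_depth(example)))
    print(bottleneck_depths(example))
```

In this toy input, depth 2 holds a single clause, so everything at depth 3 and beyond has to wait for it, which is the sequentiality the slide describes.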
Best-Case Parallelization with k Processors
Given Proof P and k Processors, Best-Case Parallelization of P = Resource Constrained Scheduling Problem with Precedences
Let $M_k(P)$ = makespan of the optimal schedule of P on k processors
– Even approximating $M_k(P)$ within 4/3 is NP-hard, but a $(2 - 1/k)$ approximation is easy
Best-Case k-processor speedup on P: $S_k(P) = M_1(P) / M_k(P)$
[Figure: the example proof graph, nodes labeled with constraint ID and depth (C1 to C6: depth 0; C7: 1, C8: 2, C9: 1, C10: 3, C11: 2, C12: 3, C13: 4), with an additional clause C'9 at depth 1]
Example: $M_1(P) = 8$, $M_2(P) = 5$, $M_3(P) = 4$, $M_4(P) = 4$, …
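One standard way to get a schedule within a factor of (2 − 1/k) of optimal is Graham-style greedy list scheduling: never leave a processor idle while some clause is ready. The sketch below assumes every derived clause costs one time unit and initial clauses are free; the slides do not say which approximation the authors used, so this is an illustration rather than their method. The graph edges are hypothetical (the figure's edges are not recoverable), so the makespans differ from the slide's example values.

```python
# Hypothetical proof graph: clause -> parents. Edges are assumed, not from the slide.
PARENTS = {
    "C1": (), "C2": (), "C3": (), "C4": (), "C5": (), "C6": (),
    "C7": ("C1", "C2"),
    "C8": ("C7", "C3"),
    "C9": ("C4", "C5"),
    "C10": ("C8", "C9"),
    "C11": ("C9", "C6"),
    "C12": ("C11", "C8"),
    "C13": ("C10", "C12"),
}


def greedy_makespan(parents, k):
    """Unit-time steps a greedy list schedule needs on k processors."""
    done = {c for c, ps in parents.items() if not ps}  # initial clauses are free
    todo = set(parents) - done
    steps = 0
    while todo:
        # A clause is ready once all of its parents were derived in earlier steps.
        ready = sorted(c for c in todo if all(p in done for p in parents[c]))
        if not ready:
            raise ValueError("proof graph has a cycle or a missing parent")
        batch = ready[:k]  # run at most k clauses in this step
        done.update(batch)
        todo.difference_update(batch)
        steps += 1
    return steps


def greedy_speedup(parents, k):
    """Estimate of S_k(P) = M_1(P) / M_k(P) using the greedy schedule for M_k."""
    return greedy_makespan(parents, 1) / greedy_makespan(parents, k)


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, greedy_makespan(PARENTS, k), greedy_speedup(PARENTS, k))
```

Because the greedy makespan can exceed the optimal one, `greedy_speedup` may underestimate the true best-case speedup; on one processor the schedule is trivially optimal, since every derived clause takes one step.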
Makespan vs. Proof Depth
Schedule Makespan yields a finer grained lower bound, $S_k(P)$, on best-case parallelization than proof depth
– $\mathrm{proofdepth}(P)$: limit of parallelization of P with “infinite” processors
– $M_k(P) \ge \mathrm{proofdepth}(P)$
– $M_k(P) \to \mathrm{proofdepth}(P)$ as $k \to \infty$
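The relation between makespan and depth can be made concrete with the two elementary lower bounds on any k-processor schedule, assuming each derived clause costs one time unit: the schedule cannot be shorter than the proof depth, and it cannot be shorter than the work divided by k. The function names and the numbers in the usage lines (8 derived clauses, depth 4) are illustrative assumptions.

```python
from math import ceil


def makespan_lower_bound(num_derived, depth, k):
    """No k-processor schedule beats the critical path or the work / k bound."""
    return max(depth, ceil(num_derived / k))


def speedup_upper_bound(num_derived, depth, k):
    """Upper bound on S_k(P) = M_1(P) / M_k(P), with M_1(P) = num_derived."""
    return num_derived / makespan_lower_bound(num_derived, depth, k)


if __name__ == "__main__":
    # Illustrative numbers: 8 unit-cost derived clauses, proof depth 4.
    for k in (1, 2, 4, 1024):
        print(k, makespan_lower_bound(8, 4, k), speedup_upper_bound(8, 4, k))
```

Once k is large enough that the work term drops below the depth, adding processors no longer lowers the bound, which is the sense in which proof depth is the limit for “infinite” processors.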
Empirical Findings
Even Best-Case Parallelization Efficiency is Low Beyond 100 Processors
Best-Case Efficiency of parallelizing P with k processors = $100 \cdot S_k(P) / k$
E.g., 100% = full utilization of k processors, i.e., speedup = k
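As a worked instance of the efficiency formula, the sketch below plugs in the example makespans quoted earlier in the deck ($M_1 = 8$, $M_2 = 5$, $M_3 = 4$, $M_4 = 4$). The dictionary and function names are illustrative.

```python
# Example makespans M_k(P) from the deck's small proof-graph example.
EXAMPLE_MAKESPANS = {1: 8, 2: 5, 3: 4, 4: 4}


def speedup(makespans, k):
    """Best-case speedup S_k(P) = M_1(P) / M_k(P)."""
    return makespans[1] / makespans[k]


def efficiency(makespans, k):
    """Best-case efficiency in percent: 100 * S_k(P) / k."""
    return 100.0 * speedup(makespans, k) / k


if __name__ == "__main__":
    for k in sorted(EXAMPLE_MAKESPANS):
        s = speedup(EXAMPLE_MAKESPANS, k)
        e = efficiency(EXAMPLE_MAKESPANS, k)
        print(f"k={k}: speedup={s:.2f}, efficiency={e:.1f}%")
```

Even in this toy case the efficiency falls from 80% on two processors to 50% on four, because the makespan stops shrinking once it reaches 4.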
Proofs of Some Instances Exhibit Very Low Best-Case Schedule Speedup
A) Even with 1024 processors, best-case speedup ~
B) 128 processors insufficient to achieve a speedup of ~90
Best-Case Schedule Speedup Correlates With Actual Observed Runtime Speedup
(Makes the study of the best-case schedule speedup relevant)
[Figure: average over a sliding window]
Summary
A New Systematic Study of Parallelism in the Context of Search through the Lens of Proof Complexity
– Focus on understanding rather than on engineering
Main Findings:
A. Typical Sequential Refutations Contain Surprisingly Narrow Bottlenecks
B. Typical Sequential Refutations are Not Parallelizable Beyond a Few Processors, even in the best case of offline “schedule speedup” produced in hindsight
C. Observed Runtime Speedup with k processors weakly correlates with Best-Case Schedule Speedup of a Sequential Proof produced in hindsight
Open Question: Can we design SAT solvers that generate Proofs that are inherently More Parallelizable?
Caveat: this relies on the assumption that proofs generated by GluSatX10 on k cores look “similar” to proofs generated by Sequential Glucose