GridSAT Portal: A Grid Portal for Solving Satisfiability Problems
Wahid Chrabakh and Rich Wolski
University of California, Santa Barbara
Challenging Scientific Problems
• Computationally demanding:
  – Large compute power
  – Extended periods of time
• Infrastructure:
  – Desktops, clusters, supercomputers
• Common resource usage:
  – Most suitable for co-located nodes
  – Determine the number of nodes to use
  – Use all nodes until the termination criteria are reached
Satisfiability
• Example of dynamic resource use
• Application characteristics:
  – Branch-and-bound search
  – Unpredictable runtime behavior
  – Memory intensive: the internal database grows, overwhelming RAM
  – CPU intensive: 100% CPU load
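To make the branch-and-bound character concrete, here is a minimal DPLL-style recursive search, a sketch only: GridSAT itself is built on zChaff, which adds clause learning and sophisticated branching heuristics that this toy omits. Clauses are lists of DIMACS-style signed integers.

```python
# Minimal DPLL-style SAT search (illustrative sketch, not GridSAT's algorithm).
def dpll(clauses, assignment=None):
    """clauses: list of lists of ints (positive/negative literals)."""
    if assignment is None:
        assignment = {}
    # Simplify: drop satisfied clauses, remove falsified literals.
    simplified = []
    for clause in clauses:
        if any(assignment.get(abs(l)) == (l > 0) for l in clause):
            continue  # clause already satisfied by the partial assignment
        remaining = [l for l in clause if abs(l) not in assignment]
        if not remaining:
            return None  # empty clause: conflict, backtrack
        simplified.append(remaining)
    if not simplified:
        return assignment  # every clause satisfied: SAT
    # Branch on the first unassigned variable (no heuristics in this sketch).
    var = abs(simplified[0][0])
    for value in (True, False):
        result = dpll(simplified, {**assignment, var: value})
        if result is not None:
            return result
    return None  # both branches failed: UNSAT under this partial assignment

# (x1 or not x2) and (x2 or x3) and (not x1 or not x3)
print(dpll([[1, -2], [2, 3], [-1, -3]]))
```

The unpredictable runtime the slide mentions comes from exactly this structure: the search tree's size depends on the instance, not just on its length.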
Satisfiability Problem (SAT)
• Set of variables V = {v_i | i = 1, …, k}
• Literal: a variable or its complement
• Clause: OR of a set of literals
• Conjunctive Normal Form: F = C_1 ∧ C_2 ∧ C_3 ∧ … ∧ C_k
• Problems in CNF form: community standard
• Standard file format:
  c comments
  p cnf num_vars num_clauses
  +v1 −v2 … +v213 0
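The file layout above is the DIMACS CNF convention: `c` lines are comments, the `p cnf` line declares the variable and clause counts, and each clause is a list of signed integers terminated by `0`. A small reader can be sketched as follows (the function name and example instance are illustrative):

```python
# Sketch of a DIMACS CNF reader for the format described on the slide.
def read_dimacs(text):
    """Return (num_vars, clauses) parsed from a DIMACS CNF string."""
    num_vars = 0
    clauses, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('c'):
            continue  # 'c' lines are comments
        if line.startswith('p'):
            _, _, nv, _ = line.split()  # p cnf num_vars num_clauses
            num_vars = int(nv)
            continue
        for tok in line.split():
            lit = int(tok)
            if lit == 0:          # 0 terminates the current clause
                clauses.append(current)
                current = []
            else:
                current.append(lit)
    return num_vars, clauses

example = """c a toy instance
p cnf 3 2
1 -2 0
2 3 0
"""
print(read_dimacs(example))  # -> (3, [[1, -2], [2, 3]])
```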
Satisfiability Applications
• Circuit design
• FPGA routing
• Model checking:
  – AI, software
• Security
• Scheduling
• Theoretical:
  – Physics, chemistry, combinatorics
• Many more…
SAT Community
• Communities:
  – SATLive: news, forums, links, documents
  – SATEx: experimentation and execution system
  – SATLIB: dynamic set of benchmarks; freely available solvers
Who Uses SAT Live!
• Period: Sep – Jan 2003: 21,766 distinct hosts
• Jan 2003: 524 distinct hosts
• SATLIB: 250 hits/month
SAT Competition
• 55 sequential solvers, including: circus, circush0, cls, compsat, eqube2, forklift, funex, gasat, isat1, tts-2-0, unitwalk, walksatauto, walksatmp, walksatskc, werewolf, wllsatv1, zchaff, zchaff_rand
• Execution uses SAT-Ex
• Two rounds:
  – First round: easy problems
  – Second round: harder problems
• Awards to category leaders for SAT, UNSAT, and overall
• Challenging set: some problems left unsolved
Benchmarks
• Community-submitted benchmarks
• Crafted benchmark (38 MB):
  – Especially made to give the solver a hard time
• Random benchmark (11 MB)
• Industrial benchmark (2 GB):
  – REAL industrial instances from all domains
GridSAT: The Solver
• A parallel, distributed SAT solver for the Grid
• Based on zChaff, the leading sequential solver
• GridSAT beats zChaff on problems that zChaff can solve
• GridSAT solves problems that were not previously solved
GridSAT: Grid Aware
• Highly portable components
• Uses resources simultaneously:
  – Single nodes, clusters, supercomputers
  – Resources may leave and join at any time
• Fault-tolerant:
  – Error detection & checkpointing
  – All resources can and do fail
  – Even reliable resources have maintenance & upgrade periods
• Reactive to resource composition and load: migration
How to Make GridSAT Available to Users?
• Deploying GridSAT locally by interested users:
  – Complex
  – Not enough computational resources
• Feedback from SAT experts:
  – Make it available through a portal
  – Simple interface: minimal user input
• GridSAT Portal: orca.cs.ucsb.edu/sat_portal
• Test problems: orca.cs.ucsb.edu/sat_portal/test_problems.htm
Internal Design
[Architecture diagram: User ↔ Web Server ↔ GridSAT Coordinator ↔ DataStar, TeraGrid, desktop machines]
User Accounts
Problem Submission
List Problems
Detailed Report
Budget-Based Scheduling
• The requested CPU count or timeout may not be fulfilled:
  – CPU count: too large
  – Time limit: too large or too small
• Find the job closest to the user's request
• May need multiple jobs
• Use Max CPUs × Timeout as a budget:
  – Debit from the budget for every job
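The budget idea above can be sketched as follows. This is a simplified illustration under stated assumptions: the function name, the `(cpus, hours)` offer tuples, and the "closest CPU count first" ordering are all hypothetical details standing in for whatever the portal's scheduler actually does; only the budget formula (Max CPUs × Timeout, debited per job) comes from the slide.

```python
# Sketch of budget-based scheduling: budget = max_cpus * timeout (CPU-hours),
# debited for every job launched. Offer tuples are illustrative assumptions.
def schedule_jobs(max_cpus, timeout_hours, offers):
    """offers: iterable of (cpus, hours) job slots actually obtainable.
    Accept offers nearest the requested CPU count until the budget runs out."""
    budget = max_cpus * timeout_hours  # total CPU-hours the user granted
    accepted = []
    # Prefer offers whose CPU count is closest to the user's request.
    for cpus, hours in sorted(offers, key=lambda o: abs(o[0] - max_cpus)):
        cost = cpus * hours
        if cost <= budget:
            accepted.append((cpus, hours))
            budget -= cost  # debit the budget for this job
    return accepted, budget

# User asks for 64 CPUs for 2 hours (budget = 128 CPU-hours); the sites
# can only offer differently shaped slots, so multiple jobs are combined.
jobs, remaining = schedule_jobs(64, 2, [(128, 1), (32, 2), (16, 4)])
print(jobs, remaining)
```

Framing both the request and each job as CPU-hours lets the portal substitute several small jobs for one unobtainable large one without exceeding what the user agreed to pay for.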
Conclusion
• New science and engineering portal
• GridSAT: Grid-enabled application that manages resources
• Web portal:
  – Launches coordinators
  – Provides feedback and accounting
• Challenge:
  – Provide a compelling service to get the community interested
Thanks
• LRAC allocation through NSF
• TeraGrid: SDSC, NCSA, PSC, TACC
• DataStar at SDSC; also BlueHorizon
• Mayhem Lab at UCSB
User Environment
• Input:
  – Problem in standard CNF format
  – Max number of CPUs to use
  – Timeout period
• Feedback:
  – Jobs: resource, status, submit, start, and end times
  – Total number of active processors
  – CPU-hours consumed
  – Number of checkpoints
  – Final result: UNSAT, or a satisfying SAT instance
Programming Models
• Synchronous model:
  – Predictable space: number of nodes, memory
  – Predictable time: how long per instance
  – Synchronization barrier
  – Fits the cluster model (MPI)
• Asynchronous model:
  – Dynamic resource requirements
  – Variable and unpredictable duration
  – Asynchronous events
  – Fits the Computational Grid environment