1 Efficient Stochastic Local Search for MPE Solving
Frank Hutter, The University of British Columbia (UBC), Vancouver, Canada
Joint work with Holger Hoos (UBC) and Thomas Stützle (Darmstadt University of Technology, Germany)

2 SLS: general algorithmic framework for solving combinatorial problems

3 MPE in graphical models: many applications

4 Outline
- Most probable explanation (MPE) problem
  - Problem definition
  - Previous work
- SLS algorithms for MPE
  - Illustration
  - Previous SLS algorithms
  - Guided Local Search (GLS) in detail
- From Guided Local Search to GLS+
  - Modifications
  - Performance gains
  - Comparison to state-of-the-art

5 MPE - problem definition (in the most general representation: factor graphs)
- Given a factor graph
  - Discrete variables $X = \{X_1, \ldots, X_n\}$
  - Factors $\Phi = \{\phi_1, \ldots, \phi_m\}$ over subsets of $X$
  - A factor $\phi_i$ over variables $V_i \subseteq X$ assigns a non-negative number to every complete instantiation $v_i$ of $V_i$
- Find
  - A complete instantiation $\{x_1, \ldots, x_n\}$ maximizing $\prod_{i=1}^{m} \phi_i[x_1, \ldots, x_n]$
- NP-hard (simple reduction from SAT)
- Also known as max-product or maximum a posteriori (MAP)
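The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the toy factor graph, the `(scope, table)` representation, and the names `domains`, `factors`, `objective`, and `brute_force_mpe` are all assumptions made for the example. Exhaustive enumeration is only feasible for tiny models, which is the reason local search is of interest at all.

```python
from itertools import product
from math import prod

# Hypothetical toy factor graph (not from the slides): two binary variables.
domains = {"X1": [0, 1], "X2": [0, 1]}

# Each factor is (scope, table); the table maps an instantiation of the scope
# to a non-negative number.
factors = [
    (("X1",), {(0,): 0.3, (1,): 0.7}),
    (("X1", "X2"), {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.8}),
]


def objective(assignment, factors):
    """Product of the factor entries selected by a complete instantiation."""
    return prod(table[tuple(assignment[v] for v in scope)] for scope, table in factors)


def brute_force_mpe(domains, factors):
    """Enumerate every complete instantiation and keep the best one."""
    names = list(domains)
    best, best_value = None, -1.0
    for values in product(*(domains[name] for name in names)):
        assignment = dict(zip(names, values))
        value = objective(assignment, factors)
        if value > best_value:
            best, best_value = assignment, value
    return best, best_value


mpe, mpe_value = brute_force_mpe(domains, factors)
print(mpe, mpe_value)
```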

6 Previous approaches for solving MPE
- Variable elimination / junction tree
  - Exponential in the graphical model's induced width
- Approximation with loopy belief propagation and its generalizations [Yedidia, Freeman, Weiss '02]
- Approximation with Mini-Buckets (MB) [Dechter & Rish '97]
  - → also gives lower & upper bounds
- Search algorithms
  - Local search
  - Branch and Bound (B&B) with various MB heuristics [Dechter's group, '99-'05]
  - UAI '03: B&B with MB heuristic shown to be state-of-the-art

7 Motivation for our work
- B&B clearly outperforms the best SLS algorithm so far, even on random problem instances [Marinescu, Kask, Dechter, UAI '03]
- MPE is closely related to weighted Max-SAT [Park '02]
- For Max-SAT, SLS is state-of-the-art (at the very least for random problems)
- Why is SLS not state-of-the-art for MPE?
  - Additional problem structure inside the factors
  - But for completely random problems?
  - SLS algorithms should be much better than they currently are
- We took the best SLS algorithm so far (GLS) and improved it

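9 SLS for MPE – illustration
[Figure: factor graph over variables X1, ..., X4 with factors $\phi_1, \ldots, \phi_5$]
Factor tables:
  $\phi_1(X_1)$:           0 → 0;  1 → 21.2;  2 → 0.1
  $\phi_2(X_1, X_2)$:      (0,0) → 21;  (0,1) → 0.7;  (1,0) → 0;  (1,1) → 1;  (2,0) → 0.9;  (2,1) → 0.2
  $\phi_3(X_1, X_3)$:      (0,0) → 1.1;  (0,1) → 23;  (1,0) → 0;  (1,1) → 0.7;  (2,0) → 2.7;  (2,1) → 42
  $\phi_4(X_3)$:           0 → 0.9;  1 → 0.1
  $\phi_5(X_2, X_3, X_4)$: (0,0,0) → 10;  (0,0,1) → 0.9;  (0,1,0) → 0;  (0,1,1) → 100;  (1,0,0) → 33.2;  (1,0,1) → 0;  (1,1,0) → 23.2;  (1,1,1) → 13.7
Instantiation: $[x_1, x_2, x_3, x_4] = [2, 1, 0, 0]$
$\prod_{i=1}^{m} \phi_i[2,1,0,0] = 0.1 \times 0.2 \times 2.7 \times 0.9 \times 33.2$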

10 SLS for MPE – illustration
[Figure: the same factor graph and tables as on the previous slide, with X2 flipped from 1 to 0]
Instantiation: $[x_1, x_2, x_3, x_4] = [2, 0, 0, 0]$
$\prod_{i=1}^{m} \phi_i[2,0,0,0] = \prod_{i=1}^{m} \phi_i[2,1,0,0] \times 0.9/0.2 \times 10/33.2$
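The flip on this slide can be reproduced with a short sketch. The factor entries below are read from the flattened slide text, so individual numbers should be treated as a reconstruction rather than as authoritative data; the names `FACTORS`, `objective`, and `flip_ratio` are illustrative. The point is that a single-variable flip only touches the factors whose scope contains that variable.

```python
from math import prod

# Variables are 0-based here: X1 -> 0, X2 -> 1, X3 -> 2, X4 -> 3.
# Entries are reconstructed from the slide's tables (assumption, see lead-in).
FACTORS = [
    ((0,), {(0,): 0.0, (1,): 21.2, (2,): 0.1}),
    ((0, 1), {(0, 0): 21.0, (0, 1): 0.7, (1, 0): 0.0,
              (1, 1): 1.0, (2, 0): 0.9, (2, 1): 0.2}),
    ((0, 2), {(0, 0): 1.1, (0, 1): 23.0, (1, 0): 0.0,
              (1, 1): 0.7, (2, 0): 2.7, (2, 1): 42.0}),
    ((2,), {(0,): 0.9, (1,): 0.1}),
    ((1, 2, 3), {(0, 0, 0): 10.0, (0, 0, 1): 0.9, (0, 1, 0): 0.0, (0, 1, 1): 100.0,
                 (1, 0, 0): 33.2, (1, 0, 1): 0.0, (1, 1, 0): 23.2, (1, 1, 1): 13.7}),
]


def objective(x):
    """Product of all factor entries selected by the complete instantiation x."""
    return prod(table[tuple(x[v] for v in scope)] for scope, table in FACTORS)


def flip_ratio(x, var, new_value):
    """objective(after flip) / objective(before flip).

    Only factors containing `var` are visited. Assumes the current objective
    is non-zero, otherwise the ratio is undefined.
    """
    y = list(x)
    y[var] = new_value
    ratio = 1.0
    for scope, table in FACTORS:
        if var in scope:
            old = table[tuple(x[v] for v in scope)]
            new = table[tuple(y[v] for v in scope)]
            ratio *= new / old
    return ratio


x = [2, 1, 0, 0]
print(objective(x))            # 0.1 * 0.2 * 2.7 * 0.9 * 33.2
print(flip_ratio(x, 1, 0))     # 0.9/0.2 * 10/33.2
```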

11 Previous SLS algorithms for MPE
- Iterative Conditional Modes [Besag '86]
  - Just greedy hill climbing
- Stochastic Simulation
  - Sampling algorithm, very poor for optimization
- Greedy + Stochastic Simulation [Kask & Dechter '99]
  - Outperforms the above & simulated annealing by orders of magnitude
- Guided Local Search (GLS) [Park '02]
- (Iterated Local Search (ILS) [Hutter '04])
  - Outperforms Greedy + Stochastic Simulation by orders of magnitude

12 Guided Local Search (GLS) [Voudouris 1997]
- Subclass of Dynamic Local Search [Hoos & Stützle, 2004]. Iteratively:
  1) Local search → local optimum
  2) Modify evaluation function
- In local optima: penalize some solution features
  - Solution features for MPE are partial assignments
- Evaluation function = objective function - sum of respective penalties
- Penalty update rule experimentally designed
- Performs very well across many problem classes

13 GLS for MPE [Park 2002]
- Initialize penalties to 0
- Evaluation function: objective function minus the sum of penalties of the current instantiation
  - $\prod_{i=1}^{m} \phi_i[x_1, \ldots, x_n] - \sum_{i=1} p_i[x_1, \ldots, x_n]$
- In a local optimum:
  - Choose partial instantiations (according to the GLS update rule)
  - Increment their penalty by 1
- Every $N_\rho$ local optima:
  - Smooth all penalties by multiplying them with $\rho < 1$
  - Important to eventually optimize the original objective function
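The penalty bookkeeping listed on this slide could look roughly like the sketch below. It is an illustration under stated assumptions, not Park's or the authors' code: features are modelled as `(factor index, entry)` pairs, the class name `PenaltyStore` and its methods are hypothetical, the smoothing factor and interval are called `rho` and `n_rho` here, and the rule that selects which features to penalize is deliberately left to the caller because the slide does not spell it out.

```python
class PenaltyStore:
    """Penalty bookkeeping for GLS; features are (factor index, entry) pairs."""

    def __init__(self, rho, n_rho):
        self.rho = rho                # smoothing factor, expected to be < 1
        self.n_rho = n_rho            # smooth after every n_rho local optima
        self.penalty = {}             # missing keys mean "penalty 0"
        self.local_optima_seen = 0

    def evaluation(self, x, factors):
        """Objective (product of entries) minus penalties of the active entries."""
        objective, penalties = 1.0, 0.0
        for i, (scope, table) in enumerate(factors):
            entry = tuple(x[v] for v in scope)
            objective *= table[entry]
            penalties += self.penalty.get((i, entry), 0.0)
        return objective - penalties

    def on_local_optimum(self, chosen_features):
        """Increment the chosen penalties by 1, then smooth every n_rho optima."""
        for feature in chosen_features:
            self.penalty[feature] = self.penalty.get(feature, 0.0) + 1.0
        self.local_optima_seen += 1
        if self.local_optima_seen % self.n_rho == 0:
            for feature in self.penalty:
                self.penalty[feature] *= self.rho


# Hypothetical toy model: variable 0 and variable 1 are binary.
factors = [
    ((0,), {(0,): 0.4, (1,): 0.6}),
    ((0, 1), {(0, 0): 0.5, (0, 1): 0.5, (1, 0): 0.9, (1, 1): 0.1}),
]
store = PenaltyStore(rho=0.5, n_rho=2)
x = [1, 0]
feature = (1, (1, 0))                 # entry of the second factor selected by x
store.on_local_optimum([feature])     # penalty becomes 1.0
store.on_local_optimum([feature])     # penalty becomes 2.0, then is smoothed to 1.0
print(store.penalty[feature], store.evaluation(x, factors))
```

The example uses `rho=0.5` only to make the arithmetic easy to follow.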

15 GLS → GLS+: Overview of modified components
- Modified evaluation function
  - Pay more attention to the actual objective function
- Improved caching of evaluation function
  - Straightforward adaptation from SAT caching schemes
- Tuning of smoothing parameter $\rho$
  - Over two orders of magnitude improvement!
- Initialization with Mini-Buckets instead of random
  - Was shown to perform better by [Kask & Dechter, 1999]

16 GLS → GLS+ (1): Modified evaluation function
- GLS
  - $\prod_{i=1}^{m} \phi_i[x_1, \ldots, x_n] - \sum_{i=1} p_i[x_1, \ldots, x_n]$
  - Product of entries minus sum of penalties ≈ zero minus sum of penalties
  - Almost neglecting the objective function
- GLS+
  - $\sum_{i=1}^{m} \log(\phi_i[x_1, \ldots, x_n]) - \sum_{i=1} p_i[x_1, \ldots, x_n]$
  - Use logarithmic objective function
  - Very simple, but much better results
  - Penalties are now just new temporary factors that decay over time!
  - Could be improved by dynamic weighting of the penalties
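A small numeric sketch shows why the product form nearly ignores the objective. The entry values, the number of factors, and the function names `eval_product` and `eval_log` are made up for illustration; the log form assumes strictly positive entries, since `log(0)` is undefined.

```python
import math


def eval_product(entries, penalties):
    """Original GLS evaluation: product of factor entries minus penalties."""
    return math.prod(entries) - sum(penalties)


def eval_log(entries, penalties):
    """GLS+ evaluation: sum of log entries minus penalties (entries must be > 0)."""
    return sum(math.log(e) for e in entries) - sum(penalties)


# Two hypothetical instantiations of a model with 200 factors. The second one
# selects a larger entry in every factor, so it is clearly the better one.
worse = [0.01] * 200
better = [0.02] * 200
penalties = [1.0]

# Both products underflow to 0.0 in double precision, so the penalty is all
# that is left and the two instantiations look identical.
print(eval_product(worse, penalties), eval_product(better, penalties))

# In log space the difference between the two instantiations is preserved.
print(eval_log(worse, penalties), eval_log(better, penalties))
```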

17 GLS → GLS+ (1): Modified evaluation function
- Much faster in early stages of the search
- Speedups of about 1 order of magnitude
[Figure: comparison of GLS and GLS+]

18 GLS → GLS+ (2): Speedups by caching
- Time complexity for a single best-improvement step:
  - Previously best caching: grows with $|V| \times |D_V| \times$ a further per-variable term
  - Improved caching: grows with $|V_{improving}| \times |D_V|$
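One way such a cache could be organized is sketched below. This is a guess at the general idea in the spirit of SAT-style score caching, not the scheme from the talk: the class `ScoreCache`, its fields, and the toy model are hypothetical. It stores, for every variable and value, the change in log-objective of the corresponding flip, refreshes only the neighbourhood of a flipped variable, and keeps a set of improving variables so that a best-improvement step scans only those.

```python
import math


class ScoreCache:
    """Caches the log-objective change of every single-variable flip."""

    def __init__(self, domains, factors, x):
        self.domains, self.factors, self.x = domains, factors, list(x)
        variables = range(len(domains))
        # var -> indices of the factors whose scope contains var
        self.touching = {v: [i for i, (scope, _) in enumerate(factors) if v in scope]
                         for v in variables}
        # var -> variables sharing at least one factor with var
        self.neighbours = {v: {u for i in self.touching[v] for u in factors[i][0]}
                           for v in variables}
        self.delta, self.improving = {}, set()
        for v in variables:
            self._refresh(v)

    def _local_log(self, v, value):
        """Sum of log entries of the factors touching v, with v set to value."""
        total = 0.0
        for i in self.touching[v]:
            scope, table = self.factors[i]
            entry = tuple(value if u == v else self.x[u] for u in scope)
            total += math.log(table[entry])   # assumes strictly positive entries
        return total

    def _refresh(self, v):
        base = self._local_log(v, self.x[v])
        self.delta[v] = {val: self._local_log(v, val) - base for val in self.domains[v]}
        if max(self.delta[v].values()) > 1e-12:
            self.improving.add(v)
        else:
            self.improving.discard(v)

    def flip(self, v, value):
        """Apply a flip and refresh only the variables that share a factor with v."""
        self.x[v] = value
        for u in self.neighbours[v]:
            self._refresh(u)

    def best_improving_flip(self):
        """Scan only the improving variables; returns (delta, var, value) or None."""
        best = None
        for v in self.improving:
            for val, d in self.delta[v].items():
                if best is None or d > best[0]:
                    best = (d, v, val)
        return best


# Hypothetical toy model with three binary variables and strictly positive entries.
domains = [[0, 1], [0, 1], [0, 1]]
factors = [
    ((0, 1), {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 3.0, (1, 1): 0.5}),
    ((1, 2), {(0, 0): 0.2, (0, 1): 1.0, (1, 0): 4.0, (1, 1): 1.0}),
]
cache = ScoreCache(domains, factors, [0, 0, 0])
first = cache.best_improving_flip()
cache.flip(first[1], first[2])
print(first, cache.x, cache.improving)
```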

19 GLS → GLS+ (3): Tuning the smoothing factor $\rho$
- [Park '02] stated GLS to have "no parameters"
- Changing $\rho$ from Park's setting 0.8 to 0.99
  - Sometimes from unsolvable to milliseconds
  - Effect increases for large instances
[Figure: results for several settings of $\rho$; the legend includes $\rho = 0.99$, $\rho = 0.999$, and $\rho = 1$]
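To get a feel for what the smoothing factor changes, the sketch below counts how many smoothing rounds it takes for a single unit penalty to fade below a threshold. It only illustrates how long penalties are remembered under multiplicative smoothing; it says nothing about solver runtime. The function name `rounds_until_below` and the threshold of 0.01 are arbitrary choices for this example, and a smoothing factor of exactly 1 is excluded because the penalty would then never decay.

```python
def rounds_until_below(rho, threshold=0.01):
    """Number of smoothing rounds until a penalty of 1 drops below threshold."""
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between 0 and 1")
    penalty, rounds = 1.0, 0
    while penalty >= threshold:
        penalty *= rho        # one smoothing step
        rounds += 1
    return rounds


for rho in (0.8, 0.99, 0.999):
    print(rho, rounds_until_below(rho))
```

With a factor of 0.8 a penalty is effectively forgotten after about twenty smoothing rounds, whereas factors close to 1 keep it alive for hundreds or thousands of rounds.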

20 GLS → GLS+ (4): Initialization with Mini-Buckets
- Sometimes a bit worse, sometimes much better
- Particularly helps for some structured instances

22 Comparison based on [Marinescu, Kask, Dechter, UAI '03]
- Branch & Bound with MB heuristic was state-of-the-art for MPE, even for random instances!
  - Scales better than original GLS with
    - Number of variables
    - Domain size
  - Both as anytime algorithm and in terms of time needed to find optimum
- On the same problem instances, we show that our new GLS+ scales better than their implementation with
  - Number of variables
  - Domain size
  - Density
  - Induced width

23 Benchmark instances
- Randomly generated Bayes nets
  - Graph structure: completely random / grid networks
  - Controlled number of variables & domain size
  - Random networks with controlled induced width
- Bayesian networks from Bayes net repository

24 Original GLS vs. B&B with MB heuristic: relative solution quality after 100 seconds for random grid networks of size N x N
[Figure: three panels: Small, Medium, Large]

25 GLS+ vs. GLS and B&B with MB heuristic: relative solution quality after 100 seconds for random grid networks of size N x N
[Figure: three panels: Small, Medium, Large]

26 GLS+ vs. B&B with MB heuristic: solution time with increasing domain size on random networks
[Figure: three panels: Small, Medium, Large]

27 Solution times with increasing induced width on random networks
[Figure: legend: d-BBMB, s-BBMB, Orig GLS, GLS+]

28 Results for Bayes net repository
- GLS+ shows overall best performance
  - Only algorithm to solve Link network (in 1 second!)
  - Problems for Barley and especially Diabetes
- Preprocessing with partial variable elimination helps a lot
  - Can reduce #(variables) dramatically

29 Conclusions
- SLS algorithms are competitive for MPE solving
  - Scale very well, especially with induced width
  - But they need careful design, analysis & parameter tuning
- SLS and Machine Learning (ML) people should talk
  - SLS can perform very well for some traditional ML problems
- Our C source code is online
  - Please use it
  - There's also a Matlab interface

30 Extensions in progress
- Real problem domains
  - MRFs for stereo vision
  - CRFs for sketch recognition
- Domain-dependent extensions
  - Hierarchical SLS for problems in computer vision
- Automated parameter tuning
  - Use Machine Learning to predict runtime for different settings of algorithm parameters
  - Use parameter setting with lowest predicted runtime

31 The End
- Thanks to
  - Holger Hoos & Thomas Stützle
  - Radu Marinescu for their B&B code
  - You for your attention
