Provable Submodular Minimization using Wolfe’s Algorithm Deeparnab Chakrabarty (Microsoft Research) Prateek Jain (Microsoft Research) Pravesh Kothari (U. Texas)

Submodular Functions. f : subsets of {1,2,…,n} → integers. Diminishing returns property: for all S ⊆ T and j ∉ T, f(S+j) − f(S) ≥ f(T+j) − f(T). f may or may not be monotone.
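On small ground sets the diminishing-returns property can be verified by brute force. A minimal Python sketch (the coverage function and all names here are illustrative, not from the talk):

```python
from itertools import combinations

def coverage(A, sets):
    """f(A) = size of the union of the subsets indexed by A (a coverage function)."""
    covered = set()
    for j in A:
        covered |= sets[j]
    return len(covered)

def is_submodular(f, n):
    """Brute-force check of diminishing returns:
    f(S+j) - f(S) >= f(T+j) - f(T) for all S <= T and j not in T."""
    ground = range(n)
    subsets = [frozenset(c) for r in range(n + 1) for c in combinations(ground, r)]
    for S in subsets:
        for T in subsets:
            if not S <= T:
                continue
            for j in ground:
                if j in T:
                    continue
                if f(S | {j}) - f(S) < f(T | {j}) - f(T):
                    return False
    return True

SETS = {0: {1, 2}, 1: {2, 3}, 2: {3, 4, 5}}
f = lambda A: coverage(A, SETS)
print(is_submodular(f, 3))  # coverage functions are submodular
```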

Sensor Networks. Universe: sensor locations. f(A) = area covered by the sensors placed at A.

Submodularity Everywhere: economics, biology, information theory, computer vision, probability, telecom networks, document summarization, speech processing, machine scheduling.

Image Segmentation (Boykov, Veksler, Zabih 2001; Kolmogorov, Boykov 2004; Kohli, Kumar, Torr 2007; Kohli, Ladicky, Torr 2009). Labelling X = arg min E(X|D), where D is the observed image and E is the “energy” function. Energy minimization is done via reduction to submodular function minimization.

Submodular Function Minimization (SFM). Find the set S which minimizes f(S).
1970 (Edmonds): SFM is in NP ∩ co-NP.
1981 (Grötschel, Lovász, Schrijver): in P, via the ellipsoid method.
2001 (Iwata-Fleischer-Fujishige; Schrijver): combinatorial polynomial-time algorithms.
2006 (Orlin): current best, O(n⁵ T_f + n⁶), where T_f is the time taken to evaluate f.
1976 (Wolfe’s projection heuristic) + 1984 (Fujishige’s reduction to SFM): the Fujishige-Wolfe heuristic for SFM.

Theory vs Practice. (Plot: running time on a log scale vs. number of vertices in powers of 2, for cut functions from the DIMACS Challenge; Fujishige, Isotani 2009.)

Is it good in theory? That is today’s question.

Fujishige-Wolfe Heuristic. Fujishige’s reduction: submodular minimization reduces to finding the nearest-to-origin point (i.e., a projection) of the base polytope. Wolfe’s algorithm: finds the nearest-to-origin point of any polytope, reducing the problem to linear optimization over that polytope.

Our Results. First convergence analysis of Wolfe’s algorithm for projection onto any polytope: how quickly can we get within ε of the optimum? (THIS TALK) A robust generalization of the Fujishige reduction: when ε is small enough, ε-close points give exact submodular function minimization.

Base Polytope. For a submodular function f, the base polytope is B_f = {x ∈ Rⁿ : x(S) ≤ f(S) for all S ⊆ {1,…,n}, and x({1,…,n}) = f({1,…,n})}. Linear optimization over B_f in almost linear time!
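The “almost linear time” linear optimization is Edmonds’ greedy algorithm: to minimize ⟨w, x⟩ over B_f, sort the elements by increasing w and give each its marginal gain. A sketch (the coverage function used as the example is illustrative, not from the talk):

```python
def greedy_vertex(f, n, w):
    """Edmonds' greedy algorithm: returns the vertex q of the base polytope
    B_f minimizing <w, q>.  Sort elements by increasing w; each element's
    coordinate is its marginal gain when added after those before it."""
    order = sorted(range(n), key=lambda j: w[j])
    q = [0.0] * n
    prefix = set()
    prev = f(frozenset(prefix))
    for j in order:
        prefix.add(j)
        cur = f(frozenset(prefix))
        q[j] = cur - prev
        prev = cur
    return q

# Illustrative submodular f: a small coverage function.
SETS = {0: {1, 2}, 1: {2, 3}, 2: {3, 4, 5}}
def cover(A):
    covered = set()
    for j in A:
        covered |= SETS[j]
    return float(len(covered))

q = greedy_vertex(cover, 3, w=[0.5, -1.0, 0.2])
print(q)  # a vertex of B_f; its coordinates sum to f({0,1,2})
```

Every vertex of B_f arises this way from some ordering, which is why brute-forcing over orderings can certify the greedy answer on tiny instances.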

Fujishige’s Theorem. If x* is the closest-to-origin point of B_f, then A = {j : x*_j ≤ 0} is a minimizer of f.

A Robust Version. Let x satisfy ||x − x*|| ≤ ε. Then one can read off a set B from x such that f(B) ≤ f(A) + 2nε. If f is integral, ε < 1/(2n) implies exact SFM.
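One natural way to read off such a set (a hedged sketch, not necessarily the paper’s exact procedure): every level set {j : x_j ≤ t} of the approximate point x is a candidate, there are only n+1 distinct ones, and we return the best; the slide’s A = {j : x*_j ≤ 0} is among them when x = x*. The example f below is illustrative:

```python
def read_out(f, n, x):
    """Evaluate f on every level set {j : x_j <= t} of x and return the best.
    The distinct level sets are exactly the prefixes of the sorted order."""
    order = sorted(range(n), key=lambda j: x[j])
    best_set, best_val = frozenset(), f(frozenset())
    prefix = set()
    for j in order:
        prefix.add(j)
        v = f(frozenset(prefix))
        if v < best_val:
            best_set, best_val = frozenset(prefix), v
    return best_set, best_val

# Illustrative submodular f: coverage minus a modular cost (not from the talk).
SETS = {0: {1, 2}, 1: {2, 3}, 2: {3, 4, 5}}
def f(A):
    covered = set()
    for j in A:
        covered |= SETS[j]
    return len(covered) - 2 * len(A)

print(read_out(f, 3, x=[-0.5, -0.4, 0.3]))  # best level set is {0,1}, with f = -1
```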

Wolfe’s Algorithm: Projection onto a Polytope

Geometrical preliminaries. Affine hull: aff(S). Convex hull: conv(S). Finding the closest-to-origin point of aff(S) is easy; finding it on conv(S) is not.
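Why is aff(S) easy? Its min-norm point is a least-squares problem: minimize ||Σᵢ aᵢ sᵢ|| subject to Σᵢ aᵢ = 1, which one linear (KKT) system solves. A numpy sketch, assuming the points of S are affinely independent (otherwise the system is singular):

```python
import numpy as np

def affine_minimizer(S):
    """Min-norm point in the affine hull of the rows of S.
    Minimize ||a @ S|| subject to sum(a) == 1 via the KKT system
    [[S S^T, 1], [1^T, 0]] [a; mu] = [0; 1]."""
    S = np.asarray(S, dtype=float)
    m = len(S)
    M = np.zeros((m + 1, m + 1))
    M[:m, :m] = S @ S.T
    M[:m, m] = M[m, :m] = 1.0
    rhs = np.zeros(m + 1)
    rhs[m] = 1.0
    a = np.linalg.solve(M, rhs)[:m]
    return a, a @ S

a, y = affine_minimizer([[1.0, 0.0], [0.0, 1.0]])
print(y)  # closest point of the line x + y = 1 to the origin
```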

Corrals. A set S of points such that the min-norm point in aff(S) lies in conv(S). (Figure: a trivial corral, a corral, and a non-corral.)

Wolfe’s algorithm in a nutshell. It moves from corral to corral until optimality, passing through “non-corrals” along the way.

Checking Optimality. x is the min-norm point iff x·q ≥ ||x||² for every vertex q of the polytope; a vertex q with x·q < ||x||² witnesses non-optimality. (Figure: a non-optimal x vs. the optimal x*.)

Wolfe’s Algorithm: Details

Major Cycle (when S is a corral): x = min-norm point in aff(S); find a vertex q violating the optimality check and set S = S + q. A major cycle increments |S|.

Minor Cycle (when S is not a corral): y = min-norm point in aff(S); x = the point of [y, x_old] ∩ conv(S) closest to y; remove the irrelevant points (those whose coefficient drops to zero) from S. A minor cycle decrements |S|.

Summarizing Wolfe’s Algorithm. State: (x, S), with x ∈ conv(S). Each iteration is either a major or a minor cycle, using linear programming and matrix inversion. Major cycles increment |S| and minor cycles decrement it; in < n minor cycles we get a major cycle, and vice versa. The norm strictly decreases, so corrals cannot repeat: finite termination.
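The loop just summarized can be sketched end to end. A self-contained numpy sketch over an explicit vertex list (the linear oracle is brute force here; over B_f it would be Edmonds’ greedy algorithm), with tolerances and names chosen for illustration:

```python
import numpy as np

def affine_minimizer(S):
    # Min-norm point in aff(S): minimize ||a @ S|| s.t. sum(a) = 1 (KKT system).
    m = len(S)
    M = np.zeros((m + 1, m + 1))
    M[:m, :m] = S @ S.T
    M[:m, m] = M[m, :m] = 1.0
    rhs = np.zeros(m + 1)
    rhs[m] = 1.0
    a = np.linalg.solve(M, rhs)[:m]
    return a, a @ S

def wolfe(points, tol=1e-9, max_iter=1000):
    """Min-norm point of conv(points): Wolfe's major/minor cycle loop."""
    P = np.asarray(points, dtype=float)
    i = int(np.argmin(np.linalg.norm(P, axis=1)))
    S, lam, x = P[[i]], np.array([1.0]), P[i].copy()
    for _ in range(max_iter):
        q = P[int(np.argmin(P @ x))]            # linear oracle
        if x @ q >= x @ x - tol:                # Wolfe's optimality check
            return x
        S = np.vstack([S, q])                   # major cycle: S = S + q
        lam = np.append(lam, 0.0)
        while True:
            a, y = affine_minimizer(S)
            if np.all(a > -tol):                # S is a corral: jump to y
                x, lam = y, np.maximum(a, 0.0)
                break
            # Minor cycle: move from x toward y until a coefficient hits 0,
            # then drop the points that became irrelevant.
            theta = min(lam[k] / (lam[k] - a[k]) for k in range(len(a)) if a[k] < -tol)
            lam = (1 - theta) * lam + theta * a
            keep = lam > tol
            S, lam = S[keep], lam[keep]
            lam = lam / lam.sum()
            x = lam @ S
    return x

print(wolfe([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]))  # min-norm point of a triangle
```

This is a didactic sketch, not production code: it ignores degenerate cases (affinely dependent corrals, repeated vertices) that a robust implementation must handle.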

Our Theorem. For any polytope P and any ε > 0, within O(nD²/ε²) iterations Wolfe’s algorithm returns a point x such that ||x − x*|| ≤ ε, where D is the diameter of P. For SFM, the base polytope has diameter D² < nF², where F = max_S |f(S)|.

Outline of the Proof. Significant norm decrease when far from the optimum. We argue this for two major cycles with at most one minor cycle in between.

Two Major Cycles in a Row. (Figure: from x₁, add the violating vertex q₁ and drop to x₂.)

Major-minor-Major. (Figure: a corral containing x₁, with violating vertex q₁.) aff(S + q₁) is the whole 2D plane, so the origin is itself the closest-to-origin point.

Major-minor-Major. Either x₂ is “far away” from x₁, implying ||x₁||² − ||x₂||² is large; or x₂ “behaves like” x₁, and then ||x₂||² − ||x₃||² is large.

Outline of the Proof. Significant norm decrease when far from the optimum, argued for two major cycles with at most one minor cycle in between. A simple combinatorial fact: in any 3n consecutive iterations there must be one such “good pair”.

Take-away points. A convergence analysis of Wolfe’s algorithm, a practical algorithm. Open questions: can one remove the dependence on F? Can one change the Fujishige-Wolfe algorithm to get a better one, both in theory and in practice?

Thank you.