Gunhee Kim1 Eric P. Xing1 Li Fei-Fei2 Takeo Kanade1

Uploaded: 2017-08-20T10:25:58+00:00
Duration: PTM28S23

Distributed Cosegmentation via Submodular Optimization on Anisotropic Diffusion
Gunhee Kim (1), Eric P. Xing (1), Li Fei-Fei (2), Takeo Kanade (1)
(1) School of Computer Science, Carnegie Mellon University; (2) Computer Science Department, Stanford University
November 9, 2011

Outline
Problem Statement
Submodular Optimization on Diffusion
Applications: Diversity Ranking, Single Image Segmentation, Cosegmentation
Experiments
Conclusion

Image Cosegmentation
Remove the ambiguity of what should be segmented out. Jointly segment M images into K regions! (M = 3, K = 2)
Let me begin with the problem definition. Image segmentation is an ill-posed problem: even for this simple duck image, there are many different ways of segmenting it. Image cosegmentation is one way of removing this ambiguity. The idea is simple: if we have multiple images that contain the same objects or regions, then we can infer that they should be segmented together. In a simplified definition, cosegmentation is to jointly segment M images into K different regions. Here, M is three and K is two.
Rother et al., 2006; Hochbaum and Singh, 2009; Joulin et al., 2010; Batra et al., 2010; Mukherjee et al., 2011; Vicente et al., 2010, 2011

Why is Cosegmentation Interesting?
Wide potential in Web applications
Photo-taking patterns of general users: "My son joined baseball club." "I saw dolphins in aquarium."
We believe that cosegmentation has wide potential in web applications. As you can see, the same subjects are very likely to appear repeatedly across a user's photo albums, which is a typical case where cosegmentation is useful.

Our Approach
Major challenges for web photos: (1) Large-scale (2) Highly-variable
Jointly segment M images into K regions
Work: Model / Algorithm
Ours: Anisotropic Diffusion / Submodularity
[J10]: Discriminative Clustering
[M11]: MRF + Rank-1 global / Iterative opt.
[H09]: MRF + Reward global / Graph Cuts
[R06]: MRF + L1 global / Trust Region GC
[B10]: Boykov-Jolly / Graph Cuts
[V10]: Boykov-Jolly / Dual Decomposition
Values in the M and K columns: 10^3, 30, 2, 50, Any
We first identified these two as the main challenges of web photos and connected them with the problem statement of cosegmentation, in particular with these two numbers, M and K. The algorithm should work with a large M for scalability and an arbitrary K for the highly variable content of web images. This table compares our method with previous work. In sum, our approach is unique in terms of M and K. Most previous work can handle only binary segmentation, with K of 2, whereas our approach can be used with arbitrary K. Moreover, most previous work has been applied to only tens of images, while the dataset sizes in our experiments are larger by an order of magnitude.
[R06] Rother et al., 2006 [H09] Hochbaum and Singh, 2009 [B10] Batra et al., 2010 [J10] Joulin et al., 2010 [V10] Vicente et al., 2010 [M11] Mukherjee et al., 2011

Contributions
A new optimization framework: constant-factor approximation of the optimum; easily parallelizable; automatic selection of K; robust against wrong K
In addition to the uniqueness in M and K, our optimization method provides other important advantages. From an algorithmic point of view, our method achieves at least a constant factor of the optimal solution. Our approach is easily parallelizable, supports automatic selection of K, and is robust against a wrong input K. You don't need to take these too seriously right now; throughout this talk, I will show them with examples.

Diffusion
Diffusion in physics: spread of particles (or energy) through random motion from high concentration to low concentration [Wikipedia]
Examples: electric current, heat diffusion
Heat Equation (Partial Differential Equation): temperature u, diffusivity (conductance) tensor D
Our framework is based on diffusion in physics, so in this talk I will frequently use the analogy of heat diffusion. Heat diffusion is described by this partial differential equation. Here, u(x, t) is the temperature at point x at time t, and D is called the diffusivity or conductance.
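For reference, the heat equation described here is usually written in the following standard form. This is the textbook equation rather than a copy of the slide, and treating D as a function of position only is an assumption.

```latex
\frac{\partial u(x,t)}{\partial t} = \nabla \cdot \bigl( D(x)\, \nabla u(x,t) \bigr)
```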

Optimization
Maximize the sum of temperature of the system
Boundary conditions: K heat sources; environment temperature
This is the key optimization in this paper. We would like to maximize the sum of temperature of the system, where the system undergoes diffusion and we have two boundary conditions. For better understanding, let me give you a physical analogy. Suppose that you are given a metal plate in the air; this is our system. The air temperature is fixed at zero, and you are given K heat sources whose temperatures are always one. Our goal is to cleverly place these K heat sources on the plate to maximize the sum of temperature of the system.
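Written out, the optimization in this analogy can be sketched as follows. The symbols S (the set of source locations), \Omega (the plate), and g (the environment) are assumptions introduced here for illustration, and the sum becomes an integral for a continuous plate.

```latex
\max_{S \subseteq \Omega} \; \sum_{x \in \Omega} u(x,t;S)
\quad \text{s.t.} \quad
|S| \le K, \qquad
\frac{\partial u}{\partial t} = \nabla \cdot ( D \nabla u ), \qquad
u(s,t;S) = 1 \;\; \forall s \in S, \qquad
u(g,t;S) = 0
```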

Correspondences
Temperature maximization and Image Segmentation
Heat Diffusion <-> Image Segmentation
Points <-> Pixels
Temperature <-> Segmentation confidence
Heat sources <-> Segment centers
Conductance <-> Similarity between features of pixels
Image Segmentation: Select K pixels as segment centers to maximize the sum of segmentation confidence of every pixel.
You may wonder why we need to care about temperature maximization for image segmentation, so let me give the terminological correspondence between them. The points in the system correspond to the pixels of an image, the temperature is the segmentation confidence, the heat sources can be thought of as segment centers, and the conductance is computed from the similarity between features of pixels. Therefore, temperature maximization can be understood in terms of image segmentation like this: our goal is to maximize the sum of the segmentation confidence of every pixel by cleverly choosing K segment centers. We can say the ideal segmentation is one where every pixel has confidence one of being clustered. Therefore, by solving this optimization, we can obtain an optimal K-way segmentation.

Optimization
How can we solve this?
$\max_{S} u(S)$ s.t. $|S| \le K$
[Theorem] (Nemhauser, Wolsey, Fisher 1978) Let $u$ satisfy $u(\emptyset) = 0$ and be nondecreasing and submodular. Then the greedy algorithm finds a set $S_G$ such that $u(S_G) \ge (1 - 1/e) \max_{|S| \le K} u(S)$, where $1 - 1/e \approx 0.632$.
Obviously, the next question is how to solve this maximization. Our approach is based on this famous theorem about submodular maximization under a cardinality constraint. If the objective function satisfies these conditions, in particular if it is nondecreasing and submodular, then a simple greedy algorithm achieves at least a constant-factor approximation to this maximization. Here the factor is 1 - 1/e, about 0.632.

Submodularity on Anisotropic Diffusion
[Theorem] Suppose that the system is under linear anisotropic diffusion. Let $u(x,t;S)$ be the temperature at time $t$ and point $x$ when heat sources are attached to $S$. Then the following holds for all $x$ and $t$: (T1) $u(x,t;\emptyset) = 0$; (T2) $u$ is nondecreasing; (T3) $u$ is submodular. (Proof)
Our approach is to first prove that our objective, the temperature under linear anisotropic diffusion, is a nondecreasing and submodular function. Unfortunately, I don't have enough time to go through the details of the proof, so let me instead give you a short sketch.

Submodularity on Anisotropic Diffusion
[Theorem] Suppose that the system is under linear anisotropic diffusion. Let $u(x,t;S)$ be the temperature at time $t$ and point $x$ when heat sources are attached to $S$. Then the following holds for all $x$ and $t$: (T1) $u(x,t;\emptyset) = 0$; (T2) $u$ is nondecreasing; (T3) $u$ is submodular.
(Diminishing Return) (Proof) Induction on distance
The first and second statements are easy to prove. The third statement is about submodularity. Here I introduce the definition of a submodular function. As many of you already know, it is characterized by the diminishing return property. In our context, suppose that the source placement S1 is a subset of S2. When we add a source at any point in the system, the temperature gain with S1 (the left-hand side) is always larger than or equal to the gain with S2 (the right-hand side), at any point x. We can prove this by induction on the distance between x and the new source point s.
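The diminishing-return property mentioned here is the standard definition of submodularity. Assuming that u(x,t;S) denotes the temperature with sources S (the notation is a guess at what the slide showed), it reads:

```latex
u(x,t;S_1 \cup \{s\}) - u(x,t;S_1) \;\ge\; u(x,t;S_2 \cup \{s\}) - u(x,t;S_2)
\qquad \text{for all } S_1 \subseteq S_2 \text{ and any new source } s \notin S_2
```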

Greedy Algorithm
Sketch of the greedy algorithm: find the point with maximum marginal gain in every round.
$\max_{S} u(S)$ s.t. $|S| \le K$
1. Start with no sources.
2. Iterate until all K sources are placed:
2.1. Compute the marginal gain of every point.
2.2. Place a source at the point with the maximum marginal gain.
Marginal gain of a point x: the temperature increase from placing a source at x.
The greedy algorithm works like this. The idea is that in every round we search for the point with the maximum marginal gain. Mathematically, the algorithm looks like this. The marginal gain of a point x is the temperature increase obtained by placing a source at x. For example, suppose that we have a system and three heat sources. First, we compute the marginal gain at every point, and then we place a source at the point with the maximum gain. We iterate this process until we use up all sources.
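The rounds described above can be sketched in a few lines of Python. This is a generic illustration, not the authors' code: `greedy_maximize`, `coverage`, and the `SETS` data are hypothetical names, and a simple set-coverage count (a classic nondecreasing submodular function) stands in for the sum of temperatures.

```python
def greedy_maximize(candidates, objective, k):
    """Pick up to k items, each round adding the one with the largest marginal gain."""
    selected = []
    remaining = list(candidates)
    current = objective(selected)
    while len(selected) < k and remaining:
        # Marginal gain of each remaining candidate given what is already selected.
        gains = [objective(selected + [c]) - current for c in remaining]
        best = max(range(len(remaining)), key=gains.__getitem__)
        current += gains[best]
        selected.append(remaining.pop(best))
    return selected


# Toy stand-in objective (an assumption): number of elements covered by the chosen sets.
SETS = {"a": {1, 2, 3, 4}, "b": {3, 4, 5}, "c": {6, 7}, "d": {1, 2}}


def coverage(chosen):
    return len(set().union(*(SETS[name] for name in chosen)))


picked = greedy_maximize(SETS, coverage, k=2)
print(picked)
```

In the second round "c" beats "b" because "b" mostly overlaps what "a" already covers, which is the same effect that keeps a second heat source away from an already warm region.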

Diversity Ranking
Ranking items according to both centrality and diversity
(Figure: ranking values of items A, B, and C)
Centrality only: A > B > C
Centrality + Diversity: A > C > B
The idea of diversity ranking is to rank items according to both centrality and diversity. In this example, the first-ranked item is A, no question about it. However, in many real-world problems, if the second-highest item B is very similar to the already-ranked A, then item C may be a better choice, because it is not only highly ranked but also distinct enough from item A.

Optimization for Diversity Ranking
Optimization: maximize the sum of temperature with K heat sources
Simplification: (1) System is a graph (2) Steady-state (3) Diffusivity is defined by Gaussian similarity (4) Every v is connected to the ground with z
For diversity ranking, we first simplify our optimization with some assumptions. First, we assume that our system is a graph. Second, we are interested in the steady state. Third, the diffusivity is defined by a Gaussian similarity between the features associated with the points. Finally, we assume that every vertex is connected to a ground node, whose temperature is always 0, with conductance z.

Optimization for Diversity Ranking
(Slide: the original optimization and its reduced form under simplifications (1) to (4))
With these assumptions, our optimization reduces to the form shown on the slide.
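A plausible form of the reduced problem, derived from assumptions (1) to (4) rather than copied from the slide: at steady state, the heat flowing into each non-source vertex from its neighbors balances the heat leaking to the ground. Here W_{vw} is the conductance of edge (v, w) and d_v = \sum_w W_{vw}; both symbols are assumptions.

```latex
\max_{S \subseteq V} \; \sum_{v \in V} u(v)
\quad \text{s.t.} \quad
|S| \le K, \qquad
u(s) = 1 \;\; \forall s \in S, \qquad
u(v) = \frac{1}{d_v + z} \sum_{(v,w) \in E} W_{vw}\, u(w) \;\; \forall v \notin S
```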

Examples of Diversity Ranking
Graph construction from the input data: (1) vertices (2) features (3) conductance
OK, let me introduce a toy example to show how our diversity ranking works. These data are sampled from three Gaussian distributions. First, we construct the graph G like this: every data point becomes a vertex, and the conductance is defined by the Gaussian similarity between the coordinates of the points. That means that the closer two points are to each other, the higher their conductance.

Examples of Diversity Ranking
(Figure panels: input data; marginal gain for the 1st, 2nd, and 3rd items; clustering)
Given this data, intuition tells us that the center point of the largest blob should be ranked first. I plot the marginal gain of each point along the z-axis. As you can see, our intuition is right. Once you place a source at s1, the points near s1 already have a high temperature, so we don't need to place another source around them. I compute the marginal temperature gain of every point again, and the second-ranked item is the center point of the second-largest blob. We keep doing this until we use up all heat sources. This is the final clustering result. As we discussed, the selected points become the cluster centers.
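The toy example can be reproduced approximately with NumPy. This is a minimal sketch under stated assumptions, not the authors' MATLAB toolbox: the blob sizes and positions, `sigma = 1.0`, `z = 0.1`, and the function names `total_temperature` and `diversity_rank` are all hypothetical. The steady-state temperatures come from solving the heat-balance linear system on the non-source vertices.

```python
import numpy as np


def total_temperature(W, z, sources):
    """Sum of steady-state temperatures with `sources` held at 1 and the ground at 0."""
    n = W.shape[0]
    if not sources:
        return 0.0
    u = np.zeros(n)
    src = np.array(sorted(sources))
    rest = np.setdiff1d(np.arange(n), src)
    u[src] = 1.0
    if rest.size:
        degree = W.sum(axis=1)
        # Heat balance at non-source v: (d_v + z) u_v - sum_w W_vw u_w = 0.
        A = np.diag(degree[rest] + z) - W[np.ix_(rest, rest)]
        b = W[np.ix_(rest, src)].sum(axis=1)
        u[rest] = np.linalg.solve(A, b)
    return float(u.sum())


def diversity_rank(W, z, k):
    """Greedy placement: each round adds the vertex giving the highest total temperature."""
    selected = []
    for _ in range(k):
        candidates = [v for v in range(W.shape[0]) if v not in selected]
        totals = [total_temperature(W, z, selected + [v]) for v in candidates]
        selected.append(candidates[int(np.argmax(totals))])
    return selected


# Three Gaussian blobs of decreasing size (sizes and centers are made up).
rng = np.random.default_rng(0)
sizes = [30, 20, 10]
centers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
points = np.vstack([c + 0.5 * rng.standard_normal((m, 2)) for c, m in zip(centers, sizes)])
blob = np.repeat(np.arange(3), sizes)

# Conductance: Gaussian similarity between point coordinates, with sigma = 1.
sq_dist = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
W = np.exp(-sq_dist / 2.0)
np.fill_diagonal(W, 0.0)

ranked = diversity_rank(W, z=0.1, k=3)
print([int(blob[v]) for v in ranked])
```

Maximizing the total temperature in each round is the same as maximizing the marginal gain, since the previous total is a constant within the round.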

Segmenting a Single Image
Optimization formulation is similar to that of diversity ranking
Construct image graph G = (V, E, W) from the input image:
1. Superpixels (SP)
2. Connect adjacent SPs
3. Features on SP: g(v) = color, texture
4. Conductance
The optimization for single-image segmentation is almost the same as that of diversity ranking, so here we only discuss how to construct the graph. Given an image, we first extract superpixels, and every superpixel becomes a vertex of the graph. The edge set connects all pairs of adjacent superpixels. We extract color and texture features from each superpixel. Finally, the conductance is computed by Gaussian similarity on these feature vectors.
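The four construction steps can be sketched as follows, assuming the superpixel label map has already been computed by some superpixel method. The function name `build_superpixel_graph`, the `sigma` value, and the use of mean color alone as the feature g(v) (texture is omitted) are assumptions for illustration.

```python
import numpy as np


def build_superpixel_graph(labels, image, sigma=0.1):
    """labels: (H, W) int array of superpixel ids 0..n-1; image: (H, W, C) float array."""
    n = int(labels.max()) + 1
    # Step 3, feature g(v): mean color of each superpixel (texture omitted in this sketch).
    feats = np.stack([image[labels == v].mean(axis=0) for v in range(n)])
    # Step 2, adjacency: ids that differ across a horizontal or vertical pixel border.
    pairs = set()
    for a, b in ((labels[:, :-1], labels[:, 1:]), (labels[:-1, :], labels[1:, :])):
        diff = a != b
        pairs.update(zip(a[diff].tolist(), b[diff].tolist()))
    # Step 4, conductance: Gaussian similarity between features of adjacent superpixels.
    W = np.zeros((n, n))
    for i, j in pairs:
        w = np.exp(-np.sum((feats[i] - feats[j]) ** 2) / (2.0 * sigma ** 2))
        W[i, j] = W[j, i] = w
    return W, feats


# Tiny 4x4 image split into four quadrant superpixels; the top two share a color.
labels = np.array([[0, 0, 1, 1],
                   [0, 0, 1, 1],
                   [2, 2, 3, 3],
                   [2, 2, 3, 3]])
image = np.zeros((4, 4, 3))
image[:2] = 0.2
image[2:] = 0.8

W, feats = build_superpixel_graph(labels, image)
print(np.round(W, 3))
```

Superpixels 0 and 3 touch only diagonally, so they get no edge under the 4-connected adjacency used here.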

Basic Behavior of Our Segmentation
Greedily select the largest and most coherent regions!
Input image; K=2: sky; K=3: tree; K=4: wall; K=5: roof; K=6: window; K=7: building; K=8: trash can
Automatic selection of K
Source code is available!
OK, now let me show an example of single-image segmentation. The basic behavior of our segmentation algorithm is to greedily select the largest and most coherent regions one by one. Given this image, the second largest and most coherent region is the sky, so the sky is chosen with K of 2. As we increase K, the tree in the center, the wall of the house, the roof of the left building, the windows, the building body, and the trash can are segmented in decreasing order of size and coherence. This behavior is very useful for automatic selection of K: we can keep increasing K until the detected segment is no longer significant or until the marginal temperature gain is no longer large. We also make our MATLAB source code available on our webpage, so if your segmentation goal is well aligned with our objective, you can try our code.
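One way to turn "until the marginal temperature gain is not big any more" into a stopping rule is to compare each round's gain with the first round's. The rule, the function name `select_k`, and the `min_ratio` threshold below are assumptions for illustration; the talk does not specify the criterion.

```python
def select_k(marginal_gains, min_ratio=0.1):
    """Return how many greedy rounds to keep, given the gain achieved in each round.

    Rounds are kept until a gain drops below `min_ratio` times the first gain.
    """
    if not marginal_gains:
        return 0
    first = marginal_gains[0]
    k = 0
    for gain in marginal_gains:
        if gain < min_ratio * first:
            break
        k += 1
    return k


# Made-up gains: three substantial segments followed by two insignificant ones.
gains = [26.0, 17.0, 8.5, 0.9, 0.8]
print(select_k(gains))
```

Because the gains of a greedy run on a submodular objective never increase from one round to the next, stopping at the first small gain does not skip a later large one.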

Cosegmentation
Segment selection should be coupled!
Single image segmentation: Objective 1: Segment should be large and coherent.
Cosegmentation: Objective 1 + Objective 2: Segment should be similar to its corresponding ones in other images.
This slide summarizes the main difference between single-image segmentation and cosegmentation. For cosegmentation, we need an additional objective: segment selection should be coupled. This objective can be modeled by controlling the source temperature. In single-image segmentation, the source temperature is fixed at one. But now, the temperature of the first source in image i is a function of its feature similarity with the corresponding sources in the other images.

Cosegmentation
Control source temperatures
(Figure: source pairs A and B across two images)
A is better than B to maximize the temperature
Let me give a simple example of why it works. In this example, the two sources in pair A are similar to each other, but those in pair B are not. Therefore, according to the definition, the source temperature of pair A is much higher than that of pair B, and A is much better than B for temperature maximization, which is our objective.
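The talk does not give the exact formula for the source temperature, so the following is only a guess at its general shape: the temperature is taken to be the mean Gaussian similarity between a source's feature vector and the features of its corresponding sources in the other images. The function name `source_temperature`, the feature values, and `sigma` are all hypothetical.

```python
import numpy as np


def source_temperature(feature, other_features, sigma=0.2):
    """Hypothetical coupling: mean Gaussian similarity to corresponding sources elsewhere."""
    feature = np.asarray(feature, dtype=float)
    others = np.atleast_2d(np.asarray(other_features, dtype=float))
    sq_dist = np.sum((others - feature) ** 2, axis=1)
    return float(np.mean(np.exp(-sq_dist / (2.0 * sigma ** 2))))


# Pair A: nearly identical features. Pair B: very different features.
temp_a = source_temperature([0.20, 0.30], [[0.21, 0.29]])
temp_b = source_temperature([0.20, 0.30], [[0.90, 0.80]])
print(temp_a, temp_b)
```

With a rule of this shape, a similar pair keeps a source temperature near one while a dissimilar pair is pushed toward zero, so the similar pair contributes more to the total temperature.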

A Toy Example of Cosegmentation
MSRC cow images (M=3, K=4)
(Figure panels: input images, likelihood, cosegmentation, segments)
Source code is available!
This is a toy example of our cosegmentation. As you can see, two differently colored cows, water, and grass are successfully cosegmented. You can reproduce this result by running the demo code of our MATLAB toolbox.

Two Experiments
Figure-ground cosegmentation with a pair of images. Goal: compare with other state-of-the-art techniques. Dataset: MSRC (ex. cat)
Scalable cosegmentation. Goal: feasibility for Web photos. Dataset: ImageNet (ex. green lizard)
In this paper, we performed two different experiments. The first is figure-ground cosegmentation with pairs of images; here the goal is to quantitatively compare our approach with other state-of-the-art techniques on the MSRC dataset. The second is scalable cosegmentation, where what we really want to show is the feasibility of our method for web photos. Here we apply our method to each synset of ImageNet, so the number of images is up to 1000.

Exp1. Figure-Ground Cosegmentation
Segmentation accuracies for 100 random pairs of MSRC
[6] ICCV 2009 [7] CVPR 2010. For [6, 7], we use their implementations without modification.
This table summarizes the cosegmentation accuracies for experiment 1 on the MSRC dataset. We observed that our method is compelling for most objects.

Cosegmentation on MSRC
Cosegmentation Examples (K=8)
Ours (K = 8) vs. Normalized cuts (K = 8): (1) Multiple instances (2) Robust against wrong choice of K
This slide shows some examples of cosegmentation on MSRC. Let's take a closer look at this three-duck example. Here we can make two interesting observations. First, our method easily detects multiple instances in the image. Second, and more interestingly, our algorithm is robust against an incorrect selection of K. Here the best choice of K would be four: three ducks and grass. But a faulty choice of K = 8 did little harm: the four significant segments are successfully detected, and the other four superfluous ones were trivially selected as tiny dots. On the other hand, if you use normalized cuts with K of 8, uniform regions are divided into different segments.

Conclusion
What's done: Proved that the temperature in anisotropic diffusion is submodular. Applications: diversity ranking, single-image segmentation, cosegmentation. Source code is available!
Next step: (1) Large-scale edge-preserving image smoothing (2) Layered motion segmentation (optical flow)
In this paper, we first proved that the temperature in anisotropic diffusion is submodular. We took advantage of this theoretical result for the problems of diversity ranking, single-image segmentation, and cosegmentation. Please try our toolbox if you are interested in any of these applications. Here is one interesting future direction. The diffusion model has been widely used in other computer vision applications, such as image smoothing or computing optical flow in video. Therefore, with little modification, our result can also be used for large-scale image smoothing and layered motion segmentation.

Conclusion
What's done: Cosegmentation for Web photos was proposed. Arbitrary K and a larger M by an order of magnitude; easily parallelizable; automatic selection of K; robust against wrong K
(Figure: Ours vs. Ncuts)
Finally, we proposed a scalable cosegmentation method for web photos. Our approach is unique in terms of K and M, is easily parallelizable, supports automatic selection of K, and is robust against a wrong choice of K. However, we may need to improve the algorithm so that it can handle objects that consist of multiple distinctive regions.

Thank you ! Stop by our poster at 80!
Thank you all for listening to my talk. If you have any questions, please stop by our poster at 80.
