
Tighter and Convex Maximum Margin Clustering
Yu-Feng Li (LAMDA, Nanjing University, China) (liyf@lamda.nju.edu.cn)
Ivor W. Tsang (NTU, Singapore) (IvorTsang@ntu.edu.sg)
James T. Kwok (HKUST, Hong Kong) (jamesk@cse.ust.hk)
Zhi-Hua Zhou (LAMDA, Nanjing University, China) (zhouzh@lamda.nju.edu.cn)
http://lamda.nju.edu.cn

Summary
Maximum Margin Clustering (MMC) [Xu et al., NIPS'05]
– inspired by the success of the large-margin criterion in SVMs
– state-of-the-art performance on many clustering problems
The problem with existing methods:
– SDP relaxation: global but not scalable
– local search: efficient but non-convex
We propose a convex method, LG-MMC, that also scales to large datasets via a label-generation strategy.

Outline
Introduction
The Proposed LG-MMC Method
Experimental Results
Conclusion


Maximum Margin Clustering [Xu et al., NIPS'05]
Perform clustering (i.e., determine the unknown label vector y) by simultaneously finding the maximum-margin hyperplane in the data.
Setting:
– given a set of unlabeled patterns
Goal:
– learn a decision function and a label vector, subject to a balance constraint, with slack variables for margin errors
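In standard notation (a sketch; the symbol names are assumed here, since the slide's formulas did not survive this transcript), the MMC problem can be written as:

```latex
\min_{\mathbf{y}\in\{\pm1\}^n}\;\min_{\mathbf{w},b,\boldsymbol{\xi}}\;
  \tfrac{1}{2}\|\mathbf{w}\|^2 + C\sum_{i=1}^{n}\xi_i
\quad\text{s.t.}\quad
  y_i\bigl(\mathbf{w}^\top\varphi(\mathbf{x}_i)+b\bigr)\ge 1-\xi_i,\;
  \xi_i\ge 0,\; i=1,\dots,n,
\qquad
  -\ell \le \textstyle\sum_{i=1}^{n} y_i \le \ell .
```

The slack variables ξ_i are the margin errors, and the last constraint is the balance constraint, which rules out the trivial solution of assigning all points to one cluster.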

Maximum Margin Clustering [Xu et al., NIPS'05]: the dual problem
The dual is a mixed-integer program, intractable for large-scale datasets.
Key: some kind of relaxation may be helpful.
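Dropping the bias term for brevity (an assumption of this sketch), taking the dual of the inner SVM turns the problem into:

```latex
\min_{\mathbf{y}\in\mathcal{B}}\;
\max_{\mathbf{0}\le\boldsymbol{\alpha}\le C\mathbf{1}}\;
  \boldsymbol{\alpha}^\top\mathbf{1}
  - \tfrac{1}{2}\,\boldsymbol{\alpha}^\top
    \bigl(\mathbf{K}\odot\mathbf{y}\mathbf{y}^\top\bigr)\boldsymbol{\alpha},
```

where B denotes the set of balanced label vectors, K is the kernel matrix, and ⊙ is the elementwise product. The discrete variable y is what makes this a mixed-integer program.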

Related work
MMC with SDP relaxation [Xu et al., NIPS'05]
– convex, state-of-the-art performance
– expensive: worst-case O(n^6.5) time
Generalized MMC (GMMC) [Valizadegan & Jin, NIPS'07]
– a smaller SDP problem that speeds up MMC by 100 times
– still expensive: cannot handle even medium-sized datasets
Efficient algorithms [Zhang et al., ICML'07] [Zhao et al., SDM'08]
– much more scalable than the global methods
– non-convex: may get stuck in local minima
Goal: a convex method that also scales to large datasets.


Intuition
[Figure: data points with unknown labels '?' are assigned candidate +1/−1 labelings; the hard combinatorial SVM over all labelings becomes an efficient combination.]
– multiple label-kernel learning
– yy' is the "label-kernel"

Flow Chart of LG-MMC
LG-MMC transforms the MMC problem into multiple label-kernel learning via a minimax relaxation.
Cutting-plane algorithm:
– multiple label-kernel learning
– finding the most violated y
LG-MMC achieves a tighter relaxation than the SDP relaxation [Xu et al., NIPS'05].

LG-MMC: minimax relaxation of the MMC problem
– Consider interchanging the order of the minimization over y and the maximization over the dual variables, leading to the LG-MMC problem.
– By the minimax inequality, the optimal objective of LG-MMC is a lower bound on that of the MMC problem.
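Writing the dual objective as g(y, α) (notation assumed for this sketch), the relaxation is simply the minimax inequality, i.e., a max-min never exceeds the corresponding min-max:

```latex
\underbrace{\max_{\boldsymbol{\alpha}\in\mathcal{A}}\;\min_{\mathbf{y}\in\mathcal{B}}\;
  g(\mathbf{y},\boldsymbol{\alpha})}_{\text{LG-MMC}}
\;\le\;
\underbrace{\min_{\mathbf{y}\in\mathcal{B}}\;\max_{\boldsymbol{\alpha}\in\mathcal{A}}\;
  g(\mathbf{y},\boldsymbol{\alpha})}_{\text{MMC}},
```

with g(y, α) = α'1 − ½ α'(K ⊙ yy')α and A = {α : 0 ≤ α ≤ C1}.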

LG-MMC: multiple label-kernel learning
First, LG-MMC can be rewritten with one constraint per candidate labeling. For the inner optimization subproblem, associate a dual variable with each such constraint; its Lagrangian can then be obtained.

LG-MMC: multiple label-kernel learning (cont.)
Setting the Lagrangian's derivative to zero, and replacing the inner subproblem with its dual over the simplex of these dual variables, one obtains a new formulation. By analogy with learning over a single kernel, this formulation can be regarded as multiple label-kernel learning.
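Enumerating the candidate labelings as y_1, y_2, …, with a dual variable μ_t per labeling (assumed names), the resulting problem has the familiar MKL shape:

```latex
\max_{\boldsymbol{\alpha}\in\mathcal{A}}\;\min_{\boldsymbol{\mu}\in\Delta}\;
  \boldsymbol{\alpha}^\top\mathbf{1}
  - \tfrac{1}{2}\sum_{t}\mu_t\,
    \boldsymbol{\alpha}^\top\bigl(\mathbf{K}\odot\mathbf{y}_t\mathbf{y}_t^\top\bigr)\boldsymbol{\alpha},
\qquad
\Delta=\Bigl\{\boldsymbol{\mu}:\mu_t\ge 0,\;\textstyle\sum_t\mu_t=1\Bigr\},
```

so each K ⊙ y_t y_t' acts as a base "label-kernel", and μ plays the role of the kernel weights in MKL.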

Cutting-Plane Algorithm
Problem: the number of possible label assignments is exponential
– the set of base kernels is therefore also exponential in size
– direct multiple kernel learning (MKL) over all of them is computationally intractable
Observation:
– only a subset of these constraints is active at optimality
– this motivates a cutting-plane method

Cutting-Plane Algorithm
1. Initialize: find the most violated y and set the working set of constraints to {y, −y}.
2. Run MKL for the subset of kernel matrices selected by the working set.
3. Find the most violated y and add it to the working set.
4. Repeat steps 2–3 until convergence.
How steps 2 and 3 are carried out is described next.
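The loop above can be sketched as follows (illustrative code, not the authors' implementation; `solve_mkl` and `find_most_violated` are hypothetical stand-ins for steps 2 and 3):

```python
def lg_mmc(solve_mkl, find_most_violated, y0, tol=1e-6, max_iter=50):
    """Cutting-plane loop of LG-MMC (sketch).

    solve_mkl(working_set)    -> (objective, alpha, mu)   # step 2
    find_most_violated(alpha) -> candidate labeling y     # step 3
    """
    # Step 1: initialize the working set with {y, -y}.
    working_set = [tuple(y0), tuple(-v for v in y0)]
    prev_obj = float("inf")
    alpha = mu = None
    for _ in range(max_iter):
        # Step 2: MKL over the label-kernels of the current working set.
        obj, alpha, mu = solve_mkl(working_set)
        # Step 3: generate a new candidate labeling.
        y_new = tuple(find_most_violated(alpha))
        # Step 4: stop when no new constraint appears or progress stalls.
        if y_new in working_set or prev_obj - obj < tol:
            break
        working_set.append(y_new)
        prev_obj = obj
    return working_set, alpha, mu
```

The loop terminates because the working set can only grow and each round either adds a new labeling or detects convergence.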

Cutting-Plane Algorithm, step 2: multiple label-kernel learning
– Suppose the current working set of labelings is given; each labeling induces a base kernel matrix via its feature map.
SimpleMKL:
1. Fix the kernel weights and solve the SVM dual.
2. Fix the SVM dual variables and update the kernel weights by a gradient method.
3. Iterate until convergence.

Cutting-Plane Algorithm, step 3: finding the most violated y
Finding the most violated y is a concave QP, which is hard in general.
Observation:
– the cutting-plane algorithm only requires adding a violated constraint at each iteration, not necessarily the most violated one
– so the L2-norm in the violation measure can be replaced with the infinity-norm

Cutting-Plane Algorithm, step 3: finding the most violated y (cont.)
With the infinity-norm, each subproblem reduces to a simple form that can be solved exactly:
– sort the relevant coefficients
– assign labels greedily subject to the balance constraint
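Under the infinity-norm, each subproblem amounts to maximizing y'c for some fixed coefficient vector, subject to the balance constraint. A minimal sketch (the names `c` and `ell` are assumptions of this illustration): start from the unconstrained optimum y_i = sign(c_i), then flip the cheapest majority-side entries until the constraint holds.

```python
def most_violated_label(c, ell):
    """Maximize sum_i y_i * c_i over y in {-1,+1}^n with |sum_i y_i| <= ell.

    Start from y_i = sign(c_i) (the unconstrained optimum); while the
    balance constraint is violated, flip the entry with the smallest |c_i|
    on the majority side. Each flip changes sum(y) by 2 at a cost of 2|c_i|,
    so flipping cheapest-first is optimal. Assumes ell is feasible
    (e.g. ell >= n % 2).
    """
    y = [1 if ci >= 0 else -1 for ci in c]
    order = sorted(range(len(c)), key=lambda i: abs(c[i]))  # cheapest flips first
    k = 0
    while abs(sum(y)) > ell:
        s = 1 if sum(y) > 0 else -1          # current majority sign
        while y[order[k]] != s:              # skip minority-side entries
            k += 1
        y[order[k]] = -s                     # flip the cheapest majority entry
        k += 1
    return y
```

Since |y'c| = |(−y)'c|, maximizing y'c also handles the absolute-value form of the subproblem: both y and −y attain the same violation.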

LG-MMC achieves a tighter relaxation
Consider the set of all feasible label matrices yy' and two convex relaxations of it, one of which is its convex hull.

LG-MMC achieves a tighter relaxation (cont.)
Define the feasible set of label matrices for each formulation. One then finds that:
– maximum margin clustering optimizes over the original (discrete) set of label matrices
– the LG-MMC problem optimizes over the convex hull of that set
– the SDP-based MMC problem optimizes over a larger SDP feasible set

LG-MMC achieves a tighter relaxation (cont.)
The LG-MMC feasible set is the convex hull of the set of feasible label matrices, i.e., the smallest convex set containing it.
– Hence LG-MMC gives the tightest convex relaxation.
It can be shown that the SDP feasible set is more relaxed than this convex hull.
– SDP-based MMC is therefore a looser relaxation than the proposed formulation.
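In symbols, a rough sketch of the three sets (the SDP set below omits the balance-related constraints of the actual formulation):

```latex
\mathcal{M}_0=\{\mathbf{y}\mathbf{y}^\top:\mathbf{y}\in\mathcal{B}\},\qquad
\mathcal{M}_1=\operatorname{conv}(\mathcal{M}_0),\qquad
\mathcal{M}_2=\{\mathbf{M}\succeq 0:\operatorname{diag}(\mathbf{M})=\mathbf{1}\},
```

with M_0 ⊂ M_1 ⊆ M_2. Any convex relaxation of the discrete set M_0 must contain its convex hull M_1, so optimizing over M_1 (LG-MMC) is the tightest convex relaxation, while optimizing over the larger M_2 (SDP-based MMC) is looser.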


Experiments
Datasets: 17 UCI datasets and the MNIST dataset.
Implementation: MATLAB 7.6.
Evaluation: misclassification error.

Compared Methods
k-means
– one of the most mature baseline methods
Normalized Cut [Shi & Malik, PAMI'00]
– the first spectral clustering method
GMMC [Valizadegan & Jin, NIPS'07]
– one of the most efficient global methods for MMC
IterSVR [Zhang et al., ICML'07]
– an efficient algorithm for MMC
CPMMC [Zhao et al., SDM'08]
– another state-of-the-art efficient method for MMC

Clustering Error
[Table omitted in this transcript: per-dataset clustering errors of the compared methods.]

Win-Tie-Loss
Global methods vs. local methods: the global methods are better.
– win/tie/loss of global vs. local: 15/2/2
LG-MMC vs. GMMC: LG-MMC is competitive with GMMC.
– win/tie/loss of LG-MMC vs. GMMC: 7/0/3

Speed
LG-MMC is about 10 times faster than GMMC. In general, however, local methods are still faster than global methods.


Conclusion
Main contributions:
– We propose a scalable and globally convex optimization method for maximum margin clustering.
– To the best of our knowledge, this is the first use of a label-generation strategy for clustering; the strategy may also be useful in other domains.
Future work:
– We will extend the proposed approach to semi-supervised learning.
Thank you
