Robust Subspace Discovery: Low-rank and Max-margin Approaches
Xiang Bai
Joint work with Xinggang Wang, Zhengdong Zhang, Zhuowen Tu, Yi Ma and Wenyu Liu

Outline
– Low-rank Approach: subspace discovery via low-rank optimization (in ACCV 2012 & submitted to Neural Computation)
– Max-margin Approach: max-margin multi-instance learning for dictionary learning (in ICML 2013)

Introduction
Given a set of images from the same object category, we want to automatically discover the objects.
– Low-rank approach: assume the objects lie on a low-rank subspace.
– Max-margin approach: learn discriminative functions to separate objects from background.
Applications:
– Object detection in a weakly supervised setting
– Learning mid-level image representations
– Codebook learning

Low-Rank Approach

Notations
In a MIL framework (see the sketch below):
– Instances in a bag (given)
– Instance labels (unknown)
– All bags are positive
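A minimal reconstruction of these three items in standard MIL notation; the original formulas were slide images, so the symbols below are our assumptions:

```latex
% Standard MIL notation (reconstruction; symbols assumed, not verbatim).
\begin{align*}
\text{Instances in bag } i \text{ (given):} \quad & X_i = \{x_{i1}, \dots, x_{in_i}\}, \quad x_{ij} \in \mathbb{R}^{d} \\
\text{Instance labels (unknown):} \quad & z_{ij} \in \{0, 1\} \\
\text{All bags are positive:} \quad & \sum_{j=1}^{n_i} z_{ij} \ge 1 \quad \text{for every bag } i
\end{align*}
```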

Assumption
[Figure: common objects vs. random patches — the common objects form the low-rank part of the data, while the random patches do not.]

Subspace discovery via low-rank optimization
The objective function (see the sketch below):
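A plausible form of this objective, following the RPCA formulation of Candès et al. (2011) with an added instance-selection indicator z; the exact constraints and weights in the paper may differ:

```latex
% Select one instance per bag (encoded by z) so that the selected instances
% decompose into a low-rank part D plus a sparse error E.
\min_{D,\;E,\;\mathbf{z}} \; \|D\|_{*} + \lambda \|E\|_{1}
\qquad \text{s.t.} \quad X_{\mathbf{z}} = D + E
```

Here X_z stacks the selected instances (one per bag) as columns, ||·||_* is the nuclear norm, and ||·||_1 the entrywise L1 norm.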

A Naive Iterative Solution (NIM), built on Robust PCA [1] (a hedged sketch follows below):
[1] E. Candès, X. Li, Y. Ma, and J. Wright. Robust principal component analysis? Journal of the ACM, 58(3), May 2011.
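A minimal sketch of one plausible naive iteration, assuming it alternates between running RPCA [1] on the currently selected instances and re-selecting, in each bag, the instance closest to the recovered subspace; all function and parameter names (`rpca_ialm`, `naive_iterative_selection`, `lam`, `rho`) are ours, not the authors'.

```python
import numpy as np

def rpca_ialm(X, lam=None, rho=1.5, tol=1e-7, max_iter=500):
    """Robust PCA via inexact ALM: X = D (low-rank) + E (sparse)."""
    m, n = X.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    norm_X = np.linalg.norm(X, 'fro')
    # Standard dual-variable and penalty initialization for inexact ALM.
    Y = X / max(np.linalg.norm(X, 2), np.abs(X).max() / lam)
    D, E = np.zeros_like(X), np.zeros_like(X)
    mu = 1.25 / np.linalg.norm(X, 2)
    mu_bar = mu * 1e7
    for _ in range(max_iter):
        # D-update: singular-value thresholding of (X - E + Y/mu).
        U, s, Vt = np.linalg.svd(X - E + Y / mu, full_matrices=False)
        D = (U * np.maximum(s - 1.0 / mu, 0.0)) @ Vt
        # E-update: entrywise soft-thresholding.
        T = X - D + Y / mu
        E = np.sign(T) * np.maximum(np.abs(T) - lam / mu, 0.0)
        R = X - D - E
        Y = Y + mu * R
        mu = min(rho * mu, mu_bar)
        if np.linalg.norm(R, 'fro') / norm_X < tol:
            break
    return D, E

def naive_iterative_selection(bags, n_iter=10, seed=0):
    """bags: list of (d, n_i) arrays; returns one selected index per bag."""
    rng = np.random.default_rng(seed)
    sel = [int(rng.integers(b.shape[1])) for b in bags]  # random init
    for _ in range(n_iter):
        X = np.column_stack([b[:, j] for b, j in zip(bags, sel)])
        D, _ = rpca_ialm(X)
        # Re-select: in each bag, pick the instance with the smallest
        # residual off the recovered low-rank subspace.
        Ud, sd, _ = np.linalg.svd(D, full_matrices=False)
        U = Ud[:, sd > 1e-8]
        P = U @ U.T  # projector onto the recovered subspace
        sel = [int(np.argmin(np.linalg.norm(b - P @ b, axis=0))) for b in bags]
    return sel
```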

Subspace discovery via low-rank optimization
Solution via inexact ALM (IALM), also known as the Alternating Direction Method of Multipliers (ADMM). The augmented Lagrangian function (see the sketch below):
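A reconstruction of the augmented Lagrangian for the objective above, following the standard RPCA convention with multiplier Y and penalty μ (the original formula was a slide image):

```latex
% Augmented Lagrangian of  min ||D||_* + lambda ||E||_1  s.t.  X_z = D + E.
\mathcal{L}_{\mu}(D, E, \mathbf{z}, Y) =
\|D\|_{*} + \lambda \|E\|_{1}
+ \langle Y,\; X_{\mathbf{z}} - D - E \rangle
+ \frac{\mu}{2}\, \|X_{\mathbf{z}} - D - E\|_{F}^{2}
```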

Subspace discovery via low-rank optimization
Solution via inexact ALM (IALM): the alternating updates are sketched below.
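A sketch of the alternating updates, assuming the standard inexact-ALM scheme for RPCA plus an instance re-selection step (the z-update shown here is our assumption); D_τ denotes singular-value thresholding and S_τ entrywise soft-thresholding:

```latex
\begin{align*}
D_{k+1} &= \mathcal{D}_{1/\mu_k}\big(X_{\mathbf{z}_k} - E_k + Y_k/\mu_k\big)
  && \text{(singular-value thresholding)} \\
E_{k+1} &= \mathcal{S}_{\lambda/\mu_k}\big(X_{\mathbf{z}_k} - D_{k+1} + Y_k/\mu_k\big)
  && \text{(entrywise shrinkage)} \\
\mathbf{z}_{k+1} &= \arg\min_{\mathbf{z}}\;
  \big\|X_{\mathbf{z}} - D_{k+1} - E_{k+1} + Y_k/\mu_k\big\|_F^{2}
  && \text{(pick one instance per bag)} \\
Y_{k+1} &= Y_k + \mu_k \big(X_{\mathbf{z}_{k+1}} - D_{k+1} - E_{k+1}\big),
  \qquad \mu_{k+1} = \rho\,\mu_k
\end{align*}
```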

Subspace discovery via low-rank optimization
Robust subspace learning simulation. We generate synthetic data with 50 bags; each bag contains 10 instances, of which 1 is positive and 9 are negative; the instance dimension is d = 500. (A data-generation sketch follows below.)
[Figure: working range of IALM and NIM — precision of the recovered indicators as the sparsity level and the rank vary, for IALM (left) and NIM (right).]
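For concreteness, one plausible recipe for such synthetic data (the paper's exact generation procedure may differ): positives drawn from a shared low-rank subspace with sparse corruption, negatives purely random; `make_bags` and its parameters are our naming.

```python
import numpy as np

def make_bags(n_bags=50, bag_size=10, d=500, rank=5, sparsity=0.05, seed=0):
    """Synthetic MIL bags: 1 low-rank positive + 9 random negatives per bag.

    Assumed recipe (not necessarily the paper's): positives lie on a shared
    rank-`rank` subspace, corrupted on a `sparsity` fraction of entries.
    """
    rng = np.random.default_rng(seed)
    basis = rng.standard_normal((d, rank))           # shared subspace
    bags, positives = [], []
    for _ in range(n_bags):
        pos = basis @ rng.standard_normal(rank)      # clean positive instance
        mask = rng.random(d) < sparsity              # sparse corruption support
        pos[mask] += rng.standard_normal(int(mask.sum()))
        neg = rng.standard_normal((d, bag_size - 1)) # random negatives
        bag = np.column_stack([pos[:, None], neg])
        perm = rng.permutation(bag_size)             # hide the positive's index
        bags.append(bag[:, perm])
        positives.append(int(np.where(perm == 0)[0][0]))
    return bags, positives
```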

Subspace discovery via low-rank optimization
Aligned face discovery among random image patches.
[Figure: a bag containing a face image and random patches, with its low-rank part and error part.]
The 165 face images from the Yale dataset are placed in 165 bags; besides the face image, each bag contains 9 image patches from the PASCAL dataset. Accuracies of IALM and NIM (randomly initialized) are 99.5±0.5% and 77.8±3.5%, respectively.

Subspace discovery via low-rank optimization
Object discovery on an FDDB subset containing 440 face images from the FDDB dataset [1].

Method                    Average precision
Saliency detection (SD)   0.148
bMCL [2]                  0.619
NIM-SD                    0.671
NIM-RAND                  0.669
IALM                      0.745

[1] V. Jain and E. Learned-Miller. FDDB: A benchmark for face detection in unconstrained settings. Technical Report UM-CS, University of Massachusetts, Amherst.
[2] J. Zhu, J. Wu, Y. Wei, E. Chang, and Z. Tu. Unsupervised object class discovery via saliency-guided multiple class learning. In IEEE Conference on Computer Vision and Pattern Recognition.

Subspace discovery via low-rank optimization
Object discovery on the FDDB subset.
[Figure: face discovery results — windows with the maximal score given by SD (Feng et al., 2011) (in cyan), bMCL (Zhu et al., 2012) (in green), NIM-SD (in blue) and IALM (in red) on the FDDB subset (Jain & Learned-Miller, 2010).]

Subspace discovery via low-rank optimization
PASCAL dataset: object discovery performance evaluated by CorLoc on the PASCAL 2006 and 2007 datasets. We follow the setting of Deselaers et al. (2012).

Subspace discovery via low-rank optimization
PASCAL dataset. Red rectangles: object discovery results of IALM (from top to bottom: aeroplane, bicycle, bus, motorbike, potted plants and tv-monitors) on the challenging PASCAL data. Green rectangles: annotated object ground truth.

Subspace discovery via low-rank optimization
Tumor discovery

Subspace discovery via low-rank optimization
Multiple instance learning

Conclusions for the Low-rank Approach
– It is robust to sparse errors and overwhelming outliers.
– It admits a convex formulation and is insensitive to the algorithm's initialization.
– We use the IALM (ADMM) algorithm to solve a combinatorial problem.
– In real applications it is effective, but it does not reach state-of-the-art performance.

Max-margin Approach

Dictionary learning
Literature:
– Sparse coding (Olshausen and Field, 1996)
– Latent Dirichlet Allocation (Blei et al., 2003)
– Bag of words (Blei et al., 2003)
– Deep Belief Nets (Hinton et al., 2006)
Dictionary learning is an important component for building effective and efficient representations, and it is effective for many machine learning problems:
1. Explicit representations are often enforced;
2. dimensionality reduction is performed through quantization;
3. it facilitates hierarchical representations;
4. spatial configuration can also be imposed.

Unsupervised and supervised dictionary learning
– Unsupervised codebook learning: k-means (Duda et al., 2000).
– Supervised codebook learning: ERC-Forests (Moosmann et al., 2008).
[Figure: k-means vs. ERC-forest for dictionary learning on face, flower, and building images.]
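A minimal sketch of the unsupervised route: learn a k-means codebook over local descriptors and encode an image as a codeword histogram; the helper names and the use of scikit-learn are our choices, not from the slides.

```python
import numpy as np
from sklearn.cluster import KMeans

def learn_codebook(descriptors, n_codewords=256, seed=0):
    """Unsupervised codebook: cluster local descriptors with k-means."""
    km = KMeans(n_clusters=n_codewords, n_init=10, random_state=seed)
    km.fit(descriptors)                  # descriptors: (n_patches, dim)
    return km

def encode_image(km, image_descriptors):
    """Bag-of-words encoding: histogram of nearest-codeword assignments."""
    words = km.predict(image_descriptors)
    hist = np.bincount(words, minlength=km.n_clusters).astype(float)
    return hist / max(hist.sum(), 1.0)   # L1-normalized histogram
```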

Weakly-supervised dictionary learning
[Figure: building and flower example images with their combined response maps, learned against negatives.]

Multiple-instance max-margin dictionary learning (MMDL)
– Exploring semantic information using multiple-instance learning (MIL) (Dietterich et al., 1997).
– Assuming positive instances are drawn from multiple clusters.
– Directly maximizing margins between different clusters.
– Using the cluster classifiers as the codebook.

Formulation
– MIL notations
– Generalized code (G-code)
– Latent variable (indicator) for each instance
(A reconstruction of the notation is sketched below.)
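A hedged reconstruction of this notation; the original formulas were slide images, so the symbols below follow common latent-SVM conventions and are our assumptions:

```latex
\begin{align*}
\text{MIL notation:} \quad & \text{bags } X_i = \{x_{i1}, \dots, x_{in_i}\}
   \text{ with bag labels } y_i \in \{-1, +1\} \\
\text{G-code:} \quad & K \text{ cluster classifiers } W = [w_1, \dots, w_K],
   \text{ plus a background model } w_0 \\
\text{Latent variable:} \quad & z_{ij} \in \{0, 1, \dots, K\}
   \text{ assigns } x_{ij} \text{ to a cluster } (0 = \text{background})
\end{align*}
```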

Formulation
The objective function of MMDL combines a regularization term and a loss function based on the Crammer & Singer multi-class SVM (see the sketch below).
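One plausible reconstruction, reading the slide as a latent multi-class SVM whose loss is the Crammer & Singer hinge; the exact objective in the ICML 2013 paper may differ:

```latex
% Regularization term + Crammer & Singer multi-class loss with latent z.
\min_{W,\,\mathbf{z}} \;
\underbrace{\frac{\lambda}{2} \sum_{k} \|w_k\|^{2}}_{\text{regularization term}}
\; + \;
\underbrace{\sum_{i,j} \Big[ \max_{k} \big( w_k^{\top} x_{ij}
   + \mathbb{1}[k \neq z_{ij}] \big) - w_{z_{ij}}^{\top} x_{ij} \Big]}_{\text{Crammer \& Singer loss}}
```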

Learning Strategies

Image Representation
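A minimal sketch of one plausible image encoding, assuming each learned G-code classifier's response is max-pooled over an image's patch descriptors; the pooling scheme and names are our assumptions, and the paper may also add spatial pooling.

```python
import numpy as np

def encode_with_gcode(patch_features, W):
    """Encode an image by max-pooling G-code classifier responses.

    patch_features: (n_patches, dim) descriptors from one image.
    W: (n_codes, dim) learned G-code classifier weights.
    Returns an n_codes-dimensional image representation.
    """
    responses = patch_features @ W.T      # SVM scores, (n_patches, n_codes)
    return responses.max(axis=0)          # strongest response per code
```

With a compact codebook (e.g., tens of codes), this yields a very low-dimensional image descriptor, consistent with the "Ours (88 codes)" entries in the experiments.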

Experiments
Features: LBP, HOG, encoded SIFT, LAB histogram, GIST.
[Figure: average classification accuracy of different methods on 15 Scene, over different numbers of codewords and different feature types.]

Experiments

15 Scene:
Method                            Accuracy (%)   #codewords
Object Bank (Li et al., 2010)     —              —
Lazebnik et al.                   —              —
Yang et al.                       —              —
Kernel Desc. (Bo et al., 2010)    —              —
Ours                              —              —

UIUC Sports dataset:
Method                            Accuracy (%)
Li & Fei-Fei                      —
Wu & Rehg                         —
Object Bank (Li et al., 2010)     76.3
SPMSM (Kwitt et al., 2012)        83.0
LPR (Sadeghi & Tappen, 2012)      86.25
Ours (88 codes)                   88.47

MIT 67 Indoor:
Method                                  Accuracy (%)
ROI+GIST (Quattoni & Torralba, 2009)    26.5
RBOW (Parizi et al., 2012)              37.93
Disc. Patches (Singh et al., 2012)      38.1
SPMSM (Kwitt et al., 2012)              44.0
LPR (Sadeghi & Tappen, 2012)            44.84
Ours (737 codes)                        50.15

Experiments
MIT 67 Indoor: some meaningful clusters learned by MMDL for different categories.
[Figure: each row illustrates a cluster model; red rectangles show positions where the G-code classifier's SVM function value is greater than zero.]

Conclusions of the Max-margin Approach
– MMDL can naturally learn a metric to take advantage of multiple features.
– The max-margin formulation leads to a very compact code for image representation, with state-of-the-art image classification performance.
– The MIL strategy can learn a codebook that contains rich semantic information.

Thanks!