Bilinear Classifiers for Visual Recognition

Slides:

Advertisements

Similar presentations

Rich feature Hierarchies for Accurate object detection and semantic segmentation Ross Girshick, Jeff Donahue, Trevor Darrell, Jitandra Malik (UC Berkeley)

Advertisements

Image classification Given the bag-of-features representations of images from different classes, how do we learn a model for distinguishing them?

Object recognition and scene “understanding”

A generic model to compose vision modules for holistic scene understanding Adarsh Kowdle *, Congcong Li *, Ashutosh Saxena, and Tsuhan Chen Cornell University,

Ľubor Ladický1 Phil Torr2 Andrew Zisserman1

Efficient Large-Scale Structured Learning

Lecture 31: Modern object recognition

Many slides based on P. FelzenszwalbP. Felzenszwalb General object detection with deformable part-based models.

Steerable Part Models Hamed Pirsiavash and Deva Ramanan

Enhancing Exemplar SVMs using Part Level Transfer Regularization 1.

Fast intersection kernel SVMs for Realtime Object Detection

More sliding window detection: Discriminative part-based models Many slides based on P. FelzenszwalbP. Felzenszwalb.

DISCRIMINATIVE DECORELATION FOR CLUSTERING AND CLASSIFICATION ECCV 12 Bharath Hariharan, Jitandra Malik, and Deva Ramanan.

1 Transfer Learning Algorithms for Image Classification Ariadna Quattoni MIT, CSAIL Advisors: Michael Collins Trevor Darrell.

吳家宇吳明翰 Face Detection Based on Template Matching and 2DPCA Algorithm 2009/01/14.

Group Norm for Learning Latent Structural SVMs Overview Daozheng Chen (UMD, College Park), Dhruv Batra (TTI Chicago), Bill Freeman (MIT), Micah K. Johnson.

Face Detection and Recognition

Generic object detection with deformable part-based models

Linear Algebra and Image Processing

Hamed Pirsiavash, Deva Ramanan, Charless Fowlkes

Multiclass object recognition

Enhancing Tensor Subspace Learning by Element Rearrangement

Overcoming Dataset Bias: An Unsupervised Domain Adaptation Approach Boqing Gong University of Southern California Joint work with Fei Sha and Kristen Grauman.

1 Preview At least two views are required to access the depth of a scene point and in turn to reconstruct scene structure Multiple views can be obtained.

Week 9 Presented by Christina Peterson. Recognition Accuracies on UCF Sports data set Method Accuracy (%)DivingGolfingKickingLiftingRidingRunningSkating.

Lecture 31: Modern recognition CS4670 / 5670: Computer Vision Noah Snavely.

Classifiers Given a feature representation for images, how do we learn a model for distinguishing features from different classes? Zebra Non-zebra Decision.

A Two-level Pose Estimation Framework Using Majority Voting of Gabor Wavelets and Bunch Graph Analysis J. Wu, J. M. Pedersen, D. Putthividhya, D. Norgaard,

Deformable Part Models (DPM) Felzenswalb, Girshick, McAllester & Ramanan (2010) Slides drawn from a tutorial By R. Girshick AP 12% 27% 36% 45% 49% 2005.

Sharing Features Between Objects and Their Attributes Sung Ju Hwang 1, Fei Sha 2 and Kristen Grauman 1 1 University of Texas at Austin, 2 University of.

Geodesic Flow Kernel for Unsupervised Domain Adaptation Boqing Gong University of Southern California Joint work with Yuan Shi, Fei Sha, and Kristen Grauman.

Visual Categorization With Bags of Keypoints Original Authors: G. Csurka, C.R. Dance, L. Fan, J. Willamowski, C. Bray ECCV Workshop on Statistical Learning.

Object detection, deep learning, and R-CNNs

Separating Style and Content with Bilinear Models Joshua B. Tenenbaum, William T. Freeman Computer Examples Barun Singh 25 Feb, 2002.

First-Person Activity Recognition: What Are They Doing to Me? M. S. Ryoo and Larry Matthies Jet Propulsion Laboratory, California Institute of Technology,

Dd Generalized Optimal Kernel-based Ensemble Learning for HS Classification Problems Generalized Optimal Kernel-based Ensemble Learning for HS Classification.

Convolutional Restricted Boltzmann Machines for Feature Learning Mohammad Norouzi Advisor: Dr. Greg Mori Simon Fraser University 27 Nov

More sliding window detection: Discriminative part-based models

Privacy-Preserving Support Vector Machines via Random Kernels Olvi Mangasarian UW Madison & UCSD La Jolla Edward Wild UW Madison March 3, 2016 TexPoint.

Computer Vision Lecture 7 Classifiers. Computer Vision, Lecture 6 Oleh Tretiak © 2005Slide 1 This Lecture Bayesian decision theory (22.1, 22.2) –General.

1 Bilinear Classifiers for Visual Recognition Computational Vision Lab. University of California Irvine To be presented in NIPS 2009 Hamed Pirsiavash Deva.

Image Retrieval and Ranking using L.S.I and Cross View Learning Sumit Kumar Vivek Gupta

Combining Models Foundations of Algorithms and Machine Learning (CS60020), IIT KGP, 2017: Indrajit Bhattacharya.

Cascade for Fast Detection

Object detection with deformable part-based models

Performance of Computer Vision

Boosting and Additive Trees (2)

Lecture 24: Convolutional neural networks

CS 2750: Machine Learning Dimensionality Reduction

Face Recognition and Feature Subspaces

COMP61011 : Machine Learning Ensemble Models

Intelligent Information System Lab

Object Localization Goal: detect the location of an object within an image Fully supervised: Training data labeled with object category and ground truth.

Model-Based Organ Segmentation: Recent Methods

Outline Multilinear Analysis

Object detection as supervised classification

R-CNN region By Ilia Iofedov 11/11/2018 BGU, DNN course 2016.

Group Norm for Learning Latent Structural SVMs

By: Kevin Yu Ph.D. in Computer Engineering

Computer Vision James Hays

Convolutional Neural Networks for Visual Tracking

Walter J. Scheirer, Samuel E. Anthony, Ken Nakayama & David D. Cox

Object Detection + Deep Learning

Weakly Supervised Action Recognition

Separating Style and Content with Bilinear Models Joshua B

Support Vector Machine I

Artifacts of Adversarial Examples

Separating Style and Content with Bilinear Models Joshua B

Face Recognition: A Convolutional Neural Network Approach

Modeling IDS using hybrid intelligent systems

Presentation transcript:

Bilinear Classifiers for Visual Recognition Hamed Pirsiavash, Deva Ramanan, Charless Fowlkes Computational Vision Lab, University of California Irvine, CA USA Main ideas: -Treat image features and templates as matrices or tensors rather than simply “vectorizing” them -Learn a low rank template by performing alternating minimization using standard linear SVM solver -Share a subset of parameters (subspace) across different problems (transfer learning) Linear SVM: For a given set of training pairs: Experiment 1: Pedestrian detection on INRIA motion dataset f X n ; y g m i n W L ( ) = 1 2 T r + C P a x ; ¡ y X n s = 1 4 £ 6 , f 8 d 5 Regularizer Constraints Was Linear: f ( X ) = T r W > Detection function: Now it is Bilinear: f ( X ) = T r W s > It is biconvex so we use coordinate descent with off-the-shelf linear SVM solver in the loop Learning: Change of basis: ~ W f = A 1 2 T , X n ¡ s m i n ~ W f L ( ; s ) = 1 2 T r + C P a x ¡ y X Experiment 2: Human action recognition on UCF sports dataset A window on sample image Sample template Advantages : -Rank constraint provides natural regularization. -10X runtime speedup with no loss in performance. (Fewer number of convolutions in detection) -Learn shared subspaces across multiple classes or training sets to allow for transfer learning. (e.g., share subspace between “human detector” and “cat detector”) Features Linear (0.518) (Not always feasible) PCA on features (0.444) Bilinear (0.648) 1 x = 2 6 4 . 3 7 5 n y f £ 1 Features Feature vector Vectorizing (traditional) n f n y n f Joint minimization: Our method X = 2 6 4 : . 3 7 5 n y x £ f m i n ~ W f P = 1 L ( ; s ) Feature matrix n x PCA to initialize Experiment 3: Sharing subspace on UCF sports dataset (Using only two training samples for each class) W = s T f Rank restriction: d · m i n ( s ; f ) Train Train Train where n s = y x W 1 s W 2 s Iteration 1 Refined at Iteration 2 W m s n f d Average classification rate 2 6 4 : . 3 7 5 n s £ f = d h i n f Train Train Train Train Coordinate decent iteration 1 Coordinate decent iteration 2 Independent bilinear (C=0.01) 0.222 0.289 Joint bilinear (C=0.1) 0.269 0.356 W f W 1 f W 2 f W m f Related work : -Tenenbaum & Freeman’s bilinear model (Neural Computation’00) -Wolf et al’s low-rank SVM paper (CVPR’07) -Ando and Zhang's transfer learning paper (JMLR'05) W T f W W s Low dim template Subspace W = s T f