Efficient Subwindow Search: A Branch and Bound Framework for Object Localization (PAMI '09). Beyond Sliding Windows: Object Localization by Efficient Subwindow Search.


Efficient Subwindow Search: A Branch and Bound Framework for Object Localization (PAMI '09)
Beyond Sliding Windows: Object Localization by Efficient Subwindow Search (best paper prize at CVPR 2008)

Motivation
Localize the object without exhaustive search.
Observation: often, only a small portion of the image contains the object of interest.
Find a global optimum in a huge search space by branching and bounding.
Applications: object detection and retrieval.

SVM: Localization problem
An SVM answers "yes" or "no" to whether the object belongs to the classifier's object class, and also returns a confidence score.
It cannot say where the object is located in the image or at what scale.

SVM Object Localization Methods
Exhaustive search: for an n x n image the complexity is O(n^4).
Sliding window approach.
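To make the O(n^4) figure concrete, here is a minimal sketch of exhaustive search over all axis-aligned boxes. It assumes the quality of a box is the sum of per-pixel scores inside it (a hypothetical `score_map`, not something the slides define) and uses an integral image so that each box costs O(1) to score. The four nested loops are what produce the O(n^4) box count.

```python
import numpy as np

def integral_image(score_map):
    """Cumulative sums padded with a zero row and column, so any box sum costs O(1)."""
    ii = np.zeros((score_map.shape[0] + 1, score_map.shape[1] + 1))
    ii[1:, 1:] = score_map.cumsum(axis=0).cumsum(axis=1)
    return ii

def box_sum(ii, top, bottom, left, right):
    """Sum of score_map[top:bottom+1, left:right+1] (inclusive coordinates)."""
    return float(ii[bottom + 1, right + 1] - ii[top, right + 1]
                 - ii[bottom + 1, left] + ii[top, left])

def exhaustive_search(score_map):
    """Score every axis-aligned box; returns (best_score, best_box, boxes_evaluated)."""
    h, w = score_map.shape
    ii = integral_image(score_map)
    best_score, best_box, evaluated = -np.inf, None, 0
    for top in range(h):
        for bottom in range(top, h):
            for left in range(w):
                for right in range(left, w):
                    evaluated += 1
                    s = box_sum(ii, top, bottom, left, right)
                    if s > best_score:
                        best_score, best_box = s, (top, bottom, left, right)
    return best_score, best_box, evaluated
```

For an h x w map the loop visits h(h+1)/2 * w(w+1)/2 boxes, which grows as n^4 for a square image.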

Branch-and-Bound Scheme
Branching: divide the space of candidate rectangles into subspaces.
Bounding: prune subspaces whose highest possible score is lower than a score guaranteed in some other subspace.

Bounding function
To use branch-and-bound for a given quality function f, we need to define an upper bound function over sets of candidate rectangles.
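The two properties usually required of such a bound can be written as below. The notation is an assumption made here for illustration: a script R for a set of rectangles and f-hat for the bound. The first condition says the bound never underestimates the best score in the set. The second says it becomes exact once the set has shrunk to a single rectangle, which is what lets branch-and-bound stop with a global optimum.

```latex
\hat{f}(\mathcal{R}) \;\ge\; \max_{R \in \mathcal{R}} f(R),
\qquad
\hat{f}(\mathcal{R}) \;=\; f(R) \quad \text{if } \mathcal{R} = \{R\}.
```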

Algorithm
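A best-first branch-and-bound search of this kind can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation. The quality of a box is assumed to be the sum of per-pixel weights in a hypothetical `weight_map`. A set of rectangles is stored as four integer intervals (top, bottom, left, right). The bound adds the positive weights inside the largest rectangle of the set to the negative weights inside the smallest one.

```python
import heapq
import numpy as np

def integral_image(a):
    """Zero-padded cumulative sums so that any box sum costs O(1)."""
    ii = np.zeros((a.shape[0] + 1, a.shape[1] + 1))
    ii[1:, 1:] = a.cumsum(axis=0).cumsum(axis=1)
    return ii

def box_sum(ii, top, bottom, left, right):
    """Sum over an inclusive box; an empty box (top > bottom or left > right) sums to 0."""
    if top > bottom or left > right:
        return 0.0
    return float(ii[bottom + 1, right + 1] - ii[top, right + 1]
                 - ii[bottom + 1, left] + ii[top, left])

def ess(weight_map):
    """Return (best_score, (top, bottom, left, right)) maximizing the box sum."""
    h, w = weight_map.shape
    pos = integral_image(np.maximum(weight_map, 0))  # positive weights only
    neg = integral_image(np.minimum(weight_map, 0))  # negative weights only

    def upper_bound(state):
        (t_lo, t_hi), (b_lo, b_hi), (l_lo, l_hi), (r_lo, r_hi) = state
        # Largest rectangle collects all possible gains,
        # smallest rectangle collects the unavoidable losses.
        return (box_sum(pos, t_lo, b_hi, l_lo, r_hi)
                + box_sum(neg, t_hi, b_lo, l_hi, r_lo))

    root = ((0, h - 1), (0, h - 1), (0, w - 1), (0, w - 1))
    heap = [(-upper_bound(root), root)]  # min-heap on the negated bound
    while heap:
        neg_bound, state = heapq.heappop(heap)
        widths = [hi - lo for lo, hi in state]
        if max(widths) == 0:
            # A single rectangle: the bound is exact, so this is a global optimum.
            (t, _), (b, _), (l, _), (r, _) = state
            return -neg_bound, (t, b, l, r)
        # Branch: split the widest interval in half.
        k = widths.index(max(widths))
        lo, hi = state[k]
        mid = (lo + hi) // 2
        for part in ((lo, mid), (mid + 1, hi)):
            child = state[:k] + (part,) + state[k + 1:]
            (t_lo, _), (_, b_hi), (l_lo, _), (_, r_hi) = child
            # Skip sets that contain no valid rectangle at all.
            if t_lo <= b_hi and l_lo <= r_hi:
                heapq.heappush(heap, (-upper_bound(child), child))
    return None
```

Because the state with the highest bound is always expanded first, the first single-rectangle state popped from the queue cannot be beaten by anything still in the queue.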

Example I. Bag of visual words SVM
For every image:
Extract SIFT image descriptors.
Quantize the descriptors using a K-entry codebook of descriptors.
Represent the image by a histogram of codebook entry occurrences.
Every image is thus coded as a one-dimensional vector h of length K, where K is the number of codebook "words".
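The quantization and histogram steps can be sketched as below. Descriptor extraction itself is not shown: the function assumes the descriptors are already available as an (N, D) array and the codebook as a (K, D) array. The function name and the nearest-neighbour assignment by Euclidean distance are assumptions made for illustration.

```python
import numpy as np

def bow_histogram(descriptors, codebook):
    """descriptors: (N, D) array; codebook: (K, D) array.
    Returns a length-K vector counting how often each codeword is the nearest one."""
    # Squared Euclidean distance between every descriptor and every codeword.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)  # index of the nearest codeword per descriptor
    return np.bincount(words, minlength=codebook.shape[0])
```

The returned vector plays the role of h in the slide: one entry per codebook word.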

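Example I. Bounding function
SVM decision function: it can be expressed as a sum of per-point contributions with weights, one weight per codebook entry.
If we denote by R_max the largest rectangle and by R_min the smallest rectangle contained in a parameter region R, then f over R is bounded from above by the sum of the positive contributions inside R_max plus the sum of the negative contributions inside R_min.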
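For a linear SVM over bag-of-words histograms, the decomposition and the resulting bound can be written as follows. The symbols are assumptions made for illustration: alpha_i and beta are the SVM coefficients and bias, h^i the training histograms, c_j the codebook index of feature point x_j, and f^+ and f^- the sums restricted to positive and negative weights. The bias is dropped in the per-rectangle score because it does not change which rectangle is best.

```latex
f(I) = \beta + \sum_i \alpha_i \langle h, h^i \rangle
     = \beta + \sum_{j=1}^{n} w_{c_j},
\qquad w_k = \sum_i \alpha_i h^i_k ,

f(R) = \sum_{x_j \in R} w_{c_j},
\qquad
\hat{f}(\mathcal{R}) = f^{+}(R_{\max}) + f^{-}(R_{\min}).
```

Every rectangle in the set lies inside R_max and contains R_min, so it can collect no more positive weight than R_max and no less negative weight than R_min.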

Example I. Experiment
PASCAL VOC 06: 5,304 images with 9,507 objects from 10 categories.
1000 visual words from 50,000 SURF descriptors.
A match is claimed when there is > 50% overlap between the detected bounding box and the ground truth.
PASCAL VOC ,963 images with 24,640 objects.
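The matching rule can be sketched as below, assuming that "overlap" means area of intersection divided by area of union, as in the PASCAL VOC evaluation protocol. The function names and the (left, top, right, bottom) box layout with exclusive right and bottom edges are assumptions made for illustration.

```python
def overlap_ratio(box_a, box_b):
    """Intersection over union of two boxes given as (left, top, right, bottom)."""
    inter_w = min(box_a[2], box_b[2]) - max(box_a[0], box_b[0])
    inter_h = min(box_a[3], box_b[3]) - max(box_a[1], box_b[1])
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # boxes do not intersect
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def is_match(detected, ground_truth, threshold=0.5):
    """A detection counts as a match only when the overlap exceeds the threshold."""
    return overlap_ratio(detected, ground_truth) > threshold
```

Note the strict inequality: a detection with exactly 50% overlap is not counted as a match under this reading.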

Recall Precision Curve

Example II. Spatial Pyramid Kernel SVM

SVM decision function: it can be expressed as a sum of per-point contributions with weights.
The upper bound for f is obtained by summing the bounds for all levels and cells.
This can be thought of as down-sampling.

Example II. Experiment
UIUC Car database (side view, one car per image).
1050 training (550 positive images).
277 test (170 single scale multi scale).
1000 visual words from 50,000 SURF descriptors.

Example III. Nonlinear More Quality bounds by interval arithmetic
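As an illustration of bounding a nonlinear quality function by interval arithmetic, the sketch below uses the negative chi-squared distance between a query histogram and a box histogram. The choice of chi-squared is an assumption made here for illustration, since the slide does not name the function. The sketch also assumes unnormalized count histograms, so that every bin of a box in the set lies between the counts of the smallest rectangle (`h_min`) and the largest rectangle (`h_max`).

```python
import numpy as np

def chi2_distance(hq, hr):
    """Chi-squared distance; bins where both histograms are zero contribute nothing."""
    denom = hq + hr
    safe = np.where(denom > 0, denom, 1.0)
    return float(np.sum(np.where(denom > 0, (hq - hr) ** 2 / safe, 0.0)))

def chi2_quality_upper_bound(hq, h_min, h_max):
    """Upper bound on -chi2(hq, h) over all h with h_min <= h <= h_max per bin."""
    # Smallest possible |hq - h| per bin: zero when hq lies inside the interval.
    gap = np.maximum(0.0, np.maximum(h_min - hq, hq - h_max))
    # Largest possible denominator per bin.
    denom = hq + h_max
    safe = np.where(denom > 0, denom, 1.0)
    lower = np.where(denom > 0, gap ** 2 / safe, 0.0)  # lower bound on each chi2 term
    return -float(lower.sum())
```

Each term is bounded separately, taking the smallest numerator and the largest denominator its interval allows. When `h_min` equals `h_max` the interval collapses and the bound equals the true quality, as branch-and-bound requires.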

Example III. Experiment
Keyframes of a movie.
Return the 100 most relevant images for a query.
2 s per returned image.

Experiments

Summary
Fast.
Globally optimal.
Easy to extend (change classifiers, parameter space).
Future work
Kernel-based classifiers.
Extensions (groups of boxes, circles, …).