Groups of Adjacent Contour Segments for Object Detection Vittorio Ferrari Loic Fevrier Frederic Jurie Cordelia Schmid.

Presentation transcript:


Problem: object class detection & localization
[Figure: Training | Testing ?]
Focus: classes with characteristic shape

Features: pairs of adjacent segments (PAS)
Contour segment network [Ferrari et al. ECCV 2006]
1) edgels extracted with Berkeley boundary detector
2) edgel-chains partitioned into straight contour segments
3) segments connected at edgel-chains' endpoints and junctions

Features: pairs of adjacent segments (PAS)
PAS = groups of two connected segments (segments connected in the network)
PAS descriptor: encodes geometric properties of the PAS; scale and translation invariant; compact, 5D
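The descriptor formula itself did not survive in this transcript, so the following is only a minimal sketch of one possible 5D encoding that is scale and translation invariant. The function name `pas_descriptor` and the choice of components (direction between segment midpoints, two segment orientations, two lengths normalized by the midpoint distance) are assumptions made for illustration, not necessarily the encoding used by the authors.

```python
import math


def pas_descriptor(seg1, seg2):
    """Hypothetical 5D descriptor of a pair of segments.

    Each segment is ((x0, y0), (x1, y1)). The encoding is an illustrative
    assumption: it is built only from differences and ratios, so it does
    not change under translation or uniform scaling.
    """
    def midpoint(seg):
        (x0, y0), (x1, y1) = seg
        return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)

    def length(seg):
        (x0, y0), (x1, y1) = seg
        return math.hypot(x1 - x0, y1 - y0)

    def orientation(seg):
        (x0, y0), (x1, y1) = seg
        # Undirected orientation, folded into [0, pi).
        return math.atan2(y1 - y0, x1 - x0) % math.pi

    m1, m2 = midpoint(seg1), midpoint(seg2)
    rx, ry = m2[0] - m1[0], m2[1] - m1[1]
    dist = math.hypot(rx, ry)  # normalizing distance between midpoints
    if dist == 0.0:
        raise ValueError("segment midpoints coincide")
    return (
        math.atan2(ry, rx),      # direction from segment 1 to segment 2
        orientation(seg1),
        orientation(seg2),
        length(seg1) / dist,     # lengths relative to the pair's size
        length(seg2) / dist,
    )
```

Scaling both segments by the same factor and shifting them by the same offset leaves all five values unchanged, which is the invariance property the slide claims.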

Features: pairs of adjacent segments (PAS)
Example PAS
Why PAS?
+ intermediate complexity: good repeatability-informativeness trade-off
+ scale-translation invariant
+ connected: natural grouping criterion (need not choose a grouping neighborhood or scale)
+ can cover pure portions of the object boundary

PAS codebook
Based on descriptors, cluster PAS into types
[Figure: a few of the most frequent types based on 10 outdoor images (5 horses and 5 background)]
[Figure: types based on 15 indoor images (bottles)]
Frequently occurring PAS have intuitive, natural shapes
As we add images, number of PAS types converges to just ~100
Very similar codebooks come out, regardless of source images
+ general, simple features
We use a single, universal codebook (1st row) for all classes

Window descriptor
1. Subdivide window into tiles
2. Compute a separate bag of PAS per tile
3. Concatenate these semi-local bags
[Lazebnik et al. CVPR 2006]; [Dalal and Triggs CVPR 2005]
+ distinctive: records which PAS appear where; weight PAS by average edge strength
+ flexible: soft-assign PAS to types; rather coarse tiling
+ fast to compute using Integral Histograms
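The three steps above can be sketched roughly as follows. This is a simplified illustration under stated assumptions: each PAS is given as a hypothetical tuple `(x, y, pas_type, strength)`, the tiling is a regular `rows x cols` grid, and each PAS is hard-assigned to a single type. The soft assignment and Integral Histogram speed-up mentioned on the slide are deliberately left out.

```python
def window_descriptor(pas_list, window, rows, cols, num_types):
    """Concatenate one bag of PAS per tile for a single window.

    pas_list : iterable of (x, y, pas_type, strength); this layout is an
               assumption made for the sketch
    window   : (x0, y0, width, height)
    Returns a flat list of length rows * cols * num_types.
    """
    x0, y0, w, h = window
    desc = [0.0] * (rows * cols * num_types)
    for x, y, pas_type, strength in pas_list:
        # Ignore PAS that fall outside the window.
        if not (x0 <= x < x0 + w and y0 <= y < y0 + h):
            continue
        col = min(int((x - x0) * cols / w), cols - 1)
        row = min(int((y - y0) * rows / h), rows - 1)
        tile = row * cols + col
        # Each tile owns a contiguous block of num_types bins;
        # the vote is weighted by the PAS edge strength.
        desc[tile * num_types + pas_type] += strength
    return desc
```

Because every tile has its own block of bins, the same PAS type appearing in two different tiles lands in two different dimensions, which is what makes the descriptor record "which PAS appear where".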

Training
1. Learn mean positive window dimensions
2. Determine number of tiles T
3. Collect positive example descriptors
4. Collect negative example descriptors: slide window over negative training images

Training
5. Train a linear SVM
Here are a few of the top weighted descriptor vector dimensions (= 'PAS + tile'):
+ lie on object boundary (= local shape structure common to many training examples)
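The slides do not say which SVM solver was used. As a rough illustration of step 5 only, here is a tiny linear SVM trained by subgradient descent on the regularized hinge loss. The function names, learning rate, regularization strength and epoch count are all assumptions; a real experiment would use an established SVM package.

```python
def svm_score(w, b, x):
    """Linear decision value w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_linear_svm(samples, labels, lam=0.01, eta=0.1, epochs=100):
    """Subgradient descent on the L2-regularized hinge loss (illustrative).

    samples : list of equal-length feature lists (window descriptors)
    labels  : +1 for positive windows, -1 for negative windows
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * svm_score(w, b, x)
            # Regularization shrinks the weights on every step.
            w = [wi * (1.0 - eta * lam) for wi in w]
            if margin < 1.0:
                # The sample violates the margin: move toward classifying it.
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
                b += eta * y
    return w, b
```

After training, the dimensions of `w` with the largest positive values correspond to the "PAS + tile" combinations that the slide describes as top weighted.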

Testing
1. Slide a window of fixed aspect ratio (from the mean positive window dimensions learned in training), at multiple scales
2. SVM classify each window + non-maxima suppression -> detections
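A minimal sketch of the two testing steps, under assumptions: windows are `(x, y, width, height)` tuples, the stride is a fraction of the window height, and non-maxima suppression is the common greedy variant based on intersection-over-union. The slides do not specify the stride or the suppression rule, so these are illustrative choices.

```python
def sliding_windows(img_w, img_h, aspect, base_h, scales, stride_frac=0.25):
    """Yield fixed-aspect-ratio windows (x, y, w, h) at several scales."""
    for s in scales:
        h = base_h * s
        w = h * aspect
        if w > img_w or h > img_h:
            continue
        step = max(1.0, stride_frac * h)
        y = 0.0
        while y + h <= img_h:
            x = 0.0
            while x + w <= img_w:
                yield (x, y, w, h)
                x += step
            y += step


def iou(a, b):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    ix = min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0])
    iy = min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1])
    if ix <= 0 or iy <= 0:
        return 0.0
    inter = ix * iy
    return inter / (a[2] * a[3] + b[2] * b[3] - inter)


def non_max_suppression(detections, iou_thresh=0.5):
    """Greedy NMS over (score, box) pairs: keep the best, drop overlaps."""
    kept = []
    for score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        if all(iou(box, kept_box) <= iou_thresh for _, kept_box in kept):
            kept.append((score, box))
    return kept
```

In a full pipeline each window from `sliding_windows` would be described, scored by the SVM, and passed with its score to `non_max_suppression`.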

Results – INRIA horses
Dataset: ~ Jurie and Schmid, CVPR positive negative images (training = 50 pos + 50 neg); wide range of scales; clutter
+ tiling brings a substantial improvement; optimum at T=30 -> keep this setting on all other experiments
+ works well: 86% det-rate at 0.3 FPPI (with 50 pos + 50 neg training images)
(missed and FP)
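Results throughout are reported as detection rate at a given number of false positives per image (FPPI). The sketch below shows one way such a number can be computed, assuming each detection has already been labeled as a true or false positive. The matching criterion is not given in the slides, and the function name and input layout are hypothetical.

```python
def det_rate_at_fppi(detections, num_objects, num_images, target_fppi):
    """Best detection rate reachable without exceeding target_fppi.

    detections : list of (score, is_true_positive) over all test images;
                 labeling detections as correct or not is assumed done
    num_objects: number of ground-truth objects in the test set
    num_images : number of test images
    """
    best = 0.0
    tp = 0
    fp = 0
    # Lowering the score threshold admits detections one by one.
    for _, is_tp in sorted(detections, key=lambda d: d[0], reverse=True):
        if is_tp:
            tp += 1
        else:
            fp += 1
        if fp / num_images <= target_fppi:
            best = max(best, tp / num_objects)
    return best
```

For example, a result such as "86% det-rate at 0.3 FPPI" would correspond to this function returning 0.86 with `target_fppi=0.3`.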

Results – INRIA horses
Dataset: ~ Jurie and Schmid, CVPR positive negative images (training = 50 pos + 50 neg); wide range of scales; clutter
+ PAS better than any IP
all interest point (IP) comparisons with T=10 and 120 feature types (= optimum over INRIA horses and ETHZ Shape Classes; all IP codebooks are class-specific)
(missed and FP)

Results – Weizmann-Shotton horses
Dataset: Shotton et al., ICCV positive negative images (training = 50 pos + 50 neg); no scale changes; modest clutter
[Plot label: Shotton's EER]
- exact comparison to Shotton et al.: use their images and search at a single scale
- PAS same performance (~92% precision-recall EER), but:
+ no need for segmented training images (only bounding-boxes)
+ can detect objects at multiple scales (see other experiments)

Results – ETHZ Shape Classes
Dataset: Ferrari et al., ECCV images, over 5 classes
training = half of positive images for a class + same number from the other classes (1/4 from each)
testing = all other images
large scale changes; extensive clutter

Results – ETHZ Shape Classes
Dataset: Ferrari et al., ECCV images, over 5 classes
training = half of positive images for a class + same number from the other classes (1/4 from each)
testing = all other images
large scale changes; extensive clutter
[Figure label: Missed]

Results – ETHZ Shape Classes
[Plots: Apple logos, Bottles, Giraffes, Mugs, Swans]
+ mean det-rate at 0.4 FPPI = 79%
+ PAS >> IP for apple logos, bottles, mugs
  PAS ~= IP for giraffes (texture!)
  PAS < IP for swans
+ overall best IP: Harris-Laplace
+ class-specific IP codebooks

Results – Caltech 101
Dataset: Fei-Fei et al., GMBV anchor, 62 chair, 67 cup images
train = half + same number of caltech101 background
testing = other half pos + same number of background
scale changes; only little clutter

Results – Caltech 101
Dataset: Fei-Fei et al., GMBV 2004
On caltech101's anchor, chair, cup:
+ PAS better than Harris-Laplace
+ mean PAS det-rate at 0.4 FPPI: 85%

Comparison to Dalal and Triggs CVPR 2005
[Plots: Apple logos, Bottles, Giraffes, Mugs, Swans]

Comparison to Dalal and Triggs CVPR 2005
[Plots: Caltech anchors, Caltech chairs, Caltech cups, INRIA horses, Shotton horses]
+ overall mean det-rate at 0.4 FPPI: PAS 82% >> HoG 58%
  PAS >> HoG for 6 datasets
  PAS ~= HoG for 2 datasets
  PAS < HoG for 2 datasets

Generalizing PAS to kAS
kAS: any path of length k through the contour segment network (segments connected in the network)
[Figure: 3AS, 4AS]
scale+translation invariant descriptor with dimensionality 4k-2
k = feature complexity; higher k -> more informative, but less repeatable
[Table: kAS overall mean det-rates (%); columns: 1AS, PAS, 3AS, 4AS; rows: 0.3 FPPI, FPPI]
PAS do best!
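Reading "path of length k" as a chain of k connected segments, enumerating kAS amounts to listing simple paths in the segment adjacency graph. The following is a small sketch under that reading. The dictionary-based graph layout and the function name `enumerate_kas` are assumptions made for the example.

```python
def enumerate_kas(adjacency, k):
    """List all chains of k distinct connected segments.

    adjacency : dict mapping a segment id to the ids of the segments it is
                connected to in the contour segment network (hypothetical
                layout chosen for this sketch)
    Each chain is reported once, regardless of traversal direction.
    """
    found = set()

    def extend(path):
        if len(path) == k:
            # A chain and its reverse describe the same kAS.
            found.add(min(tuple(path), tuple(reversed(path))))
            return
        for nxt in adjacency.get(path[-1], ()):
            if nxt not in path:
                extend(path + [nxt])

    for seg in adjacency:
        extend([seg])
    return sorted(found)
```

With `k=1` this returns the individual segments (1AS), with `k=2` the PAS, and so on. The number of chains grows quickly with k, which fits the slide's remark that higher k gives more informative but less repeatable features.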

Conclusions
Connected local shape features for object class detection
Experiments on 10 diverse classes from 4 datasets show:
+ better suited than interest points for these shape-based classes
+ object detector deals with clutter, scale changes, intra-class variability
+ PAS have the best intermediate complexity among kAS
+ object detector compares favorably to HoG-based one
- fixed aspect-ratio window: sometimes inaccurate bounding-boxes
- single viewpoint

Current work: detecting object outlines
Training: learn the common boundaries from examples
Model: collection of PAS and their spatial variability; only common boundary

Current work: detecting object outlines
Detection on a new image
1. detect edges
2. match PAS based on descriptors
3. vote for translation + scale initializations
4. match deformable thin-plate spline based on deterministic annealing
Outline object in test image, without segmented training images!

A few preliminary results

Results – Caltech 101
Dataset: Fei-Fei et al., GMBV 2004
On caltech101's anchor, chair, cup:
+ PAS better than any IP
+ mean PAS det-rate at 0.4 FPPI: 85%