Object Detection with Discriminatively Trained Part Based Models

Slides:

Advertisements

Similar presentations

Presenter: Duan Tran (Part of slides are from Pedro’s)

Advertisements

Learning Shared Body Plans Ian Endres University of Illinois work with Derek Hoiem, Vivek Srikumar and Ming-Wei Chang.

Pose Estimation and Segmentation of People in 3D Movies Karteek Alahari, Guillaume Seguin, Josef Sivic, Ivan Laptev Inria, Ecole Normale Superieure ICCV.

Jan-Michael Frahm, Enrique Dunn Spring 2013

Detecting Faces in Images: A Survey

Human Identity Recognition in Aerial Images Omar Oreifej Ramin Mehran Mubarak Shah CVPR 2010, June Computer Vision Lab of UCF.

Face Alignment with Part-Based Modeling

Ivan Laptev IRISA/INRIA, Rennes, France September 07, 2006 Boosted Histograms for Improved Object Detection.

Computer Vision for Human-Computer InteractionResearch Group, Universität Karlsruhe (TH) cv:hci Dr. Edgar Seemann 1 Computer Vision: Histograms of Oriented.

Lecture 31: Modern object recognition

LPP-HOG: A New Local Image Descriptor for Fast Human Detection Andy Qing Jun Wang and Ru Bo Zhang IEEE International Symposium.

Many slides based on P. FelzenszwalbP. Felzenszwalb General object detection with deformable part-based models.

Presenter: Hoang, Van Dung

Mixture of trees model: Face Detection, Pose Estimation and Landmark Localization Presenter: Zhang Li.

Steerable Part Models Hamed Pirsiavash and Deva Ramanan

Intro to DPM By Zhangliliang. Outline Intuition Introduction to DPM Model Inference(matching) Training latent SVM Training Procedure Initialization Post-processing.

Instructor: Mircea Nicolescu Lecture 15 CS 485 / 685 Computer Vision.

More sliding window detection: Discriminative part-based models Many slides based on P. FelzenszwalbP. Felzenszwalb.

Student: Yao-Sheng Wang Advisor: Prof. Sheng-Jyh Wang ARTICULATED HUMAN DETECTION 1 Department of Electronics Engineering National Chiao Tung University.

1 Learning to Detect Objects in Images via a Sparse, Part-Based Representation S. Agarwal, A. Awan and D. Roth IEEE Transactions on Pattern Analysis and.

Object Recognizing We will discuss: Features Classifiers Example ‘winning’ system.

Object Recognition Using Geometric Hashing

Scale Invariant Feature Transform (SIFT)

The SIFT (Scale Invariant Feature Transform) Detector and Descriptor

Lecture 29: Recent work in recognition CS4670: Computer Vision Noah Snavely.

Face Recognition Using Neural Networks Presented By: Hadis Mohseni Leila Taghavi Atefeh Mirsafian.

Generic object detection with deformable part-based models

Object Recognizing. Object Classes Individual Recognition.

Object Recognizing. Recognition -- topics Features Classifiers Example ‘winning’ system.

“Secret” of Object Detection Zheng Wu (Summer intern in MSRNE) Sep. 3, 2010 Joint work with Ce Liu (MSRNE) William T. Freeman (MIT) Adam Kalai (MSRNE)

Marco Pedersoli, Jordi Gonzàlez, Xu Hu, and Xavier Roca

Lecture 31: Modern recognition CS4670 / 5670: Computer Vision Noah Snavely.

Pedestrian Detection and Localization

Deformable Part Model Presenter ： Liu Changyu Advisor ： Prof. Alex Hauptmann Interest ： Multimedia Analysis April 11 st, 2013.

Deformable Part Models (DPM) Felzenswalb, Girshick, McAllester & Ramanan (2010) Slides drawn from a tutorial By R. Girshick AP 12% 27% 36% 45% 49% 2005.

Sampling for Part Based Object Models Daniel Huttenlocher September, 2006.

Training and Evaluating of Object Bank Models Presenter ： Changyu Liu Advisor ： Prof. Alex Interest ： Multimedia Analysis May 16 th, 2013.

Object detection, deep learning, and R-CNNs

Histograms of Oriented Gradients for Human Detection(HOG)

CS 1699: Intro to Computer Vision Detection II: Deformable Part Models Prof. Adriana Kovashka University of Pittsburgh November 12, 2015.

Object Detection Overview Viola-Jones Dalal-Triggs Deformable models Deep learning.

Pictorial Structures and Distance Transforms Computer Vision CS 543 / ECE 549 University of Illinois Ian Endres 03/31/11.

Category Independent Region Proposals Ian Endres and Derek Hoiem University of Illinois at Urbana-Champaign.

Improved Object Detection

Recognition Using Visual Phrases

Poselets: Body Part Detectors Trained Using 3D Human Pose Annotations ZUO ZHEN 27 SEP 2011.

Object Recognition as Ranking Holistic Figure-Ground Hypotheses Fuxin Li and Joao Carreira and Cristian Sminchisescu 1.

Object Recognizing. Object Classes Individual Recognition.

CS 2750: Machine Learning Support Vector Machines Prof. Adriana Kovashka University of Pittsburgh February 17, 2016.

More sliding window detection: Discriminative part-based models

Object Recognizing. Object Classes Individual Recognition.

A Discriminatively Trained, Multiscale, Deformable Part Model Yeong-Jun Cho Computer Vision and Pattern Recognition,2008.

Fine-grained Fine-grained Recognition( 细粒度分类 ) 沈志强.

Rich feature hierarchies for accurate object detection and semantic segmentation 2014 IEEE Conference on Computer Vision and Pattern Recognition Ross Girshick,

Recognizing specific objects Matching with SIFT Original suggestion Lowe, 1999,2004.

Week 4: 6/6 – 6/10 Jeffrey Loppert. This week.. Coded a Histogram of Oriented Gradients (HOG) Feature Extractor Extracted features from positive and negative.

Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition arXiv: v4 [cs.CV(CVPR)] 23 Apr 2015 Kaiming He, Xiangyu Zhang, Shaoqing.

1 Bilinear Classifiers for Visual Recognition Computational Vision Lab. University of California Irvine To be presented in NIPS 2009 Hamed Pirsiavash Deva.

When deep learning meets object detection: Introduction to two technologies: SSD and YOLO Wenchi Ma.

Recent developments in object detection

Object detection with deformable part-based models

Object detection, deep learning, and R-CNNs

Object Localization Goal: detect the location of an object within an image Fully supervised: Training data labeled with object category and ground truth.

R-CNN region By Ilia Iofedov 11/11/2018 BGU, DNN course 2016.

Group Norm for Learning Latent Structural SVMs

A Tutorial on HOG Human Detection

“The Truth About Cats And Dogs”

Outline Background Motivation Proposed Model Experimental Results

Feature descriptors and matching

Presentation transcript:

Object Detection with Discriminatively Trained Part Based Models Pedro F. Felzenszwalb, Ross B. Girshick, David McAllester and Deva Ramanan IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 32, no. 9, pp. 1627-1645, September 2010

Overview Motivation Introduction Proposed Method Experiment Result Deformable Part Models Latent SVM Training Models Experiment Result Conclusion

Motivation Object category detection: Detect all objects with the same category in an image For example horse detection:

Motivation Objects in rich categories exhibit significant variability Viewpoint variation Intra-class variability bicycles of different types (e.g., mountain bikes, tandems...) People wear different clothes and take different poses

Introduction Part Model Mixture Model Histogram of Gradient Feature Pyramid Support Vector Machine

Part Model Definition: Displacement : Root : Catch Roughly appearance of object Part : Catch local appearance of object Spring : spatial connections between parts Displacement : Using minimizing energy function to find the optimal displacement [1] Pictorial Structures for Object Recognition, Daniel P. Huttenlocher,1973

Mixture Model We build a Mixture Model which include different components of the same class

Mixture Model Component 1 Component 2

Histogram of Gradient Step: 1.Compute gradient every pixel 2.Group 8*8 pixels into a cell, and 4*4 cells into a block. Build histogram of each cells about 9 orientation bins (00~1800)

Histogram of Gradient Step: 3.A block contain 4*9 feature vector for local information, and a bounding box which contain n’s blocks has n*4*9 feature vector 4.Training by SVM

Feature Pyramid Develop a representation to decompose images into multiple scales by smoothing and subsampling to extract features of interest and avoid noise Test image Human model

Feature Pyramid Subsampling and smoothing: Gaussian pyramid

Support Vector Machine Build a hyper plane separate positive examples from negative In this case, we want the score is positive when human exist in the bounding box

Proposed Method Deformable Part Models Latent SVM Training Models Build Models Matching Mixture Models Latent SVM Training Models

Deformable Part Models Build a model for an object with n parts : root filters coarse resolution part filters finer resolution deformation models real-valued bias term root filter a model for the i-th part (Fi, vi, di) coefficients of a quadratic function a filter for the i-th part for part i relative to the root position

Deformable Part Models Object hypothesis: p0 : location of root p1,..., pn : location of parts

Deformable Part Models Part filters are placed at twice the spatial resolution of the placement of the root z specifies the location of each filter in feature pyramid Pi specifies the level and position of the ith filter

Deformable Part Models Score of hypothesis= sum of filter score - deformation cost displacements filters deformation parameters feature map

Deformable Part Models Rewrite the equation:

Deformable Part Models Given a root position find the best placement of parts: Using sliding window approach, high score of root score define detections

Matching process

Matching Response of filter in l-th pyramid level Transformed response: finding best part displacement relate to root

Matching Overall root scores :

Mixture Models A mixture model with m components 1<=c<=m

Mixture Models Detect objects using a mixture model, we use matching algorithm to find root positions independently for each component

Latent SVM(LSVM) Classifiers that score an example x (bounding box): β are model parameters, z are latent values We want fB(x)>0 when positive fB(x)<0 when negative

Latent SVM(LSVM) Minimize : Hinge loss

Latent SVM(LSVM) Semi-convex: is convex for negative examples for positive examples, convex if latent values fixed Solution fixed latent values by coordinate decent:

Training Models We initial k component with a specific class, sort the bounding boxes by aspect ratio and intraclass variation then split into k group Initial root filters and use coordinate decent to update Initial part filters by greedily place parts to cover high energy regions of the root filter Training by SVM

Experiment Result PASCAL VOC 2006,2007,2008 comp3 challenge datasets Some statistics: It takes 2 seconds to evaluate a model in one image (4952 images in the test dataset) It takes 4 hours to train a model MUCH faster than most systems. All of the experiments were done on a 2.8Ghz 8-core Intel Xeon Mac Pro computer running Mac OS X 10.5.

Experiment Result Measurement: predicted bounding box is correct if it overlaps more than 50 percent with ground truth bounding box; otherwise, considered false positive

Experiment Result

Experiment Result

Experiment Result Best Average Precision score in 9 out of 20, second in 8

Experiment Result

Conclusion Deformable Part Model Fast matching algorithm handle Viewpoint variation, and Intra-class variability problems Still have some problem need to solve: Fixed box size Fixed number of components Future Work Build grammar based models that represent objects with variable hierarchical structures Sharing part models between components