A Statistical Method for 3D Object Detection Applied to Face and Cars CVPR 2000 Henry Schneiderman and Takeo Kanade Robotics Institute, Carnegie Mellon.

Slides:



Advertisements
Similar presentations
Wavelets Fast Multiresolution Image Querying Jacobs et.al. SIGGRAPH95.
Advertisements

Object Detection Using Semi- Naïve Bayes to Model Sparse Structure Henry Schneiderman Robotics Institute Carnegie Mellon University.
Detecting Faces in Images: A Survey
Human Identity Recognition in Aerial Images Omar Oreifej Ramin Mehran Mubarak Shah CVPR 2010, June Computer Vision Lab of UCF.
University of Ioannina - Department of Computer Science Wavelets and Multiresolution Processing (Background) Christophoros Nikou Digital.
- Recovering Human Body Configurations: Combining Segmentation and Recognition (CVPR’04) Greg Mori, Xiaofeng Ren, Alexei A. Efros and Jitendra Malik -
Weiwei Zhang, Jian Sun, and Xiaoou Tang, Fellow, IEEE.
Learning to estimate human pose with data driven belief propagation Gang Hua, Ming-Hsuan Yang, Ying Wu CVPR 05.
What Makes a Patch Distinct?
Error detection and concealment for Multimedia Communications Senior Design Fall 06 and Spring 07.
Ghunhui Gu, Joseph J. Lim, Pablo Arbeláez, Jitendra Malik University of California at Berkeley Berkeley, CA
Detecting Pedestrians by Learning Shapelet Features
1 Content-Based Retrieval (CBR) -in multimedia systems Presented by: Chao Cai Date: March 28, 2006 C SC 561.
Young Deok Chun, Nam Chul Kim, Member, IEEE, and Ick Hoon Jang, Member, IEEE IEEE TRANSACTIONS ON MULTIMEDIA,OCTOBER 2008.
Lecture 5 Template matching
Region labelling Giving a region a name. Image Processing and Computer Vision: 62 Introduction Region detection isolated regions Region description properties.
Real-time Embedded Face Recognition for Smart Home Fei Zuo, Student Member, IEEE, Peter H. N. de With, Senior Member, IEEE.
Multiresolution Analysis (Section 7.1) CS474/674 – Prof. Bebis.
Ensemble Tracking Shai Avidan IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE February 2007.
Multiple Human Objects Tracking in Crowded Scenes Yao-Te Tsai, Huang-Chia Shih, and Chung-Lin Huang Dept. of EE, NTHU International Conference on Pattern.
Project 4 out today –help session today –photo session today Project 2 winners Announcements.
Texture Classification Using QMF Bank-Based Sub-band Decomposition A. Kundu J.L. Chen Carole BakhosEvan Kastner Dave AbramsTommy Keane Rochester Institute.
CS 223B Assignment 1 Help Session Dan Maynes-Aminzade.
Basics: Notation: Sum:. PARAMETERS MEAN: Sample Variance: Standard Deviation: * the statistical average * the central tendency * the spread of the values.
CS292 Computational Vision and Language Visual Features - Colour and Texture.
Wavelet-based Coding And its application in JPEG2000 Monia Ghobadi CSC561 project
Computer Vision I Instructor: Prof. Ko Nishino. Today How do we recognize objects in images?
Smart Traveller with Visual Translator for OCR and Face Recognition LYU0203 FYP.
Multiple Object Class Detection with a Generative Model K. Mikolajczyk, B. Leibe and B. Schiele Carolina Galleguillos.
Object Detection Using the Statistics of Parts Henry Schneiderman Takeo Kanade Presented by : Sameer Shirdhonkar December 11, 2003.
Despeckle Filtering in Medical Ultrasound Imaging
The Digital Image.
Shape-Based Human Detection and Segmentation via Hierarchical Part- Template Matching Zhe Lin, Member, IEEE Larry S. Davis, Fellow, IEEE IEEE TRANSACTIONS.
A General Framework for Tracking Multiple People from a Moving Camera
“Secret” of Object Detection Zheng Wu (Summer intern in MSRNE) Sep. 3, 2010 Joint work with Ce Liu (MSRNE) William T. Freeman (MIT) Adam Kalai (MSRNE)
Object Detection Using the Statistics of Parts Presented by Nicholas Chan – Advanced Perception Robust Real-time Object Detection Henry Schneiderman.
Automated Face Detection Peter Brende David Black-Schaffer Veni Bourakov.
Vehicle License Plate Detection Algorithm Based on Statistical Characteristics in HSI Color Model Instructor : Dr. K. R. Rao Presented by: Prasanna Venkatesh.
Texture. Texture is an innate property of all surfaces (clouds, trees, bricks, hair etc…). It refers to visual patterns of homogeneity and does not result.
Multimodal Information Analysis for Emotion Recognition
Image Processing Xuejin Chen Ref:
Wavelet-based Coding And its application in JPEG2000 Monia Ghobadi CSC561 final project
Object Detection with Discriminatively Trained Part Based Models
ECE738 Advanced Image Processing Face Detection IEEE Trans. PAMI, July 1997.
A Two-level Pose Estimation Framework Using Majority Voting of Gabor Wavelets and Bunch Graph Analysis J. Wu, J. M. Pedersen, D. Putthividhya, D. Norgaard,
Team 5 Wavelets for Image Fusion Xiaofeng “Sam” Fan Jiangtao “Willy” Kuang Jason “Jingsu” West.
Face Detection Ying Wu Electrical and Computer Engineering Northwestern University, Evanston, IL
Advances in digital image compression techniques Guojun Lu, Computer Communications, Vol. 16, No. 4, Apr, 1993, pp
1 Research Question  Can a vision-based mobile robot  with limited computation and memory,  and rapidly varying camera positions,  operate autonomously.
Paper Reading Dalong Du Nov.27, Papers Leon Gu and Takeo Kanade. A Generative Shape Regularization Model for Robust Face Alignment. ECCV08. Yan.
Colour and Texture. Extract 3-D information Using Vision Extract 3-D information for performing certain tasks such as manipulation, navigation, and recognition.
Content Based Color Image Retrieval vi Wavelet Transformations Information Retrieval Class Presentation May 2, 2012 Author: Mrs. Y.M. Latha Presenter:
Face Detection Using Neural Network By Kamaljeet Verma ( ) Akshay Ukey ( )
CS332 Visual Processing Department of Computer Science Wellesley College High-Level Vision Face Recognition I.
Classifying Covert Photographs CVPR 2012 POSTER. Outline  Introduction  Combine Image Features and Attributes  Experiment  Conclusion.
Face Detection and Head Tracking Ying Wu Electrical Engineering & Computer Science Northwestern University, Evanston, IL
A REAL-TIME DEFORMABLE DETECTOR 謝汝欣 OUTLINE  Introduction  Related Work  Proposed Method  Experiments 2.
Presenter : r 余芝融 1 EE lab.530. Overview  Introduction to image compression  Wavelet transform concepts  Subband Coding  Haar Wavelet  Embedded.
Computer Vision: 3D Shape Reconstruction Use images to build 3D model of object or site 3D site model built from laser range scans collected by CMU autonomous.
Multiresolution Analysis (Section 7.1) CS474/674 – Prof. Bebis.
Journal of Vision. 2009;9(3):5. doi: /9.3.5 Figure Legend:
Presented by Minh Hoai Nguyen Date: 28 March 2007
Object Recognition in the Dynamic Link Architecture
A Tutorial on HOG Human Detection
Image Segmentation Techniques
Lecture 26: Faces and probabilities
Brief Review of Recognition + Context
Announcements Project 4 out today Project 2 winners help session today
Multi-Information Based GCPs Selection Method
The “Margaret Thatcher Illusion”, by Peter Thompson
Presentation transcript:

A Statistical Method for 3D Object Detection Applied to Face and Cars CVPR 2000 Henry Schneiderman and Takeo Kanade Robotics Institute, Carnegie Mellon University Presented by Zhiming Liu

Outline Introduction Functional Form of Decision Rule Implementation of Detectors: Coarse to Fine Search Strategy Experiments

Introduction The main challenge in object detection: the amount of variation in visual appearance - cars vary in shape, size, color and in small details. - lighting variation. - pose and orientation variation. View-Based Detectors Separate detectors that are each specialized to a specific orientation of the object.

Introduction The number of orientations to model for each object is empirically determined.

Functional Form of Decision Rule For each view-based detector, we model two statistical distributions, P(image | object) and P(image | non-object)

Functional Form of Decision Rule Representation of Statistics Using Products of Histograms - difficulty in modeling P(image | object) and P(image | non-object). - choose to use histogram. Estimation of a histogram simply involves counting how often each attribute value occurs in the training data. - use multiple histograms. Each histogram, P k (pattern k | object), represents the probability of appearance over some specified visual attribute, pattern k.

Functional Form of Decision Rule - combine probabilities from different attributes.

Functional Form of Decision Rule Decomposition of Appearance in Space, Frequency and Orientation - first, we decompose the appearance of the object into “parts” where each visual attribute describes a spatially localized region on the object. - we also decompose some attributes in orientation content. - finally, the spatial relationships of the parts is an important cue for detection. With this representation, each histogram now becomes a joint distribution of attribute and attribute position, P k (pattern k (x,y),x,y | object), and P k (pattern k (x,y),x,y | non-object).

Functional Form of Decision Rule Representation of Visual Attributes by Subsets of Quantized Wavelet Coefficients - the wavelet transform organizes the image into subbands that are localized in orientation and frequency. - within each subband, each coefficient is spatially localized. - use 3 level decomposition wavelet transform, producing 10 subbands.

Functional Form of Decision Rule - each visual attribute is defined to sample a moving window of transform coefficients. for example, 3*3 window of coefficients in level 3 LH band; 2*2 window in the LH and HL bands of the 2 nd level.

Functional Form of Decision Rule - overall, we use 17 attributes that sample the wavelet transform in groups of 8 coefficients in one of the following ways: 1. Intra-subband – All the coefficients come from the same subband. These visual attributes are the most localized in frequency and orientation. 7 attributes for the following subbands: level 1 LL, level 1 LH, level 1 HL, level 2 LH, level 2 HL, level 3 LH, level 3 HL.

Functional Form of Decision Rule 2. Inter-frequency – Coefficients come from the same orientation but multiple frequency bands. these attributes represent visual cues that span a range of frequencies such as edge. 6 attributes using the following subband pairs: level 1 LL – level 1 HL, level 1 LL – level 1 LH, level 1 LH – level 2 LH, level 1 HL – level 2 HL, level 2 LH – level 3 LH, level 2 HL – level 3 HL. 3. Inter-orientation – Coefficients come from the same frequency band but multiple orientation bands. these attributes represent cues that have both horizontal and vertical components such as corners. 3 attributes using the following subband pairs: level 1 LH – level 1 HL, level 2 LH – level 2 HL, level 3 LH – level 3 HL.

Functional Form of Decision Rule 4. Inter-frequency / Inter-orientation – This combination is designed to represent cues that span a range of frequencies and orientations. we define one such attribute combination coefficients from the following subbands: level 1 LL, level 1 LH, level 1 HL, level 2 LH, level 2 HL

Functional Form of Decision Rule Final Form of Detector finally, our approach is to sample each attribute at regular intervals over the full extent of the object, allowing samples to partially overlap. where “region” is the image window we are classifying.

Coarse to Fine Search Strategy a direct implementation of exhaustive search will take a long time. use a heuristic coarse-to-fine strategy to speed up this process. - first partially evaluate the likelihood ratio for each possible object location using low resolution visual attributes in level 1 subband. - then only continue evaluation at higher resolution for those object candidates. - search a 320*240 image over 4 octaves of candidate size takes about 1 minute for faces and 5 minutes for cars.

Accuracy of Face Detection with Out-of-Plane Rotation First successful algorithm for detection of faces with out-of-plane rotation. Test set: 208 images with 441 faces of which 347 are in profile view.

Accuracy of Car Detection First algorithm that can reliably detect passenger cars over a range of viewpoints. Test set: 104 images that contains 213 cars which span a wide variety of models, sizes, orientations, lighting conditions and so on.