Object detection Presented by Minh Hoai Nguyen Date: 28 March 2007
Object detection? Challenges: + Different locations + Different scales + Different poses and expressions + Different illumination, skin color, glasses, occlusion, reflections, etc.
What we want Miss a face!
Happy face!
Scanning window: Train a classifier on a fixed-size window, then apply it at every location and scale. Seems to be slow, but: + it does work + it can be sped up using some tricks. Disadvantage: + No context information. Advantage: + Only need to train the classifier on a small, fixed-size window.
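A minimal sketch of the scanning loop, assuming square windows and a window that grows by a fixed factor per scale. The function name `scan_windows` and the default values for `win`, `stride`, and `scale_step` are illustrative choices, not taken from the slides.

```python
def scan_windows(width, height, win=24, stride=4, scale_step=1.25):
    """Yield (x, y, size) for every square window position and scale.

    win, stride and scale_step are illustrative defaults (assumptions).
    """
    size = win
    while size <= min(width, height):
        # Scale the stride with the window so coverage stays proportional.
        step = max(1, int(round(stride * size / win)))
        for y in range(0, height - size + 1, step):
            for x in range(0, width - size + 1, step):
                yield x, y, size
        # Always grow by at least one pixel so the loop terminates.
        size = max(size + 1, int(round(size * scale_step)))
```

Each yielded window would be resized to the fixed training size and passed to the classifier. The number of windows grows quickly with image size, which is why the speed-up tricks matter.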
Outline: + Object Detection Using the Statistics of Parts, Schneiderman, H. & Kanade, T., CVPR00, IJCV04 + Robust Real-time Face Detection, Viola, P. & Jones, M., CVPR01, IJCV04
Bayes optimal classifier: Image is defined by n attributes: x1,x2,…,xn. There are too many parameters to learn.
Naïve Bayes Assumption: Assume x1,x2,…,xn are conditionally independent given the class. Easier to learn. Problem: this might be a bad assumption. Idea: Carefully divide x1,x2,…,xn into groups P1, P2,…, Pk and assume P1, P2,…, Pk are independent.
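The three modelling choices can be written side by side as likelihood-ratio tests. This is a sketch in standard notation: the threshold symbol \lambda and the class labels "obj" and "non" are assumptions introduced here, not symbols from the slides.

```latex
% Full model: one joint distribution over all n attributes
\frac{P(x_1,\dots,x_n \mid \mathrm{obj})}{P(x_1,\dots,x_n \mid \mathrm{non})} > \lambda

% Naive Bayes: every attribute conditionally independent given the class
\prod_{i=1}^{n} \frac{P(x_i \mid \mathrm{obj})}{P(x_i \mid \mathrm{non})} > \lambda

% Parts: independence assumed only between the groups P_1,\dots,P_k
\prod_{j=1}^{k} \frac{P(P_j \mid \mathrm{obj})}{P(P_j \mid \mathrm{non})} > \lambda
```

The parts model sits between the two extremes: dependencies inside a group are modelled jointly, dependencies across groups are ignored.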
Independent groups/parts: How to divide x1,x2,…,xn into independent groups? Image pixels are highly correlated. Represent the image by wavelets instead.
Wavelet transform (subbands HL, HH, LH): 10 filter responses for each original pixel. The wavelet transform is fully invertible. It partially de-correlates natural imagery: more independence, easier to design parts.
Designing parts: Assumption: Each wavelet coefficient depends only on a few others. Group those coefficients into parts. Parts: 17 types, manually defined. Each part contains 8 coefficients.
Categories of parts: + Intra-subband local operator + Inter-frequency local operator + Inter-orientation local operator + Inter-frequency/inter-orientation local operator. Slide credit: Nicholas Chan
Final form of detector: How to compute these statistics? Count!
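A minimal sketch of "counting", assuming each part's 8 coefficients have already been quantized to a single integer in `range(num_bins)`. The quantization step, the additive `smoothing`, and the names `learn_part_table` and `window_score` are assumptions for illustration, not details from the slides.

```python
import math
from collections import Counter

def learn_part_table(object_values, background_values, num_bins, smoothing=1.0):
    """Estimate log[P(v | object) / P(v | non-object)] per quantized value v.

    Probabilities are relative counts; smoothing (an assumption here)
    keeps unseen values from producing log(0).
    """
    obj = Counter(object_values)
    bg = Counter(background_values)
    n_obj = len(object_values) + smoothing * num_bins
    n_bg = len(background_values) + smoothing * num_bins
    return [
        math.log((obj[v] + smoothing) / n_obj) - math.log((bg[v] + smoothing) / n_bg)
        for v in range(num_bins)
    ]

def window_score(part_values, tables):
    """Sum of per-part log-ratios, i.e. the product of ratios in log space."""
    return sum(table[v] for v, table in zip(part_values, tables))
```

At detection time each part costs one table lookup, and a window is labelled as an object when its score exceeds a chosen threshold.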
Multiple poses? Other tricks: Not going to talk about.
Reported results for faces Kodak dataset: Test set: 17 images, 46 faces, 36 profile views.
A bigger dataset From multiple sources 208 images, 441 faces, about 347 profiles.
Robust Real-time Face Detection by Viola, P. & Jones, M.
Cascade of classifiers Most places do not have faces!
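A minimal sketch of cascade evaluation, assuming each stage is a `(score_fn, threshold)` pair. That representation and the name `cascade_classify` are illustrative. The point is the early exit: a window rejected by an early stage never reaches the later, more expensive ones.

```python
def cascade_classify(window, stages):
    """Return (is_face, stages_evaluated).

    stages is a list of (score_fn, threshold) pairs (an assumed layout).
    A window is rejected at the first stage whose score is below threshold.
    """
    evaluated = 0
    for score_fn, threshold in stages:
        evaluated += 1
        if score_fn(window) < threshold:
            return False, evaluated  # early rejection: stop here
    return True, evaluated
```

Because most sub-windows contain no face, most of them exit after the first stage or two, so the average cost per window stays low.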
Simple features: Box filters, an approximation of Haar wavelets. Integral image: feature evaluation can be done with a few lookups.
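A minimal sketch of the integral image and a two-rectangle box filter, using plain nested lists. The function names and the padded layout (one extra row and column of zeros) are illustrative choices.

```python
def integral_image(img):
    """ii[y][x] = sum of img over rows < y and columns < x.

    The result has one extra row and column of zeros, which removes
    boundary special cases in box_sum.
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def box_sum(ii, x, y, w, h):
    """Sum of the w-by-h box with top-left corner (x, y): four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

def two_rect_feature(ii, x, y, w, h):
    """Left half minus right half of a box with even width w."""
    half = w // 2
    return box_sum(ii, x, y, half, h) - box_sum(ii, x + half, y, half, h)
```

The integral image is computed once per image. After that, any box sum costs the same four lookups regardless of the box size.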
Learning the cascade AdaBoost Weak classifiers are box filters
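A minimal sketch of boosting over single-feature threshold classifiers (stumps). This is textbook discrete AdaBoost with an exhaustive stump search, not necessarily the exact weight update used in the paper. The input layout `features[i][j]` (value of box filter `j` on example `i`) and all function names are assumptions.

```python
import math

def stump(row, j, thr, pol):
    """Weak classifier: threshold a single feature value; returns +1 or -1."""
    return pol if row[j] >= thr else -pol

def train_adaboost(features, labels, rounds):
    """features[i][j]: feature j on example i; labels[i] in {+1, -1}."""
    n = len(labels)
    weights = [1.0 / n] * n
    ensemble = []  # entries are (alpha, feature index, threshold, polarity)
    for _ in range(rounds):
        best = None
        # Exhaustive search for the stump with the lowest weighted error.
        for j in range(len(features[0])):
            for thr in sorted({row[j] for row in features}):
                for pol in (1, -1):
                    err = sum(w for w, row, y in zip(weights, features, labels)
                              if stump(row, j, thr, pol) != y)
                    if best is None or err < best[0]:
                        best = (err, j, thr, pol)
        err, j, thr, pol = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by 0
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, j, thr, pol))
        # Increase the weight of misclassified examples, then renormalize.
        weights = [w * math.exp(-alpha * y * stump(row, j, thr, pol))
                   for w, row, y in zip(weights, features, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def strong_score(row, ensemble):
    """Weighted vote of the selected stumps; the sign gives the class."""
    return sum(alpha * stump(row, j, thr, pol) for alpha, j, thr, pol in ensemble)
```

Each round selects one feature, so boosting acts as feature selection as well as classifier training.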
Learning cascade stages: Use AdaBoost to train each stage. Adjust the threshold to minimize false negatives. Add features until the target detection and false positive rates are met (determined by CV).
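A minimal sketch of the threshold adjustment, assuming held-out stage scores for faces and non-faces are available. The function names and the "accept if score >= threshold" convention are assumptions.

```python
import math

def stage_threshold(face_scores, target_detection_rate):
    """Largest threshold that still accepts the target fraction of faces."""
    ranked = sorted(face_scores, reverse=True)
    keep = max(1, math.ceil(target_detection_rate * len(ranked)))
    return ranked[keep - 1]

def false_positive_rate(non_face_scores, threshold):
    """Fraction of non-face windows that the stage lets through."""
    passed = sum(1 for s in non_face_scores if s >= threshold)
    return passed / len(non_face_scores)
```

If the false positive rate at the chosen threshold is still above target, the stage needs more features and the threshold is then recomputed.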
Learned cascade: First classifier: 2 features, 100% detection, 40% false detection. The whole cascade: 38 stages, 6000 features in total. On a dataset with 507 faces and 75 million sub-windows, faces are detected using 10 feature evaluations per sub-window on average.
Reported ROC curve
Comparison results
The end