EE462 MLCV Face Detection Demo: Robust real-time object detector, Viola and Jones, CVPR 01. Implemented by Intel OpenCV.
Multiclass object detection [Torralba et al PAMI 07]: A boosting algorithm, originally for binary-class problems, has been extended to multi-class problems.
Object Detection: Input is a single image, given without any prior knowledge.
Object Detection: Output is a set of tight bounding boxes (positions and scales) of instances of a target object class (e.g. pedestrian).
Object Detection: We scan every scale and pixel location of an image.
Number of windows: (# of pixels) x (# of scales). In the example, the number of windows is 747,666. Scanning ends up with a huge number of candidate sub-windows.
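To make the "pixels x scales" count concrete, here is a minimal sketch that counts square sub-windows over all positions and scales. The function name `count_windows`, the scale step of 1.25 and the stride of 1 pixel are assumptions for illustration; only the 20x20 base window size is taken from the slides, and the sketch is not meant to reproduce the figure of 747,666.

```python
def count_windows(width, height, base=20, scale_step=1.25, stride=1):
    """Count square scan-windows over every position and scale.

    base, scale_step and stride are illustrative assumptions,
    not values stated in the lecture.
    """
    total = 0
    size = float(base)
    while int(size) <= min(width, height):
        s = int(size)
        # Number of top-left positions at which an s x s window fits.
        nx = (width - s) // stride + 1
        ny = (height - s) // stride + 1
        total += nx * ny
        size *= scale_step  # move to the next, larger scale
    return total


if __name__ == "__main__":
    # Even a modest image produces a very large number of candidates.
    print(count_windows(320, 240))
```

The count grows with the image area times the number of scales, which is why the per-window cost has to be tiny.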
Time per window: Each scanning window is represented as a feature vector of dimension D (e.g. SIFT or raw pixels). Number of feature vectors: 747,666. Classification: how much time are we given to process a single scanning window?
Time per window: In order to finish the task in 1 sec, the time per window (or vector) is 0.00000134 sec. Neural network? Nonlinear SVM?
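The per-window budget follows directly from dividing the total time by the number of windows. A small check of the arithmetic (the function name `time_budget_per_window` is hypothetical):

```python
def time_budget_per_window(num_windows, total_sec=1.0):
    """Time available per scan-window if the whole image must finish in total_sec."""
    return total_sec / num_windows


if __name__ == "__main__":
    budget = time_budget_per_window(747_666)
    print(f"{budget:.8f} sec per window")  # 0.00000134 sec per window
```

At roughly 1.3 microseconds per window, a classifier that needs many operations on a D-dimensional vector is too slow to be applied to every window.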
Examples of face detection (from Viola, Jones, 2001).
More traditionally… The search space is narrowed down by integrating visual cues [Darrell et al IJCV 00]. Face pattern detection output (left). Connected components recovered from stereo range data (mid). Flesh hue regions from skin hue classification (right).
Since about 2001 (Viola & Jones 01), "Boosting Simple Features" has been the dominant approach. AdaBoost classification. Weak classifiers: Haar-basis like functions (45,396 in the total feature pool, far more than T). The weak classifiers are combined into a strong classifier.
Existence of weak learners. Definition of a baseline learner: with the data weights set, the baseline classifier gives the same output for all x, and its error is at most ½. Each weak learner in boosting is required to improve on this baseline, so that the error of the composite hypothesis goes to zero as boosting rounds increase [Duffy et al 00].
Boosting Simple Features [Viola and Jones CVPR 01]. AdaBoost classification. Weak classifiers: Haar-basis like functions (45,396 in total feature pool). The weak classifiers are combined into a strong classifier.
Learning (concept illustration): Face images and non-face images are resized to 20x20 (D=400) and used for training. Output: the weak learners.
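The learning step can be sketched as discrete AdaBoost over decision stumps. This is a generic textbook version under stated assumptions, not the lecture's exact procedure: each column of `X` stands in for one feature response (e.g. a Haar-like feature or one of the D=400 pixels), labels are +1 (face) and -1 (non-face), and the names `train_stump`, `adaboost` and `strong_predict` are hypothetical.

```python
import numpy as np


def stump_predict(X, j, thr, pol):
    """Decision stump: threshold feature j; pol flips the inequality."""
    return np.where(pol * (X[:, j] - thr) >= 0, 1, -1)


def train_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted error."""
    best = (0, 0.0, 1, np.inf)  # (feature, threshold, polarity, error)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = stump_predict(X, j, thr, pol)
                err = w[pred != y].sum()
                if err < best[3]:
                    best = (j, thr, pol, err)
    return best


def adaboost(X, y, T):
    """Run T boosting rounds; return a list of (alpha, feature, thr, pol)."""
    n = len(y)
    w = np.full(n, 1.0 / n)  # uniform data weights to start
    learners = []
    for _ in range(T):
        j, thr, pol, err = train_stump(X, y, w)
        err = np.clip(err, 1e-10, 1 - 1e-10)  # avoid log(0) on perfect stumps
        alpha = 0.5 * np.log((1 - err) / err)
        pred = stump_predict(X, j, thr, pol)
        # Increase the weight of misclassified samples, then renormalise.
        w *= np.exp(-alpha * y * pred)
        w /= w.sum()
        learners.append((alpha, j, thr, pol))
    return learners


def strong_predict(X, learners):
    """Strong classifier: sign of the weighted vote of the weak learners."""
    score = sum(a * stump_predict(X, j, thr, pol) for a, j, thr, pol in learners)
    return np.where(score >= 0, 1, -1)


if __name__ == "__main__":
    X = np.array([[0.0], [1.0], [2.0], [3.0]])
    y = np.array([-1, -1, 1, 1])
    print(strong_predict(X, adaboost(X, y, T=3)))
```

The exhaustive stump search is what makes training expensive when the feature pool is large, which motivates the acceleration discussed later.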
Evaluation (testing), from Viola, Jones, 2001: For a given image, we apply the boosting classifier to every scan-window. Non-local maxima suppression is then performed.
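The slides do not say which suppression rule is used. As one common variant, a greedy scheme keeps the highest-scoring window and drops any window that overlaps a kept one too much. The sketch below assumes boxes given as `(x1, y1, x2, y2)` and an overlap threshold of 0.5; both the function names and the threshold are illustrative assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_thresh=0.5):
    """Greedy suppression: return indices of the boxes that are kept."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap an already kept box too much.
        if all(iou(boxes[i], boxes[k]) <= iou_thresh for k in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    scores = [0.9, 0.8, 0.7]
    print(non_max_suppression(boxes, scores))  # [0, 2]
```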
How to accelerate boosting training and evaluation
Integral Image: The value at (x,y) is the sum of the pixel values above and to the left of (x,y). The integral image can be computed in one pass over the original image.
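A minimal sketch of the one-pass computation, assuming the sum includes the pixel at (x,y) itself and that the image is a 2-D integer NumPy array indexed as `img[y, x]`. The function name `integral_image` is hypothetical.

```python
import numpy as np


def integral_image(img):
    """One pass over the image using a running row sum.

    ii[y, x] = sum of img[0..y, 0..x] (inclusive convention, an assumption).
    """
    h, w = img.shape
    ii = np.zeros((h, w), dtype=np.int64)
    for y in range(h):
        row_sum = 0  # cumulative sum of the current row
        for x in range(w):
            row_sum += int(img[y, x])
            above = ii[y - 1, x] if y > 0 else 0
            ii[y, x] = row_sum + above
    return ii


if __name__ == "__main__":
    img = np.arange(1, 10).reshape(3, 3)
    print(integral_image(img))
```

In practice the same result can be obtained with `img.cumsum(axis=0).cumsum(axis=1)`; the explicit loop is shown only to make the single pass visible.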
Boosting Simple Features [Viola and Jones CVPR 01]. Integral image: the sum of the original image values within a rectangle can be computed from four integral-image values: Sum = A-B-C+D. This provides fast evaluation of Haar-basis like features.
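The four-value rectangle sum can be sketched as follows. Which corner the slide labels A, B, C or D depends on its figure, so the code uses explicit corner names instead. The zero-padded integral image, the function names and the particular two-rectangle feature (left half minus right half) are assumptions for illustration.

```python
import numpy as np


def padded_integral(img):
    """Integral image with an extra zero row and column, so that
    ii[y, x] is the sum of img[:y, :x]."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of pixels in a rectangle using four look-ups."""
    bottom, right = top + height, left + width
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]


def haar_two_rect(ii, top, left, height, width):
    """Two-rectangle Haar-like feature: left half minus right half."""
    half = width // 2
    left_sum = rect_sum(ii, top, left, height, half)
    right_sum = rect_sum(ii, top, left + half, height, half)
    return left_sum - right_sum


if __name__ == "__main__":
    img = np.arange(16).reshape(4, 4)
    ii = padded_integral(img)
    print(rect_sum(ii, 1, 1, 2, 2))       # 30
    print(haar_two_rect(ii, 0, 0, 4, 4))
```

The cost of each rectangle sum is constant regardless of the rectangle size, so a feature costs the same at every scale.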
Evaluation (testing), from Viola, Jones, 2001. [Figure: integral image value ii(x,y) at location (x,y)]
Boosting as a Tree-structured Classifier
Boosting (very shallow network): The strong classifier H, as boosted decision stumps, has a flat structure. Cf. decision "ferns" have been shown to outperform "trees" [Zisserman et al, 07] [Fua et al, 07].
Boosting (continued)
A strong boosting classifier: Good generalisation is achieved by a flat structure. It provides fast evaluation. It does sequential optimisation.
Boosting cascade [Viola & Jones 04], boosting chain [Xiao et al]: It has a very imbalanced tree structure. It speeds up evaluation by rejecting easy negative samples at early stages. It is hard to design.
[Figure: a strong boosting classifier; T = 2, 5, 10, 20, 50, 100, …]
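The early-rejection idea can be sketched as a loop over stages that stops at the first stage whose score falls below its threshold. The stage representation (a list of `(score_fn, threshold)` pairs), the function name and the toy stages are assumptions for illustration.

```python
def cascade_classify(x, stages):
    """Evaluate stages in order; reject as soon as one stage fails.

    stages: list of (score_fn, threshold) pairs (hypothetical layout).
    Returns (accepted, number_of_stages_evaluated).
    """
    evaluated = 0
    for score_fn, threshold in stages:
        evaluated += 1
        if score_fn(x) < threshold:
            return False, evaluated  # easy negative: stop early
    return True, evaluated


# Toy stages: a cheap first test followed by a stricter second test.
stages = [
    (lambda x: x[0], 0.5),
    (lambda x: x[0] + x[1], 1.5),
]

if __name__ == "__main__":
    print(cascade_classify((0.1, 5.0), stages))  # (False, 1)
    print(cascade_classify((0.9, 0.9), stages))  # (True, 2)
```

Most windows in an image are negatives, so most of them exit after the first, cheapest stage.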
A cascade of classifiers: The detection system requires a good detection rate and extremely low false positive rates. The overall false positive rate and detection rate are $F = \prod_i f_i$ and $D = \prod_i d_i$, where f_i is the false positive rate of the i-th classifier on the examples that get through to it and d_i is its detection rate on those examples. The expected number of features evaluated depends on p_j, the proportion of windows input to the j-th classifier.
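A small numerical sketch of these quantities, assuming the usual product form for the overall rates (overall rate = product of the per-stage rates) and assuming the expected feature count is the sum over stages of n_j x p_j, where n_j is the number of features in stage j. The symbol n_j, the function names and the example numbers are assumptions, not values from the slides.

```python
from math import prod


def cascade_rates(f, d):
    """Overall false positive rate and detection rate of a cascade,
    assuming each is the product of the per-stage rates."""
    return prod(f), prod(d)


def expected_features(n, p):
    """Expected features evaluated per window, assuming sum_j n_j * p_j.

    n[j]: features in stage j (hypothetical symbol);
    p[j]: proportion of windows that reach stage j.
    """
    return sum(nj * pj for nj, pj in zip(n, p))


if __name__ == "__main__":
    # Ten stages, each passing half of the negatives and 99% of the positives.
    F, D = cascade_rates([0.5] * 10, [0.99] * 10)
    print(F, D)
    # Three stages with 2, 10 and 50 features.
    print(expected_features([2, 10, 50], [1.0, 0.5, 0.1]))
```

The example illustrates why modest per-stage false positive rates are enough: they multiply down quickly, while the per-stage detection rates must stay very close to 1.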