
Notes on HW 1 grading
I gave full credit as long as you gave a description, a confusion matrix, and working code.
Many people's descriptions were quite short and lacked information about the experiment. I did not take off points for this, but I will be more specific about requirements next time.
I took off points for non-working code, an incorrect confusion matrix, and/or an incorrect definition of accuracy.

Homework 3: SVMs and Feature Selection
Data:
–Create a subset of the data that has equal numbers of positive and negative examples.
–Put the data into the format needed for the SVM package you're using.
–Split the data into approximately ½ training and ½ test (each should have equal numbers of positive and negative examples).
–Scale the training data using standardization (as in HW 2).
–Scale the test data using the standardization parameters from the training data.
–Randomly shuffle the training data.
–Randomly shuffle the test data. Use the test data only for testing, at the end of cross-validation or feature selection. (A sketch of these steps follows.)
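Below is a minimal Python sketch of these preparation steps, not the official solution; it assumes the spambase features and +1/−1 labels have already been loaded into NumPy arrays X and y (hypothetical names).

```python
# Sketch of the HW 3 data preparation: balance classes, split 50/50,
# standardize with training-set parameters, then shuffle.
import numpy as np

rng = np.random.default_rng(0)

def prepare_data(X, y):
    # Balance: keep equal numbers of positive and negative examples.
    pos = rng.permutation(np.flatnonzero(y == 1))
    neg = rng.permutation(np.flatnonzero(y == -1))
    n = min(len(pos), len(neg))
    pos, neg = pos[:n], neg[:n]

    # Split ~1/2 train, 1/2 test, balanced within each half.
    train = np.concatenate([pos[: n // 2], neg[: n // 2]])
    test = np.concatenate([pos[n // 2 :], neg[n // 2 :]])
    X_tr, y_tr, X_te, y_te = X[train], y[train], X[test], y[test]

    # Standardize using statistics of the training data only.
    mu, sigma = X_tr.mean(axis=0), X_tr.std(axis=0)
    sigma[sigma == 0] = 1.0              # guard against constant features
    X_tr, X_te = (X_tr - mu) / sigma, (X_te - mu) / sigma

    # Randomly shuffle both sets.
    p, q = rng.permutation(len(y_tr)), rng.permutation(len(y_te))
    return X_tr[p], y_tr[p], X_te[q], y_te[q]
```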

Experiment 1: Cross-validation using a linear SVM to find the best value of the C parameter
–Use SVM light or any other SVM package.
–Use a linear kernel (the default in SVM light).
–Use 10-fold cross-validation to test values of the C parameter in {0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1}:
Split the training set into 10 approximately equal-sized disjoint sets S_i.
For each value j of the C parameter:
  For i = 1 to 10:
    Select S_i to be the "validation" set.
    Train a linear SVM using C = j and all training data except S_i.
    Test the learned model on S_i to get accuracy A_{j,i}.
  Compute the average accuracy for C = j: A_j = (1/10) Σ_{i=1}^{10} A_{j,i}.
Choose the value C = C* that gives the highest A_j.
–Train a new linear SVM using all the training data with C = C*.
–Test the learned SVM model on the test data. Report accuracy, precision, and recall (using threshold 0 to determine positive and negative classifications).
–Use the results on the test data to create an ROC curve for this SVM, using about 200 evenly spaced thresholds. (A sketch follows.)
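The assignment uses SVM light, but as an illustration here is a sketch of the cross-validation loop and the ROC sweep using scikit-learn's LinearSVC as a stand-in (my substitution; note that scikit-learn requires C > 0, so 0 is replaced by a small value).

```python
# Sketch of Experiment 1: pick C by 10-fold cross-validation, then sweep
# ~200 thresholds over the decision values to trace an ROC curve.
import numpy as np
from sklearn.svm import LinearSVC

def choose_C(X_tr, y_tr,
             C_values=(0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0)):
    folds = np.array_split(np.arange(len(y_tr)), 10)   # 10 disjoint sets S_i
    best_C, best_acc = None, -1.0
    for C in C_values:
        accs = []
        for i in range(10):
            val = folds[i]
            trn = np.concatenate([folds[k] for k in range(10) if k != i])
            clf = LinearSVC(C=C).fit(X_tr[trn], y_tr[trn])
            accs.append(clf.score(X_tr[val], y_tr[val]))   # A_{j,i}
        avg = np.mean(accs)                                # A_j
        if avg > best_acc:
            best_C, best_acc = C, avg
    return best_C

def roc_points(scores, y_te, n_thresholds=200):
    # scores = clf.decision_function(X_te) for the final model.
    pts = []
    for t in np.linspace(scores.min(), scores.max(), n_thresholds):
        pred = np.where(scores >= t, 1, -1)
        tpr = np.mean(pred[y_te == 1] == 1)    # true positive rate
        fpr = np.mean(pred[y_te == -1] == 1)   # false positive rate
        pts.append((fpr, tpr))
    return pts
```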

Experiment 2: Feature selection with a linear SVM
–Using the final SVM model from Experiment 1, obtain the weight vector w (for SVM light, see its documentation).
–Select features:
  For m = 1 to 57:
    Select the set of m features that have the highest |w_m|.
    Train a linear SVM, SVM_m, on all the training data, using only these m features and C* from Experiment 1.
    Test SVM_m on the test set to obtain accuracy.
–Plot accuracy vs. m. (See the sketch after this slide.)
Experiment 3: Random feature selection (baseline)
–Same as Experiment 2, but for each m, select m features at random from the complete set. This is to see whether using SVM weights for feature selection has any advantage over random selection.
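A sketch of both experiments, using the same scikit-learn stand-in as above (variable and function names are mine):

```python
# Sketch of Experiments 2 and 3: accuracy vs. number of features m, with
# features chosen either by |w_m| (Experiment 2) or at random (Experiment 3).
import numpy as np
from sklearn.svm import LinearSVC

def accuracy_vs_m(X_tr, y_tr, X_te, y_te, C_star, random_baseline=False):
    w = LinearSVC(C=C_star).fit(X_tr, y_tr).coef_.ravel()  # weight vector w
    order = np.argsort(-np.abs(w))          # features ranked by |w_m|
    rng = np.random.default_rng(0)
    accs = []
    for m in range(1, X_tr.shape[1] + 1):   # m = 1..57 for spambase
        feats = (rng.choice(len(w), m, replace=False) if random_baseline
                 else order[:m])
        clf = LinearSVC(C=C_star).fit(X_tr[:, feats], y_tr)
        accs.append(clf.score(X_te[:, feats], y_te))
    return accs                              # plot against m
```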

What to include in your report
1. Experiment 1:
–Which SVM package you used
–The C* (best value of C) you obtained
–Accuracy, precision, and recall on the test data, using the final learned model
–ROC curve
2. Experiment 2:
–Plot of accuracy (on test data) vs. m (number of features)
–Discussion of what the top 5 features were (see databases/spambase/spambase.names for details of the features)
–Discussion of the effects of feature selection (about 1 paragraph)
3. Experiment 3:
–Plot of accuracy (on test data) vs. m (number of features)
–Discussion of the results of random feature selection vs. SVM-weighted feature selection (about 1 paragraph)

What code to turn in
–Your code / script for performing cross-validation in Experiment 1
–Your code for creating an ROC curve in Experiment 1
–Your code / script for performing SVM-weighted feature selection in Experiment 2
–Your code / script for performing random feature selection in Experiment 3

SVMs, Feature Selection, ROC curves Demo

Quiz on Thursday, Feb. 5
Topics: feature selection, ensemble learning
You will need a calculator!
Quiz review

Adaboost, Part 2

Recap of the Adaboost algorithm
Given S = {(x_1, y_1), ..., (x_N, y_N)}, where x_i ∈ X and y_i ∈ {+1, −1}.
Initialize w_1(i) = 1/N (uniform distribution over the data).
For t = 1, ..., K:
1. Select a new training set S_t from S by sampling with replacement, according to w_t.
2. Train L on S_t to obtain hypothesis h_t.
3. Compute the training error ε_t of h_t on S:
   ε_t = Σ_{i=1}^{N} w_t(i) · I(h_t(x_i) ≠ y_i).
   If ε_t > 0.5, abandon h_t and go to step 1.

4. Compute the coefficient:
   α_t = ½ ln((1 − ε_t) / ε_t).
5. Compute new weights on the data. For i = 1 to N:
   w_{t+1}(i) = w_t(i) · exp(−α_t · y_i · h_t(x_i)) / Z_t,
where Z_t is a normalization factor chosen so that w_{t+1} will be a probability distribution:
   Z_t = Σ_{i=1}^{N} w_t(i) · exp(−α_t · y_i · h_t(x_i)).
At the end of K iterations of this algorithm, we have h_1, h_2, ..., h_K and α_1, α_2, ..., α_K.
Ensemble classifier:
   H(x) = sgn( Σ_{t=1}^{K} α_t · h_t(x) ).
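A minimal Python sketch of the resampling variant described above; `make_learner` is an assumed factory for any scikit-learn-style weak learner (e.g., a depth-1 decision tree), and labels are +1/−1.

```python
import numpy as np

def adaboost(X, y, make_learner, K, rng=np.random.default_rng(0)):
    N = len(y)
    w = np.full(N, 1.0 / N)                  # w_1(i) = 1/N
    hs, alphas = [], []
    while len(hs) < K:                       # note: loops while eps_t > 0.5
        idx = rng.choice(N, size=N, replace=True, p=w)   # sample S_t by w_t
        h = make_learner().fit(X[idx], y[idx])           # train L on S_t
        pred = h.predict(X)
        eps = np.sum(w[pred != y])           # training error on all of S
        if eps == 0:                         # perfect h_t: keep it and stop
            hs.append(h); alphas.append(1.0); break
        if eps > 0.5:                        # abandon h_t, go back to step 1
            continue
        alpha = 0.5 * np.log((1 - eps) / eps)
        w = w * np.exp(-alpha * y * pred)
        w /= w.sum()                         # divide by Z_t to renormalize
        hs.append(h); alphas.append(alpha)

    def H(Xq):                               # ensemble classifier
        return np.sign(sum(a * h.predict(Xq) for a, h in zip(alphas, hs)))
    return H
```

For example, `adaboost(X_tr, y_tr, lambda: DecisionTreeClassifier(max_depth=1), K=50)` would boost decision stumps.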

In-class exercises

Case Study of Adaboost #1: the Viola-Jones Face Detection Algorithm
P. Viola and M. J. Jones, Robust real-time face detection. International Journal of Computer Vision, 2004.
The first face-detection algorithm to work well in real time (e.g., on digital cameras).

Training data
Positive examples: faces scaled and aligned to a base resolution of 24 by 24 pixels.
Negative examples: a much larger number of non-face subwindows.

Features
Use rectangle features at multiple sizes and locations in an image subwindow (candidate face).
For each feature f_j, the feature value is the difference between the sums of the pixels in adjacent rectangular regions of the subwindow.
The number of possible features per 24 x 24 pixel subwindow is > 180,000. Need feature selection!

Detecting faces
Given a new image:
–Scan the image using subwindows at all locations and at different scales.
–For each subwindow, compute the features and send them to an ensemble classifier (learned via boosting).
–If the classifier is positive ("face"), then detect a face at this location and scale.
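A sketch of this scan; `classify` stands in for the learned ensemble classifier, and the step and scale-factor values are illustrative assumptions.

```python
import numpy as np

def scan_image(img, classify, base=24, scale_factor=1.25, step=4):
    """Slide subwindows over the image at all locations and scales."""
    detections = []
    scale = 1.0
    while int(base * scale) <= min(img.shape[:2]):
        size = int(base * scale)
        for r in range(0, img.shape[0] - size + 1, step):
            for c in range(0, img.shape[1] - size + 1, step):
                if classify(img[r:r + size, c:c + size]) > 0:  # "face"
                    detections.append((r, c, size))
        scale *= scale_factor
    return detections
```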

Preprocessing: Viola & Jones use a clever pre-processing step, the integral image, that allows the rectangular features to be computed very quickly. (See their paper for a full description.)
They use a variant of AdaBoost to both select a small set of features and train the classifier.
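A sketch of the integral-image trick: after one pass over the image, any rectangle sum takes four array lookups, so a rectangle feature costs a handful of additions regardless of its size. Function names are mine.

```python
import numpy as np

def integral_image(img):
    # ii[r, c] = sum of img[:r, :c]; padded with a zero row and column.
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, r, c, h, w):
    # Sum of the h x w rectangle whose top-left pixel is (r, c).
    return ii[r + h, c + w] - ii[r, c + w] - ii[r + h, c] + ii[r, c]

def two_rect_feature(ii, r, c, h, w):
    # Difference between two horizontally adjacent rectangle sums.
    return rect_sum(ii, r, c, h, w) - rect_sum(ii, r, c + w, h, w)
```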

Base ("weak") classifiers
For each feature f_j, define
   h_j(x) = +1 if p_j · f_j(x) < p_j · θ_j, and −1 otherwise,
where x is a 24 x 24-pixel subwindow of an image, θ_j is the threshold that best separates the data using feature f_j, and p_j (the parity) is either −1 or 1.
Such classifiers are called "decision stumps".
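A brute-force sketch of training one stump on a single feature under a weight distribution (as AdaBoost requires); this is an illustration, not the authors' exact procedure.

```python
import numpy as np

def stump_predict(f, theta, p):
    # f holds the feature values f_j(x) for each example; output is +1/-1.
    return np.where(p * f < p * theta, 1, -1)

def train_stump(f, y, w):
    # Try every observed feature value as a threshold, with both parities.
    best = (None, None, np.inf)
    for theta in np.unique(f):
        for p in (-1, 1):
            err = np.sum(w[stump_predict(f, theta, p) != y])  # weighted error
            if err < best[2]:
                best = (theta, p, err)
    return best   # (theta_j, p_j, weighted error)
```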

Boosting algorithm

Note that only the top T features are used.

Viola-Jones "attentional cascade" algorithm
Increases performance while reducing computation time.
Main idea:
–The vast majority of sub-windows are negative.
–Simple classifiers are used to reject the majority of sub-windows before more complex classifiers are applied.
–See their paper for details.
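The idea in code form, a sketch only; `stages` is an assumed list of (classifier, threshold) pairs ordered from cheapest to most complex.

```python
def cascade_classify(x, stages):
    # Reject a subwindow as soon as any stage scores it below threshold,
    # so the cheap early stages discard most negatives quickly.
    for classifier, threshold in stages:
        if classifier(x) < threshold:
            return -1        # non-face: stop immediately
    return 1                 # passed every stage: face
```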

(Cascade illustration from Wikipedia.)

Problems with boosting
Sensitive to noise and outliers: boosting focuses on the examples that are hard to classify.
However, this can also be a way to identify outliers or noise in a dataset.