Thomas Berg and Peter Belhumeur

Presentation transcript:

“POOF: Part-Based One-vs-One Features for Fine-Grained Categorization, Face Verification, and Attribute Estimation”
Thomas Berg and Peter Belhumeur, CVPR 2013
VGG Reading Group, 4.7.2013, Eric Sommerlade

Summary
A POOF is a scalar defined
- for a discriminative region between two classes and two landmarks
- for a set of base features (e.g. HOG or colour histogram)
Perks:
- Regions are learned automatically from the data set
- Great performance
- Transfers knowledge from external datasets

Motivation:
Standard approach to part-based recognition:
- extract a standard feature (SIFT, HOG, LBP)
- train a classifier
- relevant regions are tuned by hand
Idea: “standard” features are hardly optimal for a specific problem. “Best” depends on
- domain (dog features != bird features)
- task (face recognition != gender classification)

POOF feature learning: From dataset with landmark annotations

POOF feature learning:
- Choose feature part f
- Choose alignment part a
- Align and crop to a 128x64 region
- Larger/shorter distance -> coarser/finer scale

POOF feature learning:
Cell scales: 8x8 and 16x16 pixels -> 8*16 + 4*8 = 160 cells

POOF feature learning:
Per cell:
- 8-bin gradient direction histogram, Dg=8 (‘gradhist’), or Felzenszwalb HOG, Dg=31
- Color histogram, Dc=32
Concatenated length: (Dg+Dc)*160
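The cell count and descriptor length follow directly from the numbers on the two slides above. The sketch below just redoes that arithmetic; the function names are illustrative and not from the paper, and it assumes non-overlapping square cells tiling the 128x64 crop.

```python
def cell_count(height, width, cell):
    """Number of non-overlapping cell x cell squares tiling a height x width crop."""
    return (height // cell) * (width // cell)


def descriptor_length(d_grad, d_color, height=128, width=64, cell_sizes=(8, 16)):
    """Return (total cells over all scales, concatenated feature length)."""
    n_cells = sum(cell_count(height, width, c) for c in cell_sizes)
    return n_cells, (d_grad + d_color) * n_cells


# 'gradhist' base feature: Dg = 8, Dc = 32
n_cells, gradhist_len = descriptor_length(d_grad=8, d_color=32)
# Felzenszwalb HOG base feature: Dg = 31, Dc = 32
_, hog_len = descriptor_length(d_grad=31, d_color=32)

print(n_cells, gradhist_len, hog_len)  # 160 6400 10080
```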

POOF feature learning:
For each scale (8x8, 16x16):
- learn a linear SVM, get weights w
- keep max abs(w_i) per cell
- keep cells with max(w_c) >= median(max(w_c))
- keep the connected component (4-connected?) starting at f
[Figure: weight vector W grouped by cells c1, c2, ..., cn; per-cell max; thresholded cells]
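A minimal sketch of this cell selection for one scale, under stated assumptions: the SVM weights are already grouped per cell in row-major order, the seed is the grid cell containing the feature part f, connectivity is 4-neighbour (the slide itself marks this with a question mark), and an empty mask is returned if the seed cell falls below the threshold. The function name and data layout are hypothetical.

```python
from collections import deque
from statistics import median


def select_cells(cell_weights, rows, cols, seed):
    """Return a rows x cols boolean mask of the cells kept for one scale.

    cell_weights: one list of SVM weights per cell, row-major order.
    seed: (row, col) of the cell containing feature part f (assumption).
    """
    # Strength of a cell = largest absolute SVM weight inside it.
    strength = [max(abs(w) for w in cell) for cell in cell_weights]
    threshold = median(strength)
    strong = [[strength[r * cols + c] >= threshold for c in range(cols)]
              for r in range(rows)]

    mask = [[False] * cols for _ in range(rows)]
    if not strong[seed[0]][seed[1]]:
        return mask  # assumption: nothing is kept if the seed cell is weak

    # Breadth-first flood fill over 4-connected strong cells.
    mask[seed[0]][seed[1]] = True
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and strong[nr][nc] and not mask[nr][nc]:
                mask[nr][nc] = True
                queue.append((nr, nc))
    return mask


# Toy 3x3 grid with two weights per cell.
toy_weights = [
    [0.9, -0.2], [-0.8, 0.3], [0.1, 0.0],
    [0.05, -0.1], [0.2, -0.7], [0.1, 0.1],
    [-0.6, 0.1], [0.0, 0.1], [0.5, -0.3],
]
toy_mask = select_cells(toy_weights, rows=3, cols=3, seed=(0, 0))
for row in toy_mask:
    print(row)
```

In the toy grid, the two bottom-row cells pass the median threshold but are not connected to the seed, so they are dropped from the mask.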

POOF feature learning:
- Retrain the SVM on the selected cells only
- Result: the POOF (bitmap + SVM weight vector)

POOF feature extraction:
- Find the corresponding landmarks (the authors use Belhumeur et al., CVPR 2011)
- Align and crop to a 128x64 region
- Get the base features
- Get the SVM score from the features in the masked region
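The last extraction step can be sketched as a masked linear score. This assumes the base features and the retrained SVM weights are both stored per cell in the same order and that the mask is flattened to one boolean per cell; the bias term and all names here are illustrative, not taken from the paper.

```python
def poof_score(cell_features, mask, cell_weights, bias=0.0):
    """Linear SVM score computed only over the cells selected by the mask."""
    score = bias
    for feats, keep, weights in zip(cell_features, mask, cell_weights):
        if keep:
            score += sum(x * w for x, w in zip(feats, weights))
    return score


# Three cells with two features each; the middle cell is masked out,
# so its (large) weights do not influence the score.
features = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
mask = [True, False, True]
weights = [[0.5, -0.5], [9.0, 9.0], [1.0, 0.0]]

score = poof_score(features, mask, weights, bias=0.25)
print(score)  # 4.75
```

The resulting scalar is the value of one POOF for one image; many such scalars, one per learned POOF, would be stacked into the feature vector used in the experiments.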

Results: categorization
- UCSD birds dataset, 200 classes
- 13 landmarks used
- About 5 million POOF combinations possible
- Randomly chosen subset of 5000 POOFs
- Used as the feature vector in a one-vs-all linear SVM
- Evaluation on the ground-truth (gt) bounding box of the object, with gt landmarks or detected landmarks

Results: categorization

Results: categorization
Columns: gradhist | HOG | low-level baseline | [27] | [4] (MKL) | [33] (RF) | [32] | [8] | [35]
200det: 54 56 28
14det: 65 70 57
200gt: 69 73 40 17 19
14gt: 80 85 44
5det: 55

Results: Face Verification
Are two images of the same person?
- LFW dataset
- 16 landmarks, 120 subjects -> ~3.5 million POOF choices
- Each image yields 10000 random POOFs, f(I)
- For an image pair (I, J), concatenate [|f(I)-f(J)|, f(I).*f(J)] (element-wise product)
- Train a same-vs-different classifier
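A small sketch of the two computations on this slide. The counting formula is an assumption: unordered class pairs, ordered pairs of distinct landmarks (feature part, alignment part), and two base features. With 120 subjects and 16 landmarks it lands close to the ~3.5 million quoted here. The pair feature follows the slide's expression; function names are illustrative.

```python
from math import comb


def poof_choices(n_classes, n_landmarks, n_base_features=2):
    """Assumed count: class pairs x ordered (f, a) landmark pairs x base features."""
    return comb(n_classes, 2) * n_landmarks * (n_landmarks - 1) * n_base_features


def pair_feature(f_i, f_j):
    """Concatenate |f(I) - f(J)| and the element-wise product f(I) .* f(J)."""
    abs_diff = [abs(a - b) for a, b in zip(f_i, f_j)]
    product = [a * b for a, b in zip(f_i, f_j)]
    return abs_diff + product


print(poof_choices(120, 16))  # 3427200

# Toy POOF vectors of length 3 (the slide uses 10000 POOFs per image).
pair = pair_feature([1.0, -2.0, 0.5], [0.5, 1.0, 0.5])
print(pair)  # [0.5, 3.0, 0.0, 0.5, -2.0, 0.25]
```

Both halves of the pair feature are symmetric in I and J, so the same-vs-different classifier sees the same input regardless of the order in which the two images are presented.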

Results: Face Verification

Results: Face Verification
Performance equal to Tom-vs-Pete (BMVC 2012)
But:
- Support regions are learned automatically
- Linear SVM, not RBF -> faster
Uses the same “identity preserving alignment” on landmark detections [2]
[Figure from [2]; panel labels: input, affine, canonical, mean of all closest in dataset]

Results: Attribute classification
- Attributes such as gender, “big nose”, “eyeglasses” (Kumar [14])
- POOFs learned as before, on the LFW dataset
- Extract POOFs from the attribute dataset
- Train a linear SVM for each attribute
- POOFs transfer discriminability from different classes -> no need for a fully labelled attribute dataset

Results: Attribute classification
- Restricted number of attribute samples
- POOF features don’t latch on to noise …