Learning Semantics with Less Supervision

Name: Learning Semantics with Less Supervision
Uploaded: 2017-12-09T18:12:02+00:00
Duration: PTM12S22
Channel: Christopher Mansell
Description: Learning Semantics with Less Supervision

Agenda
Beyond Fixed Keypoints
Beyond Keypoints
Open Discussion

Part Discovery from Partial Correspondence
[Subhransu Maji and Gregory Shakhnarovich, CVPR 2013]

Keypoints in diverse categories
Where are the keypoints? Can you name them?

Does the name of a keypoint matter?
We can mark correspondences without naming parts
[Maji and Shakhnarovich, HCOMP 2012]

Annotation interface on MTurk
Example landmarks are provided:

Example annotations
Annotators mark 5 landmark pairs on average

Are the landmarks consistent across annotators?
Yes

Semantic part discovery
Given a window in the first image we can find the corresponding window in the second image

propagate correspondence in the “semantic graph”

Discover parts using breadth-first traversal
[Figure panels: Iter 0, Iter 1, Iter 2]
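
The breadth-first traversal can be sketched as follows. This is only an illustration, assuming the semantic graph is stored as an adjacency list in which an edge means two images share annotated landmark pairs. The graph contents and the function name `bfs_layers` are hypothetical and do not come from the slides.

```python
from collections import deque

def bfs_layers(graph, seed):
    """Group the nodes reachable from seed by BFS iteration (Iter 0, Iter 1, ...)."""
    visited = {seed}
    layers = [[seed]]
    frontier = deque([seed])
    while frontier:
        next_layer = []
        # Process exactly the nodes that belong to the current iteration.
        for _ in range(len(frontier)):
            node = frontier.popleft()
            for neighbor in graph.get(node, []):
                if neighbor not in visited:
                    visited.add(neighbor)
                    next_layer.append(neighbor)
                    frontier.append(neighbor)
        if next_layer:
            layers.append(next_layer)
    return layers

# Hypothetical graph: an edge means two images share annotated landmark pairs.
graph = {
    "img0": ["img1", "img2"],
    "img1": ["img0", "img3"],
    "img2": ["img0"],
    "img3": ["img1"],
}
layers = bfs_layers(graph, "img0")
```

Each returned layer holds the images first reached at that iteration, so a part window seeded in `img0` would be propagated to `img1` and `img2` before `img3`.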

The semantic graph alone is not good enough
Trained using latent LDA (scale, translation, membership)
[Figure panels: graph only; graph + appearance]

Examples of learned parts

Part-based representation
[Figure labels: image; other activations on the training set]

Detecting church buildings: individual parts
graph mining better seeds

Detecting church buildings: collection of parts
Detection is challenging due to structural variability
Latent LDA parts + voting: AP = 39.9%
DPM: AP = 34.7%

Label Transfer
Ask users to label parts where it makes sense: arch, tower, window
Transfer the labels to test images

Agenda
Beyond Fixed Keypoints
Beyond Keypoints
Open Discussion

Unsupervised Discovery of Mid-Level Discriminative Patches
[Saurabh Singh, Abhinav Gupta and Alexei Efros, ECCV 2012]

Can we get nice parts without supervision?
Idea 0: K-means clustering in HOG space

Still not good enough
The SVM memorizes bad examples and still scores them highly
However, the space of bad examples is much more diverse
So we can avoid overfitting if we train on a training subset but look for patches on a validation subset

Why does K-means on HOG fail?
Chicken & Egg Problem
If we know that a set of patches is visually similar, we can easily learn a distance metric for them
If we know the distance metric, we can easily find other members

Idea 1: Discriminative Clustering
Start with K-Means
Train a discriminative classifier for the distance function, using all other classes as negative examples
Re-assign patches to clusters whose classifier gives highest score
Repeat
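
A rough sketch of this loop, under stated assumptions: the initial assignment stands in for a K-means result, and the classifier is a simple difference-of-means linear scorer used here as a stand-in for an SVM. All function names and the toy data are illustrative, not from the slides.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mean(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def train_linear(pos, neg):
    """Stand-in for a linear SVM: w = mean(pos) - mean(neg), bias at the midpoint."""
    mp, mn = mean(pos), mean(neg)
    w = [a - b for a, b in zip(mp, mn)]
    b = -0.5 * (dot(w, mp) + dot(w, mn))
    return w, b

def discriminative_clustering(points, assign, n_iters=5):
    """assign holds the initial cluster index of each point (e.g. from K-means)."""
    k = max(assign) + 1
    for _ in range(n_iters):
        # One classifier per cluster; members of all other clusters are negatives.
        models = []
        for c in range(k):
            pos = [p for p, a in zip(points, assign) if a == c]
            neg = [p for p, a in zip(points, assign) if a != c]
            models.append(train_linear(pos, neg) if pos and neg else None)
        # Re-assign each point to the cluster whose classifier scores it highest.
        new_assign = []
        for p, a in zip(points, assign):
            scores = [(dot(m[0], p) + m[1], c)
                      for c, m in enumerate(models) if m is not None]
            new_assign.append(max(scores)[1] if scores else a)
        if new_assign == assign:
            break
        assign = new_assign
    return assign

# Toy 2-D "patches": two well-separated groups, with one point mislabeled at the start.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
initial = [0, 0, 1, 1, 1, 1]
final = discriminative_clustering(points, initial)
```

In this toy run the mislabeled point `(1, 0)` moves back to cluster 0 after one re-assignment step.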

Idea 2: Discriminative Clustering+
Start with K-Means or kNN
Train a discriminative classifier for the distance function, using detection
Detect the patches and assign to top k clusters
Repeat

Can we get good parts without supervision?
What makes a good part?
Must occur frequently in one class (representative)
Must not occur frequently in all classes (discriminative)

Discriminative Clustering+

Idea 3: Discriminative Clustering++
Split the discovery dataset into two equal parts (training and validation)
Train on the training subset
Run the trained classifier on the validation set to collect examples
Exchange training and validation sets
Repeat
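
The alternation between the two halves can be sketched as below. This is a toy illustration, not the authors' implementation: `train` and `detect` are hypothetical callables, and the example replaces HOG patches with 1-D numbers, the classifier with a cluster mean, and detection with a nearest-value lookup.

```python
def cross_validated_refinement(members, set_a, set_b, train, detect, n_rounds=4):
    """Alternate between two halves: train on one, collect members from the other."""
    train_set, val_set = set_a, set_b
    for _ in range(n_rounds):
        model = train(members, train_set)
        members = detect(model, val_set)          # new members come from held-out data
        train_set, val_set = val_set, train_set   # exchange the two subsets
    return members

# Toy stand-ins (assumptions): the "classifier" is the mean of the current members,
# and "detection" returns the top_m values closest to that mean.
def train_mean(members, _dataset):
    return sum(members) / len(members)

def detect_nearest(model, dataset, top_m=3):
    return sorted(dataset, key=lambda x: abs(x - model))[:top_m]

set_a = [1.0, 1.2, 0.9, 5.0, 7.5]
set_b = [1.1, 0.8, 1.3, 6.0, 9.0]
seed_members = [1.0, 5.0]   # a noisy initial cluster drawn from set_a
final = cross_validated_refinement(seed_members, set_a, set_b,
                                   train_mean, detect_nearest)
```

Because members are always collected from data the current model was not trained on, the outlier `5.0` in the seed cluster cannot be memorized and drops out after the first round.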

Discriminative Clustering++

Doublets: Discover second-order relationships
Start with high-scoring patches
Find spatial correlations to other (weaker) patches
Rank the potential doublets on the validation set

Doublets

AP on MIT Indoor-67 scene recognition dataset

Blocks that shout: Distinctive Parts for Scene Classification
[Juneja, Vedaldi, Jawahar and Zisserman, CVPR 2013]
[Figure panels: bookstore, buffet, computer room, closet]

Three steps
Seeding (proposing initial parts)
Expansion (learning part detectors)
Selection (identifying good parts)

Step 1: Seeding
Segment the image
Find proposal regions based on “objectness”
Compute HOG features for each

Step 2: Expansion
Train an Exemplar SVM for each seed region [Malisiewicz et al]
Apply it on the validation set to collect more examples
Retrain and repeat

Step 3: Selection
Good parts should occur frequently in a small number of classes but infrequently in the rest
Collect the top 5 parts from each validation image, sort the occurrences of each part by score and keep the top r
Compute the entropy for each part over the class distribution
Retain the lowest-entropy parts
Filter out any parts too similar to others (based on cosine similarity of their SVM weights)
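
The entropy ranking and similarity filter can be sketched as follows. This is a minimal illustration under assumptions: each part is given as per-class counts of its top-r occurrences plus an SVM weight vector, and the `max_similarity` threshold of 0.5, the part names, and the numbers are all hypothetical.

```python
import math

def class_entropy(class_counts):
    """Shannon entropy (bits) of a part's occurrences over classes (total must be > 0)."""
    total = sum(class_counts)
    probs = [c / total for c in class_counts if c > 0]
    return -sum(p * math.log2(p) for p in probs)

def cosine(u, v):
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den

def select_parts(parts, n_keep, max_similarity=0.5):
    """parts: list of (name, class_counts, svm_weights). Returns the kept names."""
    ranked = sorted(parts, key=lambda p: class_entropy(p[1]))  # lowest entropy first
    kept = []
    for name, _counts, w in ranked:
        # Skip a part whose weights are too close to an already selected part.
        if all(cosine(w, kw) <= max_similarity for _n, kw in kept):
            kept.append((name, w))
        if len(kept) == n_keep:
            break
    return [name for name, _w in kept]

# Hypothetical parts: counts of top-r occurrences in three scene classes.
parts = [
    ("shelf",     [9, 1, 0],  [1.0, 0.0, 0.0]),
    ("shelf_dup", [8, 2, 0],  [0.9, 0.1, 0.0]),
    ("wall",      [3, 3, 4],  [0.0, 1.0, 0.0]),
    ("tap",       [0, 10, 0], [0.0, 0.0, 1.0]),
]
top2 = select_parts(parts, n_keep=2)
top3 = select_parts(parts, n_keep=3)
```

Here `shelf_dup` has low entropy but is dropped as a near-duplicate of `shelf`, while `wall` fires almost uniformly across classes and is selected only when three parts are requested.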

Features and learning
Features: explored dense RootSIFT, BoW, LLC, Improved Fisher Vectors
Non-linear SVM (sqrt kernel)
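
The slide does not define the sqrt kernel. One common reading, assumed here, is the Hellinger (square-root) kernel on L1-normalized non-negative histograms, which equals a linear kernel applied after an element-wise square-root map. The sketch below checks that equivalence on made-up histograms; the function names are illustrative.

```python
import math

def hellinger_map(hist):
    """L1-normalize a non-negative histogram, then take element-wise square roots."""
    total = sum(hist)
    return [math.sqrt(x / total) for x in hist]

def hellinger_kernel(x, y):
    """k(x, y) = sum_i sqrt(x_i * y_i) on the L1-normalized histograms."""
    sx, sy = sum(x), sum(y)
    return sum(math.sqrt((a / sx) * (b / sy)) for a, b in zip(x, y))

# Made-up non-negative histograms (e.g. bag-of-words counts).
x = [4.0, 0.0, 12.0]
y = [1.0, 1.0, 2.0]

# A linear kernel on the mapped features reproduces the non-linear kernel.
linear_on_mapped = sum(a * b for a, b in zip(hellinger_map(x), hellinger_map(y)))
```

Under this reading, a linear SVM trained on the mapped features behaves like a kernel SVM with the square-root kernel, which keeps training and detection cheap.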

Results on MIT Indoor-67
 | Singh et al | Juneja et al
Seeding | K-means on HOG | Exemplar SVM
Feature space | HOG | IFV
SVM | Linear | Non-linear
Selection | Purity & discriminativeness (penalizes parts that perform well for multiple clusters) | Entropy rank (allows for parts that work for multiple clusters)
AP on MIT 67 | 49.4 | 61.1

Learning Collections of Parts for Object Recognition
[Endres, Shih, Jiaa and Hoiem, CVPR13]

Overview of the method
Seeding: Random samples including full bounding box and sub-window boxes
Expanding: Exemplar SVM, fast training (using LDA)
Selection: Greedy method; pick parts so that each training example is explained by a part
Appearance Consistency: Include parts that have high SVM score
Spatial Consistency: Prefer parts that come from the same location within the bounding box
Training and Detection: Boosting over Category Independent Object Proposals [Endres & Hoiem]
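
The greedy selection step resembles greedy set cover, and a sketch in that spirit is given below. It is an assumption-laden simplification: each candidate part is reduced to the set of training examples it explains (for instance those it scores highly on), and the appearance and spatial consistency terms are left out. Part names and sets are hypothetical.

```python
def greedy_cover(explains, n_examples, max_parts):
    """explains: dict mapping part name -> set of training example indices it explains."""
    uncovered = set(range(n_examples))
    chosen = []
    while uncovered and len(chosen) < max_parts:
        # Pick the part explaining the most still-unexplained examples
        # (ties broken alphabetically for determinism).
        best = max(sorted(explains), key=lambda p: len(explains[p] & uncovered))
        gain = explains[best] & uncovered
        if not gain:
            break  # no remaining part explains anything new
        chosen.append(best)
        uncovered -= gain
    return chosen, uncovered

# Hypothetical candidate parts and the training examples each one explains.
explains = {
    "head": {0, 1, 2, 3},
    "wheel": {3, 4, 5},
    "torso": {1, 2},
    "tail": {6},
}
chosen, uncovered = greedy_cover(explains, n_examples=7, max_parts=3)
```

The `torso` part is never chosen because every example it explains is already covered by `head`, which is the redundancy the greedy criterion is meant to avoid.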

Results on PASCAL 2010 detection
Averages of patches on the top 15 detections on the validation set for a set of parts

Agenda
Beyond Fixed Keypoints
Beyond Keypoints
Open Discussion

Gender Recognition on Labeled Faces in the Wild
Much easier dataset – no occlusion, high resolution, centered frontal faces
Male or female?
Method | Gender AP
Kumar et al, ICCV 2009 | 95.52
Frontal Face poselet | 96.43
Poselets + Deep Learning | 99.54
[Zhang et al, arXiv: ]

Poselets vs DPMs vs Discriminative Patches
Approach | Parametric | Non-parametric
Speed | Faster (fewer types) | Slower | Slower (many types)
Redundancy | Little | A lot (improves accuracy) | A lot
Spatial model | Sophisticated | Primitive (threshold) | Primitive
Supervision requirements | Needs 2 keypoints | Needs more keypoints (10+) | No supervision
Uses multi-scale signal? | Two scale levels | Yes, multiple scales | Yes
Jointly trained | Yes | No
Attached semantics | Medium

Supervision in parts
[Figure: axis from unsupervised to strongly supervised; labels: DISCRIMINATIVE PATCHES, ISM, DPMs, SIFT, POSELETS]

Questions for open discussion
What is the future for mid-level parts?
More supervision vs less supervision?
Should low-level parts be hard-coded or jointly trained?
Parametric vs non-parametric approaches?
Parts with/without associated semantics?
