Constrained Semi-Supervised Learning Using Attributes and Comparative Attributes. Abhinav Shrivastava, Saurabh Singh, Abhinav Gupta. The Robotics.

Presentation transcript:

Constrained Semi-Supervised Learning Using Attributes and Comparative Attributes. Abhinav Shrivastava, Saurabh Singh, Abhinav Gupta. The Robotics Institute, Carnegie Mellon University.

Supervision: Supervised; Active Learning; Big-Data. [von Ahn and Dabbish, 2004], [Russell et al., IJCV 2009], [Parkash and Parikh, ECCV 2012], [Vijayanarasimhan and Grauman, CVPR 2011], [Kapoor et al., ICCV 2007], [Qi et al., CVPR 2008], [Joshi et al., CVPR 2009], [Siddiquie and Gupta, CVPR 2010]

Supervision: Supervised; Unsupervised; Active Learning. [Russell et al., CVPR 2006], [Kang et al., ICCV 2011]

Supervision: Supervised; Unsupervised; Active Learning; Semi-Supervised. [Zhu, TR, 2005], [Chunsheng Fang, Slides, 2009]

Supervision: Supervised; Unsupervised; Active Learning; Semi-Supervised. Bootstrapping: starting from labeled seed examples (Amphitheatre) and unlabeled data, train models, select candidates, add them to the labeled set, and retrain the models.
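The bootstrapping loop on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the one-dimensional features, the nearest-centroid model, and the names `train`, `predict`, and `bootstrap` are hypothetical stand-ins, not the classifiers used in the talk.

```python
from statistics import mean

def train(labeled):
    """Fit a toy nearest-centroid model: one mean feature value per class."""
    by_class = {}
    for x, y in labeled:
        by_class.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_class.items()}

def predict(model, x):
    """Return (label, confidence); confidence is minus the distance to the nearest centroid."""
    label = min(model, key=lambda y: abs(x - model[y]))
    return label, -abs(x - model[label])

def bootstrap(seeds, unlabeled, iterations=3, per_iter=2):
    """Self-training: repeatedly promote the most confident unlabeled points."""
    labeled, pool = list(seeds), list(unlabeled)
    model = train(labeled)                        # train models on the seed examples
    for _ in range(iterations):
        if not pool:
            break
        # Select candidates: the most confident predictions on the unlabeled pool.
        ranked = sorted(pool, key=lambda x: predict(model, x)[1], reverse=True)
        chosen = ranked[:per_iter]
        # Add to labeled set, trusting the model's own predictions.
        labeled += [(x, predict(model, x)[0]) for x in chosen]
        pool = [x for x in pool if x not in chosen]
        model = train(labeled)                    # retrain models
    return model, labeled

seeds = [(0.0, "amphitheatre"), (10.0, "auditorium")]
model, labeled = bootstrap(seeds, [1.0, 9.0, 2.0, 8.5])
```

Because each round trusts the previous round's predictions, one wrong promotion shifts the centroids and can pull in more errors in later rounds.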

Supervision: Supervised; Unsupervised; Active Learning; Semi-Supervised. Bootstrapping: Labeled Seed Examples (Amphitheatre); Unlabeled Data; Select Candidates; Add to Labeled Set; Retrain Models. At the 25th iteration: Semantic Drift (Amphitheatre + Auditorium). [Curran et al., PACL 2007]

Supervision: Supervised; Unsupervised; Active Learning; Semi-Supervised. Graph-based Methods. [Ebert et al., ECCV 2010], [Fergus et al., NIPS 2009]

Our Approach. Amphitheatre; Auditorium.

Our Approach. Joint Learning (Amphitheatre, Auditorium); Share Data. [Carlson et al., NAACL HLT Workshop on SSL for NLP 2009]

Binary Attributes (BA). Amphitheatre; Auditorium; Banquet Hall; Conference Room. [Farhadi et al., CVPR 2009], [Lampert et al., CVPR 2009]

Binary Attributes (BA). Scenes: Conference Room; Banquet Hall; Auditorium; Amphitheatre. Attributes: Tables and Chairs; Indoor; Large Seating Capacity; Man-made. [Patterson and Hays, CVPR 2012]

Auditorium

Sharing via Dissimilarity. Amphitheatre vs. Auditorium: Has Larger Circular Structures. [Parikh and Grauman, ICCV 2011], [Gupta and Davis, ECCV 2008]

Amphitheatre, Auditorium: ?

Amphitheatre, Auditorium

Dissimilarity: Comparative Attributes. Has Larger Circular Structures. [Parikh and Grauman, ICCV 2011], [Gupta and Davis, ECCV 2008]

Dissimilarity: Comparative Attributes. Has Larger Circular Structures. Features: GIST; RGB (Tiny Image); Line Histogram of Length and Orientation; LAB histogram. [Parikh and Grauman, ICCV 2011], [Gupta and Davis, ECCV 2008], [Hays and Efros, SIGGRAPH 2007], [Oliva and Torralba, 2006], [Torralba et al., PAMI, 2008]

Dissimilarity: Comparative Attributes. Has Larger Circular Structures. Classifier: Boosted Decision Tree [Hoiem et al., IJCV 2007]. [Parikh and Grauman, ICCV 2011], [Gupta and Davis, ECCV 2008]

Comparative Attributes. Is More Open: Amphitheatre > Barn; Amphitheatre > Conference Room; Desert > Barn. Has Taller Structures: Church (Outdoor) > Cemetery; Barn > Cemetery. [Parikh and Grauman, ICCV 2011], [Gupta and Davis, ECCV 2008]
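One way to hold the relationships listed on this slide is a small table of ordered triples. The sketch below assumes each relation reads "the first scene has more of the attribute than the second"; the names `RELATIONS`, `expected_direction`, and `violates`, and the sign convention for the classifier score, are hypothetical.

```python
# (attribute, scene_a, scene_b): scene_a shows the attribute more strongly than scene_b.
RELATIONS = [
    ("is more open", "amphitheatre", "barn"),
    ("is more open", "amphitheatre", "conference room"),
    ("is more open", "desert", "barn"),
    ("has taller structures", "church (outdoor)", "cemetery"),
    ("has taller structures", "barn", "cemetery"),
]

def expected_direction(attribute, scene_a, scene_b):
    """Return +1 if a > b is listed, -1 if b > a is listed, 0 if nothing is listed."""
    if (attribute, scene_a, scene_b) in RELATIONS:
        return 1
    if (attribute, scene_b, scene_a) in RELATIONS:
        return -1
    return 0

def violates(attribute, label_a, label_b, score):
    """score > 0 is assumed to mean the comparative classifier says image a > image b.
    A labeling violates the constraint when the listed direction disagrees with the score."""
    direction = expected_direction(attribute, label_a, label_b)
    return direction != 0 and direction * score < 0

# A pair labeled (barn, amphitheatre) whose classifier says "first is more open" is suspect.
suspect = violates("is more open", "barn", "amphitheatre", score=0.7)
```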

Bootstrapping. Labeled Seed Examples (Amphitheatre, Auditorium); Selected Candidates (Amphitheatre, Auditorium).

Labeled Seed Examples (Amphitheatre, Auditorium). Bootstrapping: Selected Candidates (Amphitheatre, Auditorium). Our Approach (Constrained Bootstrapping): Selected Candidates (Amphitheatre, Auditorium). Attributes: Indoor; Has Seat Rows. Comparative Attributes: Has Larger Circular Structures.
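To make the contrast with plain bootstrapping concrete, here is a minimal sketch of how attribute constraints could change which label a candidate receives. It is not the formulation from the talk: the additive score, the `weight` parameter, the attribute values in `SCENE_ATTRIBUTES`, and the classifier scores are all assumptions for illustration, and the pairwise comparative-attribute terms are left out.

```python
# Hypothetical class-level knowledge: which binary attributes each scene should have.
SCENE_ATTRIBUTES = {
    "amphitheatre": {"indoor": False, "has seat rows": True},
    "auditorium": {"indoor": True, "has seat rows": True},
}

def constrained_score(scene_scores, attribute_scores, label, weight=1.0):
    """Scene-classifier score plus agreement between the attribute classifiers
    (scores assumed to lie in [-1, 1]) and the attributes expected for `label`."""
    agreement = 0.0
    for name, expected in SCENE_ATTRIBUTES[label].items():
        s = attribute_scores[name]
        agreement += s if expected else -s
    return scene_scores[label] + weight * agreement

def plain_label(scene_scores):
    """Plain bootstrapping: trust the scene classifier alone."""
    return max(scene_scores, key=scene_scores.get)

def constrained_label(scene_scores, attribute_scores):
    """Constrained bootstrapping: the label must also agree with the attributes."""
    return max(SCENE_ATTRIBUTES,
               key=lambda lab: constrained_score(scene_scores, attribute_scores, lab))

# An indoor hall with seat rows that the scene classifier slightly mistakes for an amphitheatre.
scene_scores = {"amphitheatre": 0.6, "auditorium": 0.5}
attribute_scores = {"indoor": 0.9, "has seat rows": 0.8}
before = plain_label(scene_scores)
after = constrained_label(scene_scores, attribute_scores)
```

In this toy case the scene classifier alone would promote the image as an amphitheatre, while the attribute evidence ("indoor") moves it to auditorium.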

Labeled Data (Banquet, Bedroom) and Unlabeled Data. Training: Scene Classifiers (bedroom, banquet hall); Attribute Classifiers (indoor, has grass); Comparative Attribute Classifiers (more space, larger structures). Pairwise Data (has more space, has larger structures). Promoted Instances (Conference Room, Banquet Hall). [Gupta and Davis, ECCV 2008]

Introspection. Conference Room: Seed Examples; Iteration 1; Iteration 40.

Amphitheatre. Seed Images; Bootstrapping; BA Constraints; Our Approach.

Bridge. Seed Images; Bootstrapping; BA Constraints; Our Approach.

Experimental Evaluation

Control Experiments. Number of images (SUN Database):
15 Scene Classes: 2 (seed)
Black-box Binary Attributes (BA): 25 (separate)
Black-box Comparative Attributes (CA): 25 (separate)
Unlabeled Dataset: 18,000
(Distractors): (9,500)
Test Set: 50
SUN Database: [Xiao et al., CVPR 2010]

Control Experiments. Eigen Functions: [Fergus et al., NIPS 2009]

Banquet Hall. Seed Images; Iterations 1, 10, 40.

Co-training (Small Scale). Number of images (SUN Database):
15 Scene Classes: 2 (seed)
Black-box Binary Attributes (BA): 15x2 (seed)
Black-box Comparative Attributes (CA): 15x2 (seed)
Unlabeled Dataset: 18,000
(Distractors): (9,500)
Test Set: 50
SUN Database: [Xiao et al., CVPR 2010]

Bedroom. Seed Images. Bootstrapping: Iteration 1; Iteration 60. Our Approach: Iteration 1; Iteration 60.

Scene Classification. Eigen Functions: [Fergus et al., NIPS 2009]

Co-training (Large Scale). 15 Scene Categories: 25 seed images per category. Unlabeled Set: 1 million images (SUN Database + ImageNet), >95% distractors. Improves 12 out of 15 scene classifiers. SUN Database: [Xiao et al., CVPR 2010]; ImageNet: [Deng et al., CVPR 2009]

Conclusion. Sharing via Dissimilarities: Auditorium vs. Amphitheatre, Has Larger Circular Structures. Constrained Bootstrapping: Labeled Seed Examples (Amphitheatre, Auditorium); Bootstrapping (Amphitheatre, Auditorium); Constrained Bootstrapping.

Unary; Binary

Eigen Functions

Features:
960D GIST
75D RGB (image is resized to 5x5)
30D histogram of line lengths
200D histogram of orientation of lines
784D 3D-histogram Lab color space (14x14x4)
Total
[Hays and Efros, SIGGRAPH 2007], [Oliva and Torralba, 2006], [Torralba et al., PAMI, 2008]
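The listed dimensions can be checked with a little arithmetic: a 5x5 RGB thumbnail gives 5 x 5 x 3 = 75 values, a 14x14x4 histogram gives 784 bins, and if the descriptor is simply the concatenation of all five parts the lengths sum to 2049. The sketch below illustrates this; the block-averaging resize in `tiny_rgb` and the key names in `FEATURE_DIMS` are assumptions, not the pipeline used in the talk.

```python
# Dimensions as listed on the slide; the key names are hypothetical.
FEATURE_DIMS = {
    "gist": 960,
    "rgb_tiny": 75,                 # 5 x 5 pixels x 3 channels
    "line_length_hist": 30,
    "line_orientation_hist": 200,
    "lab_hist": 784,                # 14 x 14 x 4 bins
}

def tiny_rgb(image, size=5):
    """Shrink an H x W image (rows of (r, g, b) tuples) to size x size by block
    averaging and flatten it. Assumes H and W are at least `size`."""
    h, w = len(image), len(image[0])
    feature = []
    for i in range(size):
        rows = range(i * h // size, (i + 1) * h // size)
        for j in range(size):
            cols = range(j * w // size, (j + 1) * w // size)
            n = len(rows) * len(cols)
            for c in range(3):
                feature.append(sum(image[r][q][c] for r in rows for q in cols) / n)
    return feature

# A uniform 10 x 10 test image.
image = [[(255, 0, 128)] * 10 for _ in range(10)]
feature = tiny_rgb(image)
total_dims = sum(FEATURE_DIMS.values())
```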

Categories

Binary Attribute Relationship

Comparative Attribute Relationships

Classifiers. Boosted Decision Trees, from [Hoiem et al., IJCV 2007]. Scene: 20 Trees, 8 Nodes. Binary Attribute: 40 Trees, 8 Nodes. Comparative Attribute: 20 Trees, 4 Nodes; differential features as in [Gupta and Davis, ECCV 2008].
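A common reading of "differential features" for a pair of images is the element-wise difference of their feature vectors, which a pairwise classifier then maps to "first image has more of the attribute" or the reverse. The sketch below builds such a training set; the exact construction in the cited work may differ, and the mirrored negative example and the names `differential` and `pairwise_training_set` are assumptions.

```python
def differential(features_a, features_b):
    """Element-wise difference of two equal-length feature vectors."""
    return [x - y for x, y in zip(features_a, features_b)]

def pairwise_training_set(ordered_pairs):
    """ordered_pairs: (features_a, features_b) where image a has MORE of the attribute.
    Each pair yields a positive example and its mirrored negative example."""
    data = []
    for fa, fb in ordered_pairs:
        data.append((differential(fa, fb), 1))
        data.append((differential(fb, fa), -1))   # same pair, order swapped
    return data

# One toy pair with 2-D features; any binary classifier could be trained on `data`.
data = pairwise_training_set([([3.0, 1.0], [1.0, 1.0])])
```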

Supervision: Supervised; Active Learning. [Torralba et al., PAMI 2008]

Supervision: Supervised; Unsupervised; Active Learning; Semi-Supervised. Graph-based Methods: Train; Bus.

Mutual Exclusion (ME)

Experimental Evaluation. Experiments: Control; Small-scale; Large-scale. Evaluation Metrics: Scene Classification (Mean AP); Purity of Labels. Datasets: SUN Database; ImageNet; 15 Scene Categories.
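The two evaluation metrics can be sketched as follows. The slides do not give formulas, so this assumes the non-interpolated form of average precision and takes "purity" to mean the fraction of promoted instances whose assigned label matches the ground truth; the function names are hypothetical.

```python
def average_precision(ranked_relevance):
    """ranked_relevance: booleans for one class, highest-scoring test image first."""
    hits, total = 0, 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            total += hits / rank          # precision at this recall point
    return total / hits if hits else 0.0

def mean_average_precision(per_class_rankings):
    """Mean AP: average of the per-class average precisions."""
    return sum(average_precision(r) for r in per_class_rankings) / len(per_class_rankings)

def purity(promoted_labels, true_labels):
    """Fraction of promoted (self-labeled) instances that received the correct label."""
    correct = sum(p == t for p, t in zip(promoted_labels, true_labels))
    return correct / len(promoted_labels)

ap = average_precision([True, False, True])
m_ap = mean_average_precision([[True, False, True], [True, True]])
p = purity(["bedroom", "bridge", "bedroom"], ["bedroom", "bedroom", "bedroom"])
```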

Baselines. Bootstrapping (Self-learning): Binary Classifiers; Multi-class Classifiers. Eigen Function based Graph Laplacian.

Scene Classification. Eigen Functions: [Fergus et al., NIPS 2009]

Purity of Labeling. Eigen Functions: [Fergus et al., NIPS 2009]

Banquet Hall. Seed Images; Iterations 1, 40.

Scene Classification: Mean