NEIL: Extracting Visual Knowledge from Web Data Xinlei Chen, Abhinav Shrivastava, Abhinav Gupta Carnegie Mellon University CS381V Visual Recognition

Presentation transcript:

NEIL: Extracting Visual Knowledge from Web Data Xinlei Chen, Abhinav Shrivastava, Abhinav Gupta Carnegie Mellon University CS381V Visual Recognition - Paper Presentation

How to Train Object Detectors? Collect and label training data …. Train detectors …. Time consuming, expensive.

At a Large Scale? 800M+ images daily!

NEIL: Never Ending Image Learner. Running 24 hours a day, 7 days a week. Semantically understand Web images. Build visual knowledge base. Continue to build better classifiers and detectors.

NEIL’s Knowledge Base. Visual Instances Labeled by NEIL: Object categories, Scenes, Attributes. Relationships: Object - Object, Scene - Object, Object - Attribute, Scene - Attribute.

Objects Camry Slide credit: Xinlei Chen

Scenes Parking Lot Raceway Slide credit: Xinlei Chen

Attributes Round Shape Crowded Slide credit: Xinlei Chen

Relationships: Object - Object. Partonomy: Wheel is a part of Car. Taxonomy or Similarity: Corolla is a kind of/looks similar to Car. Slide credit: Xinlei Chen

Relationships: Object - Scene. Car is found in Raceway. Slide credit: Xinlei Chen

Relationships: Object - Attribute. Wheel is/has Round shape. Slide credit: Xinlei Chen

Relationships: Scene - Attribute. Bamboo forest is/has Vertical lines. Slide credit: Xinlei Chen

(0) Seed Images: Desktop Computer, Monitor, Keyboard. 1. No bounding boxes 2. Noise 3. Multiple meanings (polysemy). Slide credit: Xinlei Chen

(0) Seed Images: Desktop Computer, Monitor, Keyboard. (1) Subcategory Discovery: clusters (1), (2), (3) for each of Desktop Computer, Monitor, Keyboard, Television. Slide credit: Xinlei Chen

Exemplar Detectors: Car. Slide credit: Xinlei Chen

Affinity Graph Slide credit: Xinlei Chen
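The slides name exemplar detectors and an affinity graph but do not give the computation. A minimal sketch of the idea, under stated assumptions: each exemplar detector is summarized by a vector of its responses on a shared set of windows, affinity is cosine similarity between those vectors, and subcategories are read off as connected components of the thresholded graph. The cosine measure, the threshold, the function names, and the toy scores are all hypothetical choices for illustration and are not taken from the presentation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length score vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def affinity_matrix(score_vectors):
    """Pairwise affinity between exemplars (assumed measure: cosine)."""
    n = len(score_vectors)
    return [[cosine(score_vectors[i], score_vectors[j]) for j in range(n)]
            for i in range(n)]


def cluster_by_affinity(affinity, threshold=0.8):
    """Label connected components of the graph with edges >= threshold."""
    n = len(affinity)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and affinity[i][j] >= threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Hypothetical responses of four exemplar detectors on four shared windows.
scores = [
    [0.9, 0.8, 0.1, 0.0],
    [0.8, 0.9, 0.2, 0.1],
    [0.1, 0.0, 0.9, 0.8],
    [0.0, 0.2, 0.8, 0.9],
]
labels = cluster_by_affinity(affinity_matrix(scores))
print(labels)
```

In this toy input the first two exemplars fire on the same windows and the last two on different ones, so they fall into two groups, which is the kind of split that would separate visually distinct subcategories or different senses of a polysemous query.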

Polysemy: Falcon. Slide credit: Xinlei Chen

(0) Seed Images: Desktop Computer, Monitor, Keyboard. (1) Subcategory Discovery: clusters (1), (2), (3) for each of Desktop Computer, Monitor, Keyboard, Television. (2) Train Models: Desktop Computer (1), Desktop Computer (2), Desktop Computer (3), …, Monitor (1), … Slide credit: Xinlei Chen

Train Models. Latent SVM: Objects, Attributes; CHOG. Linear SVM: Scenes, Attributes; Color, Texton, HOG, SIFT, GIST. … Your model? Slide credit: Xinlei Chen

(0) Seed Images: Desktop Computer, Monitor, Keyboard. (1) Subcategory Discovery: clusters (1), (2), (3) for each of Desktop Computer, Monitor, Keyboard, Television. (2) Train Models: Desktop Computer (1), Desktop Computer (2), Desktop Computer (3), …, Monitor (1), … (3) Relationship Discovery. Slide credit: Xinlei Chen

Relationship Discovery (Macro Vision). Figure: N Concepts on each axis (e.g. Keyboard, Desktop Computer). Learned relationships: Keyboard is a part of Desktop Computer; Monitor is a part of Desktop Computer; Television looks similar to Monitor. Slide credit: Xinlei Chen
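The slide shows relationships being read off statistics over N concepts, without stating the scoring rule. One plausible sketch, assuming that each image yields a set of detected concept names and that a pair is scored by its co-detection count normalized by the individual counts; the normalization, the 0.7 threshold, and the toy detections are assumptions for illustration only.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_scores(detections):
    """Score each concept pair by normalized co-detection frequency.

    detections: list of sets, one set of detected concept names per image.
    Returns {(a, b): score} with a < b alphabetically and score in (0, 1].
    """
    single = Counter()
    pair = Counter()
    for concepts in detections:
        single.update(concepts)
        for a, b in combinations(sorted(concepts), 2):
            pair[(a, b)] += 1
    return {p: count / math.sqrt(single[p[0]] * single[p[1]])
            for p, count in pair.items()}


# Hypothetical per-image detections.
detections = [
    {"keyboard", "desktop computer", "monitor"},
    {"keyboard", "desktop computer"},
    {"monitor", "desktop computer"},
    {"television"},
    {"keyboard", "desktop computer", "monitor"},
]
scores = cooccurrence_scores(detections)
strong = sorted(p for p, s in scores.items() if s >= 0.7)
print(strong)
```

A score like this only says that two concepts tend to appear together. Deciding whether the relationship is "is a part of", "is found in", or "looks similar to" would need extra evidence, such as relative box positions and sizes, which this sketch does not model.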

(0) Seed Images: Desktop Computer, Monitor, Keyboard. (1) Subcategory Discovery: clusters (1), (2), (3) for each of Desktop Computer, Monitor, Keyboard, Television. (2) Train Models: Desktop Computer (1), Desktop Computer (2), Desktop Computer (3), …, Monitor (1), … (3) Relationship Discovery. Learned relationships: Keyboard is a part of Desktop Computer; Monitor is a part of Desktop Computer; Television looks similar to Monitor. Slide credit: Xinlei Chen

(0) Seed Images: Desktop Computer, Monitor, Keyboard. (1) Subcategory Discovery: clusters (1), (2), (3) for each of Desktop Computer, Monitor, Keyboard, Television. (2) Train Models: Desktop Computer (1), Desktop Computer (2), Desktop Computer (3), …, Monitor (1), … (3) Relationship Discovery. Learned relationships: Keyboard is a part of Desktop Computer; Monitor is a part of Desktop Computer; Television looks similar to Monitor. (4) Add New Instances: Desktop Computer, Monitor, Television. Then back to (2): Retrain Models. Slide credit: Xinlei Chen
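The cycle of adding new instances in step (4) and retraining in step (2) is a bootstrapping loop. The sketch below shows only the control flow of such a loop on toy one-dimensional features. It swaps in a nearest-centroid classifier for the detectors and uses a fixed confidence margin as the acceptance rule; both are stand-ins chosen for brevity, and none of the names or numbers come from the slides.

```python
def centroids(labeled):
    """'Train': compute one mean feature value per label."""
    sums, counts = {}, {}
    for x, y in labeled:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(model, x):
    """Return (label, margin); margin is the gap between the two nearest centroids."""
    dists = sorted((abs(x - c), y) for y, c in model.items())
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else float("inf")
    return dists[0][1], margin


def bootstrap(seed, pool, min_margin=2.0, max_iters=10):
    """Alternate between retraining and adding confidently labeled instances."""
    labeled, pool = list(seed), list(pool)
    for _ in range(max_iters):
        model = centroids(labeled)                      # (re)train models
        scored = [(x, predict(model, x)) for x in pool]
        added = [(x, label) for x, (label, margin) in scored
                 if margin >= min_margin]               # keep confident ones only
        if not added:
            break                                       # nothing new: stop
        labeled.extend(added)                           # add new instances
        taken = {x for x, _ in added}
        pool = [x for x in pool if x not in taken]
    return centroids(labeled), labeled, pool


# Hypothetical seed instances and unlabeled pool.
seed = [(0.0, "monitor"), (10.0, "keyboard")]
pool = [1.0, 2.0, 9.0, 8.0, 5.0]
model, labeled, remaining = bootstrap(seed, pool)
print(model, remaining)
```

The ambiguous point at 5.0 is never added because it sits equally far from both centroids. Leaving uncertain instances out is what keeps label noise from compounding across iterations in a loop of this kind.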

Experimental Results: Scene Classification
Dataset: 600 Flickr images, 12 scene categories
Method                                       mAP
Seed Classifier (15 Google Images)           0.52
Bootstrapping (without relationships)        0.54
NEIL Scene Classifiers                       0.57
NEIL (Classifiers + Relationships)           0.62

Experimental Results: Object Detection
Dataset: 1000 Flickr images, 15 object categories
Method                                       mAP
Latent SVM (450 Google Images)               0.28
Latent SVM (450, Aspect Ratio Clustering)    0.30
Latent SVM (450, HOG-based Clustering)       0.33
Seed Detector (NEIL Clustering)              0.44
Bootstrapping (without relationships)        0.45
NEIL Detector                                0.49
NEIL Detector + Relationships                0.51
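Both result slides report mAP (mean average precision) without saying which variant was used. As a reminder of what the number measures, here is a sketch of one common definition: per category, rank test items by score and average the precision at each positive, then average over categories. It assumes every test item is ranked, which fits classification; for detection the sum would normally be divided by the total number of ground-truth objects so that missed objects lower the score. The function names and toy data are illustrative.

```python
def average_precision(scores, labels):
    """Mean of precision@k over the ranks k at which a positive appears."""
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, is_positive) in enumerate(ranked, start=1):
        if is_positive:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(per_category):
    """per_category: {name: (scores, labels)}; returns the mean AP over categories."""
    aps = [average_precision(s, l) for s, l in per_category.values()]
    return sum(aps) / len(aps)


# Hypothetical classifier outputs for two categories.
results = {
    "raceway": ([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0]),
    "parking lot": ([0.9, 0.2], [0, 1]),
}
ap_raceway = average_precision(*results["raceway"])
map_score = mean_average_precision(results)
print(round(ap_raceway, 3), round(map_score, 3))
```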

Examples of Bounding Box Labeling

Examples of extracted common sense relationships

Discussion Points: When do the trained detectors converge? Incorporate language models. How can deep neural networks be utilized for NEIL? Train models on Internet images and test on existing benchmarks, possibly adding domain adaptation to improve performance. Utilize Google extended image search for subcategory discovery.

Related Work Xinlei Chen, Abhinav Shrivastava, Abhinav Gupta. Enriching Visual Knowledge Bases via Object Discovery and Segmentation. CVPR 2014.

Related Work Xinlei Chen, Abhinav Gupta. Webly Supervised Learning of Convolutional Networks. ICCV 2015.

Related Work Santosh K. Divvala, Ali Farhadi, Carlos Guestrin. Learning Everything about Anything: Webly-Supervised Visual Concept Learning. CVPR 2014.

Related Work Carlson, Andrew, Justin Betteridge, Bryan Kisiel, Burr Settles, Estevam R. Hruschka Jr, and Tom M. Mitchell. Toward an Architecture for Never-Ending Language Learning. AAAI (NELL)