Object Recognition: Conceptual Issues Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and K. Grauman.

Slides:



Advertisements
Similar presentations
Semantic Contours from Inverse Detectors Bharath Hariharan et.al. (ICCV-11)
Advertisements

Object class recognition using unsupervised scale-invariant learning Rob Fergus Pietro Perona Andrew Zisserman Oxford University California Institute of.
1 Part 1: Classical Image Classification Methods Kai Yu Dept. of Media Analytics NEC Laboratories America Andrew Ng Computer Science Dept. Stanford University.
Computer Vision for Human-Computer InteractionResearch Group, Universität Karlsruhe (TH) cv:hci Dr. Edgar Seemann 1 Computer Vision: Histograms of Oriented.
Lecture 31: Modern object recognition
Many slides based on P. FelzenszwalbP. Felzenszwalb General object detection with deformable part-based models.
Global spatial layout: spatial pyramid matching Spatial weighting the features Beyond bags of features: Adding spatial information.
Analyzing Semantic Segmentation Using Hybrid Human-Machine CRFs Roozbeh Mottaghi 1, Sanja Fidler 2, Jian Yao 2, Raquel Urtasun 2, Devi Parikh 3 1 UCLA.
More sliding window detection: Discriminative part-based models Many slides based on P. FelzenszwalbP. Felzenszwalb.
Discriminative and generative methods for bags of features
Bag-of-features models Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba.
Beyond bags of features: Part-based models Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba.
Recognition: A machine learning approach
Small Codes and Large Image Databases for Recognition CVPR 2008 Antonio Torralba, MIT Rob Fergus, NYU Yair Weiss, Hebrew University.
Beyond bags of features: Adding spatial information Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba.
Statistical Recognition Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Kristen Grauman.
1 Image Recognition - I. Global appearance patterns Slides by K. Grauman, B. Leibe.
EECS 442 – Computer vision Segments of this lectures are courtesy of Prof F. Li, R. Fergus and A. Zisserman Databases for object recognition and beyond.
Beyond bags of features: Adding spatial information Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba.
Object Recognition: History and Overview Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and Jean Ponce.
Object Recognition by Parts Object recognition started with line segments. - Roberts recognized objects from line segments and junctions. - This led to.
Transferring information using Bayesian priors on object categories Li Fei-Fei 1, Rob Fergus 2, Pietro Perona 1 1 California Institute of Technology, 2.
Bag-of-features models
Lecture 17: Parts-based models and context CS6670: Computer Vision Noah Snavely.
Object Recognition: Conceptual Issues Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and K. Grauman.
Agenda Introduction Bag-of-words models Visual words with spatial location Part-based models Discriminative methods Segmentation and recognition Recognition-based.
Visual Object Recognition Rob Fergus Courant Institute, New York University
Extreme, Non-parametric Object Recognition 80 million tiny images (Torralba et al)
Opportunities of Scale, Part 2 Computer Vision James Hays, Brown Many slides from James Hays, Alyosha Efros, and Derek Hoiem Graphic from Antonio Torralba.
Lecture 29: Recent work in recognition CS4670: Computer Vision Noah Snavely.
Object Recognition by Parts Object recognition started with line segments. - Roberts recognized objects from line segments and junctions. - This led to.
Machine learning & category recognition Cordelia Schmid Jakob Verbeek.
Crash Course on Machine Learning
Salient Object Detection by Composition
Machine learning & category recognition Cordelia Schmid Jakob Verbeek.
Computer Vision CS 776 Spring 2014 Recognition Machine Learning Prof. Alex Berg.
“Secret” of Object Detection Zheng Wu (Summer intern in MSRNE) Sep. 3, 2010 Joint work with Ce Liu (MSRNE) William T. Freeman (MIT) Adam Kalai (MSRNE)
Perceptual and Sensory Augmented Computing Visual Object Recognition Tutorial Visual Object Recognition Bastian Leibe & Computer Vision Laboratory ETH.
Why Categorize in Computer Vision ?. Why Use Categories? People love categories!
Last part: datasets and object collections. CMU/MIT frontal facesvasc.ri.cmu.edu/idb/html/face/frontal_images cbcl.mit.edu/software-datasets/FaceData2.html.
Building local part models for category-level recognition C. Schmid, INRIA Grenoble Joint work with G. Dorko, S. Lazebnik, J. Ponce.
Bag-of-features models. Origin 1: Texture recognition Texture is characterized by the repetition of basic elements or textons For stochastic textures,
A Statistically Selected Part-Based Probabilistic Model for Object Recognition Zhipeng Zhao, Ahmed Elgammal Department of Computer Science, Rutgers, The.
Window-based models for generic object detection Mei-Chen Yeh 04/24/2012.
Reading Between The Lines: Object Localization Using Implicit Cues from Image Tags Sung Ju Hwang and Kristen Grauman University of Texas at Austin Jingnan.
Object Detection with Discriminatively Trained Part Based Models
Lecture 31: Modern recognition CS4670 / 5670: Computer Vision Noah Snavely.
MSRI workshop, January 2005 Object Recognition Collected databases of objects on uniform background (no occlusions, no clutter) Mostly focus on viewpoint.
UNBIASED LOOK AT DATASET BIAS Antonio Torralba Massachusetts Institute of Technology Alexei A. Efros Carnegie Mellon University CVPR 2011.
Bayesian Parameter Estimation Liad Serruya. Agenda Introduction Bayesian decision theory Scale-Invariant Learning Bayesian “One-Shot” Learning.
Visual Categorization With Bags of Keypoints Original Authors: G. Csurka, C.R. Dance, L. Fan, J. Willamowski, C. Bray ECCV Workshop on Statistical Learning.
Epitomic Location Recognition A generative approach for location recognition K. Ni, A. Kannan, A. Criminisi and J. Winn In proc. CVPR Anchorage,
11/26/2015 Copyright G.D. Hager Class 2 - Schedule 1.Optical Illusions 2.Lecture on Object Recognition 3.Group Work 4.Sports Videos 5.Short Lecture on.
Object detection, deep learning, and R-CNNs
CS 1699: Intro to Computer Vision Detection II: Deformable Part Models Prof. Adriana Kovashka University of Pittsburgh November 12, 2015.
Christopher M. Bishop Object Recognition: A Statistical Learning Perspective Microsoft Research, Cambridge Sicily, 2003.
Recognition Using Visual Phrases
Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT Recognizing and Learning Object Categories ICCV 2005 Beijing, Short Course, Oct 15.
Pattern recognition – basic concepts. Sample input attribute, attribute, feature, input variable, independent variable (atribut, rys, příznak, vstupní.
Machine learning & object recognition Cordelia Schmid Jakob Verbeek.
Object detection with deformable part-based models
Paper Presentation: Shape and Matching
Li Fei-Fei, UIUC Rob Fergus, MIT Antonio Torralba, MIT
HOGgles Visualizing Object Detection Features
Outline Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no.
Recognizing and Learning Object categories
“The Truth About Cats And Dogs”
Brief Review of Recognition + Context
Faster R-CNN By Anthony Martinez.
Generic object recognition
Presentation transcript:

Object Recognition: Conceptual Issues Slides adapted from Fei-Fei Li, Rob Fergus, Antonio Torralba, and K. Grauman

Issues in recognition The statistical viewpoint Generative vs. discriminative methods Model representation Generalization, bias vs. variance Supervision Datasets

Object categorization: the statistical viewpoint vs. MAP decision:

Object categorization: the statistical viewpoint vs. Bayes rule: posterior likelihood prior MAP decision:

Object categorization: the statistical viewpoint Discriminative methods: model posterior Generative methods: model likelihood and prior posterior likelihood prior

Discriminative methods Direct modeling of Zebra Non-zebra Decision boundary

Model and Generative methods LowMiddle HighMiddle  Low

Generative vs. discriminative methods Generative methods + Interpretable + Can be learned using images from just a single category – Sometimes we don’t need to model the likelihood when all we want is to make a decision Discriminative methods + Efficient + Often produce better classification rates – Can be hard to interpret – Require positive and negative training data

Steps for statistical recognition Representation –How to model an object category Learning –How to find the parameters of the model, given training data Recognition –How the model is to be used on novel data

Representation –Generative / discriminative / hybrid

Representation –Generative / discriminative / hybrid –Appearance only or location and appearance

Representation –Generative / discriminative / hybrid –Appearance only or location and appearance –Invariances Viewpoint Illumination Occlusion Scale Deformation Clutter etc.

Representation –Generative / discriminative / hybrid –Appearance only or location and appearance –Invariances –Global, sliding window, part- based

Representation –Generative / discriminative / hybrid –Appearance only or location and appearance –Invariances –Global, sliding window, part- based –If part-based, what is the spatial support for parts?

–Unclear how to model categories, so we learn what distinguishes them rather than manually specify the difference -- hence current interest in machine learning Learning

–Unclear how to model categories, so we learn what distinguishes them rather than manually specify the difference -- hence current interest in machine learning) –Methods of training: generative vs. discriminative Learning

–Unclear how to model categories, so we learn what distinguishes them rather than manually specify the difference -- hence current interest in machine learning) –Methods of training: generative vs. discriminative –Generalization, overfitting, bias vs. variance Learning

Generalization How well does a learned model generalize from the data it was trained on to a new test set?

Bias-variance tradeoff Models with too many parameters may fit a given sample better, but have high variance Generalization error is due to overfitting

Bias-variance tradeoff Models with too many parameters may fit a given sample better, but have high variance Generalization error is due to overfitting Models with too few parameters may not fit a given sample well because of high bias Generalization error is due to underfitting 2

Given several models that describe the data equally well, the simpler one should be preferred There should be some tradeoff between error and model complexity –This is rarely done rigorously, but is a powerful “rule of thumb” –Simpler models are often preferred because of their robustness (= low variance) Occam’s razor

–Unclear how to model categories, so we learn what distinguishes them rather than manually specify the difference -- hence current interest in machine learning) –Methods of training: generative vs. discriminative –Generalization, overfitting, bias vs. variance –Level of supervision Manual segmentation; bounding box; image labels; noisy labels Task-dependent Learning Contains a motorbike

Unsupervised “Weakly” supervised Supervised Definition depends on task

What task? Classification –Object present/absent in image –Background may be correlated with object Localization / Detection –Localize object within the frame –Bounding box or pixel- level segmentation

Datasets Circa 2001: 5 categories, 100s of images per category Circa 2004: 101 categories Today: thousands of categories, tens of thousands of images

Caltech 101 & 256 Griffin, Holub, Perona, 2007 Fei-Fei, Fergus, Perona,

The PASCAL Visual Object Classes Challenge ( ) 2008 Challenge classes: Person: person Animal: bird, cat, cow, dog, horse, sheep Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor

The PASCAL Visual Object Classes Challenge ( ) Main competitions –Classification: For each of the twenty classes, predicting presence/absence of an example of that class in the test image –Detection: Predicting the bounding box and label of each object from the twenty target classes in the test image

The PASCAL Visual Object Classes Challenge ( ) “Taster” challenges –Segmentation: Generating pixel-wise segmentations giving the class of the object visible at each pixel, or "background" otherwise –Person layout: Predicting the bounding box and label of each part of a person (head, hands, feet)

Lotus Hill Research Institute image corpus Z.Y. Yao, X. Yang, and S.C. Zhu,

Labeling with games L. von Ahn, L. Dabbish, 2004; L. von Ahn, R. Liu and M. Blum,

Russell, Torralba, Murphy, Freeman, 2008 LabelMe

80 Million Tiny Images

Dataset issues How large is the degree of intra-class variability? How “confusable” are the classes? Is there bias introduced by the background? I.e., can we “cheat” just by looking at the background and not the object?

Caltech-101

Summary Recognition is the “grand challenge” of computer vision History –Geometric methods –Appearance-based methods –Sliding window approaches –Local features –Parts-and-shape approaches –Bag-of-features approaches Issues –Generative vs. discriminative models –Supervised vs. unsupervised methods –Tasks, datasets