Image Recognition - I: Global appearance patterns
Slides by K. Grauman, B. Leibe

Global appearance, windowed detectors: the good things
- Some classes are well captured by a 2D appearance pattern
- Simple detection protocol to implement
- Good feature choices are critical
- Past successes for certain classes
Slides by K. Grauman, B. Leibe

Limitations  High computational complexity  For example: 250,000 locations x 30 orientations x 4 scales = 30,000,000 evaluations!  With so many windows, false positive rate better be low  If training binary detectors independently, means cost increases linearly with number of classes 4Slides by K. Grauman, B. Leibe

Limitations (continued)
- Not all objects are “box” shaped
Slides by K. Grauman, B. Leibe

Limitations (continued)
- Non-rigid, deformable objects are not captured well by representations that assume a fixed 2D structure, or else a fixed viewpoint must be assumed
- Objects with less-regular textures are not captured well by holistic appearance-based descriptions
Slides by K. Grauman, B. Leibe

Limitations (continued)
- If windows are considered in isolation, context is lost
[Figure: a sliding window vs. the detector's view. Figure credit: Derek Hoiem]
Slides by K. Grauman, B. Leibe

Limitations (continued)
- In practice, often entails a large, cropped training set (expensive)
- Requiring a good match to a global appearance description can lead to sensitivity to partial occlusions
Image credit: Adam, Rivlin, & Shimshoni. Slides by K. Grauman, B. Leibe

Models based on local features will alleviate some of these limitations… Slides by K. Grauman, B. Leibe

Recall: Local feature extraction
1. Detect or sample features
2. Describe features
3. Quantize to form a bag-of-words vector for the image (a quantization sketch follows this slide)
Slides by K. Grauman, B. Leibe
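The quantization step can be sketched in a few lines of NumPy. This is a minimal sketch, assuming the local descriptors (e.g., SIFT) and the visual-word codebook (e.g., k-means cluster centers) have already been computed; the function name and the normalization choice are illustrative, not part of the original slides.

```python
import numpy as np

def bag_of_words(descriptors, codebook):
    """Quantize local descriptors into a bag-of-words histogram.

    descriptors: (N, d) local feature descriptors (e.g., SIFT), one per
                 detected or sampled feature.
    codebook:    (K, d) visual words (e.g., k-means cluster centers).
    Returns a K-dimensional, L1-normalized word histogram for the image.
    """
    # Assign each descriptor to its nearest visual word (Euclidean distance).
    dists = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=2)
    words = dists.argmin(axis=1)                        # (N,) word indices

    # Count word occurrences; normalize so images with different numbers
    # of features are comparable.
    hist = np.bincount(words, minlength=codebook.shape[0]).astype(float)
    return hist / max(hist.sum(), 1.0)
```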

Indexing with bags-of-words
Given a query image and a database of frames: measure the similarity of the query to all database items, then rank them.
Slides by K. Grauman, B. Leibe
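A minimal ranking sketch, assuming each database frame has already been reduced to a bag-of-words histogram as above. Cosine similarity is used here as one common choice; histogram intersection or tf-idf weighting are equally valid alternatives, and the function name is illustrative.

```python
import numpy as np

def rank_database(query_hist, database_hists):
    """Rank database frames by similarity to a query bag-of-words histogram.

    query_hist:     (K,) histogram of the query image.
    database_hists: (M, K) histograms, one row per database frame.
    Returns database indices ordered from most to least similar.
    """
    # Cosine similarity between the query and every database histogram.
    q = query_hist / (np.linalg.norm(query_hist) + 1e-12)
    D = database_hists / (np.linalg.norm(database_hists, axis=1, keepdims=True) + 1e-12)
    similarities = D @ q
    return np.argsort(-similarities)   # best matches first
```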

Categorization with bags-of-words
- Let each bag-of-words histogram be a feature vector
- Use images from each class (e.g., violins vs. non-violins) to train a classifier (e.g., an SVM); a minimal sketch follows this slide
Slides by K. Grauman, B. Leibe
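A minimal sketch of this classification step using scikit-learn's LinearSVC, with random stand-in data in place of real violin/non-violin histograms; the vocabulary size, sample counts, and regularization value are placeholders chosen only for illustration.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
K = 200                                    # vocabulary size (placeholder)
X_train = rng.random((100, K))             # stand-in for real BoW histograms
y_train = rng.integers(0, 2, size=100)     # 1 = violin, 0 = non-violin

clf = LinearSVC(C=1.0, max_iter=10_000)    # a linear SVM on the histograms
clf.fit(X_train, y_train)

X_test = rng.random((10, K))               # histograms of new images
print(clf.predict(X_test))                 # predicted class labels
```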

Sampling strategies
- Reliable local feature matches are well suited for recognition of instances (specific objects, scenes). Even a few (sparse) strong matches can be a good indicator for moderately sized databases.
Slides by K. Grauman, B. Leibe

Sampling strategies
- For category-level recognition, we can’t necessarily rely on having such exact feature matches; a sparse selection of features may leave more ambiguity.
Slides by K. Grauman, B. Leibe

Sampling strategies
[Figure panels: dense, uniformly; sparse, at interest points; randomly; multiple interest operators. Image credits: F-F. Li, E. Nowak, J. Sivic]
Some rules of thumb:
- To find specific, textured objects, sparse sampling from interest points is often more reliable.
- Multiple complementary interest operators offer more image coverage.
- For object categorization, dense sampling often offers better coverage (a small sampling sketch follows this slide).
Slides by K. Grauman, B. Leibe
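A small sketch of two of these strategies (dense grid and random sampling); sparse sampling at interest points would instead run a corner or blob detector on the image. The function names and the grid step are illustrative assumptions, not part of the slides.

```python
import numpy as np

def dense_grid_keypoints(height, width, step=8):
    """Sample keypoint locations on a regular grid (dense, uniform sampling)."""
    ys, xs = np.mgrid[step // 2:height:step, step // 2:width:step]
    return np.stack([xs.ravel(), ys.ravel()], axis=1)   # (num_points, 2) as (x, y)

def random_keypoints(height, width, n=500, seed=0):
    """Sample keypoint locations uniformly at random."""
    rng = np.random.default_rng(seed)
    return np.stack([rng.integers(0, width, size=n),
                     rng.integers(0, height, size=n)], axis=1)

# Sparse sampling at interest points would instead come from a detector
# (e.g., a corner or blob detector) run on the image itself.
print(dense_grid_keypoints(480, 640).shape)   # (4800, 2) for a 640x480 image
```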

Categorization with bags-of-words
Bag-of-features representations have been shown to perform well in practice.
Source: Lana Lazebnik

The bag of words removes spatial layout.
Slide by Bill Freeman, MIT

Introducing some loose spatial information
- A representation “in between” orderless bags of words and global appearance: a spatial pyramid of bags-of-words (a construction sketch follows this slide). Lazebnik, Schmid & Ponce, CVPR 2006
Slides by K. Grauman
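A compact sketch of the spatial pyramid histogram, assuming feature locations and their visual-word assignments are already available. For brevity it omits the per-level weighting used in the Lazebnik et al. pyramid match kernel; the function name and default number of levels are illustrative.

```python
import numpy as np

def spatial_pyramid(points, words, K, img_w, img_h, levels=2):
    """Build a spatial pyramid of bag-of-words histograms.

    points: (N, 2) feature locations (x, y); words: (N,) visual-word indices;
    K: vocabulary size.  At level l the image is split into a 2^l x 2^l grid
    and a K-bin word histogram is built per cell; all histograms are
    concatenated (the per-level weighting of the original kernel is omitted).
    """
    hists = []
    for level in range(levels + 1):
        cells = 2 ** level
        cx = np.minimum(points[:, 0] * cells // img_w, cells - 1).astype(int)
        cy = np.minimum(points[:, 1] * cells // img_h, cells - 1).astype(int)
        for i in range(cells):
            for j in range(cells):
                in_cell = (cx == i) & (cy == j)
                hists.append(np.bincount(words[in_cell], minlength=K))
    h = np.concatenate(hists).astype(float)
    return h / max(h.sum(), 1.0)
```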

Introducing some loose spatial information
- Can capture scene categories well: texture-like patterns, but with some variability in the positions of all the local pieces. Lazebnik, Schmid & Ponce, CVPR 2006
Slides by K. Grauman

Introducing some loose spatial information
[Figure: confusion table for scene category classification. Lazebnik, Schmid & Ponce, CVPR 2006]
Slides by K. Grauman

Introducing some loose spatial information
- What will a grid binning of features over the whole image be sensitive to?
Slides by K. Grauman

Part-based models
- Represent a category by common parts and their layout
Slides by K. Grauman

Part-based models: questions
Some categories are well defined by a collection of parts and their relative positions.
1) How to represent, learn, and detect such models? We’ll look at two models:
- Generalized Hough with words (“Implicit Shape Model”)
- Probabilistic generative model of parts & appearance (“Constellation model”)
2) How can we learn these models in the presence of clutter?
Slides by K. Grauman

Implicit shape models
Visual vocabulary is used to index votes for object position [a visual word = “part”].
[Figure: training image, and a visual codeword with its displacement vectors.]
B. Leibe, A. Leonardis, and B. Schiele, Combined Object Categorization and Segmentation with an Implicit Shape Model, ECCV Workshop on Statistical Learning in Computer Vision, 2004

Implicit shape models
Visual vocabulary is used to index votes for object position [a visual word = “part”].
[Figure: test image.]
B. Leibe, A. Leonardis, and B. Schiele, Combined Object Categorization and Segmentation with an Implicit Shape Model, ECCV Workshop on Statistical Learning in Computer Vision, 2004

Implicit shape models: Training
1. Build a vocabulary of patches around extracted interest points using clustering
2. Map the patch around each interest point to the closest word
3. For each word, store all positions where it was found, relative to the object center (a training sketch follows this slide)
Slides by K. Grauman
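A minimal sketch of training step 3, assuming the codebook from step 1 already exists and that each training patch is annotated with the center of the object it came from; the function and variable names are illustrative, not the authors' implementation.

```python
import numpy as np
from collections import defaultdict

def train_ism(patch_descriptors, patch_positions, object_centers, codebook):
    """Sketch of ISM training step 3: store displacements per visual word.

    patch_descriptors: (N, d) descriptors of training patches.
    patch_positions:   (N, 2) patch locations (x, y) in their training images.
    object_centers:    (N, 2) center of the object each patch belongs to.
    codebook:          (K, d) visual words from step 1.
    Returns {word index: list of displacement vectors patch -> object center}.
    """
    word_displacements = defaultdict(list)
    for desc, pos, center in zip(patch_descriptors, patch_positions, object_centers):
        word = int(np.linalg.norm(codebook - desc, axis=1).argmin())   # step 2
        word_displacements[word].append(np.asarray(center) - np.asarray(pos))
    return word_displacements
```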

Implicit shape models: Testing
1. Given a new test image, extract patches and match them to vocabulary words
2. Cast votes for possible positions of the object center
3. Search for maxima in the voting space
4. (Extract a weighted segmentation mask based on the stored masks for the codebook occurrences)
What is the dimension of the Hough space? (A voting sketch follows this slide.)
Slides by K. Grauman
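For voting over position only, the Hough space is two-dimensional (x, y); adding scale, as on the scale-invariant voting slide below, makes it three-dimensional. Below is a minimal 2-D voting sketch that reuses the hypothetical word_displacements structure from the training sketch; the segmentation-mask step and the vote weighting of the actual ISM are omitted, and all names are illustrative.

```python
import numpy as np

def ism_vote(test_descriptors, test_positions, codebook, word_displacements,
             img_shape, bin_size=10):
    """Sketch of ISM testing: Hough voting for the object center in (x, y).

    Each test patch is matched to its nearest visual word; every displacement
    stored for that word casts one vote for a possible object center.
    Returns the (x, y) bin with the most votes as the strongest hypothesis.
    """
    h, w = img_shape
    votes = np.zeros((h // bin_size + 1, w // bin_size + 1))
    for desc, pos in zip(test_descriptors, test_positions):
        word = int(np.linalg.norm(codebook - desc, axis=1).argmin())
        for disp in word_displacements.get(word, []):
            cx, cy = np.asarray(pos) + disp           # predicted object center
            if 0 <= cx < w and 0 <= cy < h:
                votes[int(cy) // bin_size, int(cx) // bin_size] += 1
    by, bx = np.unravel_index(votes.argmax(), votes.shape)
    return bx * bin_size, by * bin_size               # strongest hypothesis
```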

Example: Results on Cows
[Figure sequence: original image; interest points; matched patches; probabilistic votes; 1st hypothesis; 2nd hypothesis; 3rd hypothesis.]
Slides by K. Grauman, B. Leibe

Scale Invariant Voting
- Scale-invariant feature selection: scale-invariant interest points; rescale extracted patches; match to a constant-size codebook
- Generate scale votes: scale becomes the 3rd dimension in the voting space, so the search window is (x, y, s) (a small sketch follows this slide)
Slides by K. Grauman, B. Leibe
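A tiny sketch of how one matched patch could contribute (x, y, s) votes: the patch's detected scale relative to its training scale rescales the stored displacement and is itself voted for as the object scale. The vote weighting used in the actual ISM is omitted, and the names are illustrative assumptions.

```python
import numpy as np

def scale_votes(patch_pos, patch_scale, displacements, train_scale=1.0):
    """Sketch: one matched patch casting (x, y, s) votes.

    The patch's detected scale relative to the training scale rescales the
    stored displacements and also becomes a vote for the object's scale,
    making scale the third dimension of the voting space.
    """
    s = patch_scale / train_scale
    votes = []
    for disp in displacements:
        cx, cy = np.asarray(patch_pos) + s * np.asarray(disp)   # rescaled by s
        votes.append((cx, cy, s))
    return votes
```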

Detection Results
- Qualitative performance: recognizes different kinds of objects
- Robust to clutter, occlusion, noise, low contrast
Slides by K. Grauman, B. Leibe