

1 Representations for object class recognition David Lowe Department of Computer Science University of British Columbia Vancouver, Canada Sept. 21, 2006

2 Object-class recognition: What are we missing?
Are current feature types sufficient? Biology suggests that local features are sufficient:
– Patches, but with arbitrary shapes
– Contours, texture, color, motion, …
– Features can incorporate localization
Next step: learn feature types instead of hand-crafting them.
What else are we missing? Hypothesis: one key is more and better training data.
Hierarchical categories: recognize “mammal”, then “dog”.

3 Simple biologically-motivated system
Image layer: a pyramid of rescaled images, so the model is multiscale from the outset.
S1 layer: compute responses to Gabor filters (4 orientations); at each position/scale there are now 4 types of units.
C1 layer: compute local maxima for each feature type (orientation) and subsample; units now have some position/scale invariance.
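The S1/C1 steps above can be sketched in a few lines. This is an illustrative toy, not the talk's implementation: it uses a single scale (no pyramid), a hand-rolled cosine-phase Gabor kernel, and hypothetical function names and parameters (`gabor_kernel`, `pool`, `stride`); only the structure (oriented filtering, then local max pooling with subsampling) follows the slide.

```python
import numpy as np
from scipy.ndimage import convolve, maximum_filter

def gabor_kernel(size=11, wavelength=5.0, sigma=3.0, theta=0.0):
    """Real (cosine-phase) Gabor kernel at one orientation (assumed form)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    g = np.exp(-(x**2 + y**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / wavelength)
    return g - g.mean()  # zero-mean so flat image regions respond ~0

def s1_layer(image, n_orientations=4):
    """S1: absolute Gabor responses at each of 4 orientations."""
    thetas = [i * np.pi / n_orientations for i in range(n_orientations)]
    return np.stack([np.abs(convolve(image, gabor_kernel(theta=t)))
                     for t in thetas])

def c1_layer(s1, pool=8, stride=4):
    """C1: local max over position, per orientation, then subsample."""
    pooled = maximum_filter(s1, size=(1, pool, pool))
    return pooled[:, ::stride, ::stride]

# Toy usage: a vertical step edge.
img = np.zeros((64, 64))
img[:, 32:] = 1.0
c1 = c1_layer(s1_layer(img))
print(c1.shape)  # (4, 16, 16): 4 orientations, subsampled 4x
```

The max pooling (rather than averaging) is what buys the position/scale tolerance the slide mentions; a full model would also pool across adjacent scales of the pyramid.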

4 Computing features, cont’d
S2 layer: prototypes are patches of C1 units sampled from training images (4,075 of them); note that no clustering is needed.
C2 layer: the maximum response to each S2 feature, taken globally over position, giving a 4,075-dimensional feature vector [r1 r2 … r4075] that is fed to an SVM classifier.
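A minimal sketch of the S2/C2 computation described above, under assumptions not stated on the slide: the S2 response is taken to be a Gaussian of the Euclidean distance between a C1 patch and a stored prototype, and C2 is the global max of that response over positions. The function name `s2_c2`, the `sigma` parameter, and the tiny sizes are all hypothetical; the dense nested loops stand in for what would be a vectorized sliding-window match.

```python
import numpy as np

def s2_c2(c1, prototypes, sigma=1.0):
    """c1: (orient, H, W) C1 map. prototypes: list of (orient, h, w)
    patches sampled from training images. Returns one scalar per
    prototype: the global max of exp(-||patch - proto||^2 / 2*sigma^2)."""
    n_or, H, W = c1.shape
    c2 = []
    for p in prototypes:
        _, h, w = p.shape
        best = -np.inf
        for i in range(H - h + 1):          # slide the prototype densely
            for j in range(W - w + 1):
                d2 = np.sum((c1[:, i:i + h, j:j + w] - p) ** 2)
                best = max(best, np.exp(-d2 / (2 * sigma**2)))
        c2.append(best)
    return np.array(c2)  # feature vector [r1 ... rN] for the SVM

rng = np.random.default_rng(0)
c1 = rng.random((4, 16, 16))
protos = [c1[:, 2:6, 3:7].copy(),   # sampled from the map itself
          rng.random((4, 4, 4))]    # unrelated random prototype
feats = s2_c2(c1, protos)
print(feats)  # first entry is 1.0: exact match at its source location
```

Because prototypes are raw sampled patches, there is no clustering step, exactly as the slide notes; the resulting vector would be the input to an off-the-shelf SVM.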

5 Location information in features
Should we be using a “bag of features”? Keeping location information means sacrificing some position/scale invariance, so objects must be centered:
– Centered objects are assumed for Caltech 101.
– Other datasets use a sliding window.
Our approach: an S2 feature isn’t computed over the entire image, but only “near” its originally sampled location.
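The localized-feature idea above can be illustrated by restricting the prototype search to a window around its original sampling position. Everything here is an illustrative assumption (the name `localized_s2_max`, the square `radius` window, the Gaussian-of-distance response); the slide only specifies that the search happens “near” the sampled location rather than over the whole image.

```python
import numpy as np

def localized_s2_max(c1, proto, center, radius=3, sigma=1.0):
    """Max S2 response, searched only within `radius` positions of the
    prototype's originally sampled location `center` = (row, col)."""
    n_or, H, W = c1.shape
    _, h, w = proto.shape
    r0, c0 = center
    best = -np.inf
    for i in range(max(0, r0 - radius), min(H - h + 1, r0 + radius + 1)):
        for j in range(max(0, c0 - radius), min(W - w + 1, c0 + radius + 1)):
            d2 = np.sum((c1[:, i:i + h, j:j + w] - proto) ** 2)
            best = max(best, np.exp(-d2 / (2 * sigma**2)))
    return best

rng = np.random.default_rng(1)
c1 = rng.random((4, 16, 16))
proto = rng.random((4, 4, 4))
c1[:, 10:14, 10:14] = proto          # plant an exact copy at (10, 10)
near = localized_s2_max(c1, proto, center=(10, 10))  # copy inside window
far = localized_s2_max(c1, proto, center=(2, 2))     # copy outside window
print(near, far)
```

The search windowed at the true location recovers the exact match (response 1.0), while the mis-located window does not, which is the intended trade: weaker responses for features that appear in the wrong part of the object, at the cost of requiring roughly centered objects or a sliding window.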

6 Classification results (Caltech 101)
Classification rates for the full dataset (30 training images per category; scores are the average of the per-category classification rates):

  Serre et al. [05]                                  42
  Base                                               41
  + sparse S2 + 12 orientations + inhibited S1/C1    49 (+8)
  + localized features                               54 (+5)
  + feature selection                                56 (+2)

7 Localization results (UIUC Cars)
[Figure: correct detection examples (single-scale), and the only errors in 8 runs (1,600 cars).]

8 Interest points vs. dense features
Disadvantages of interest points:
– Selection of interest points is error-prone, so reliability of matching is lower than for dense features.
– Feature size and shape are predetermined.
Disadvantages of dense features:
– Efficiency. However, the cost is reduced by increasing invariance at lower levels.
My conclusion:
– Interest points are useful for near-term efficiency.
– Dense features are likely to win long-term.

9 Hypothesis: object class recognition would already be solved if training sets were large enough.
– Performance increases strongly with training set size.
– Good success is achieved on difficult categories (faces, pedestrians) with large training sets.
– However, large training sets require more efficient algorithms.
Objection: how can human vision recognize objects after just a single example?
– The first category (e.g., mammal) is recognized, then the instance.
– Features for categorization are borrowed from the most similar previous category.

