Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros, CMU, CVPR’08

1 Recognition by Association: ask not “What is it?” ask “What is it like?” Tomasz Malisiewicz and Alyosha Efros, CMU, CVPR’08

2 Understanding an Image (slide by Fei-Fei, Fergus & Torralba)

3 Object naming -> Object categorization (image labels: sky, building, flag, wall, banner, bus, cars, bus, face, street lamp; slide by Fei-Fei, Fergus & Torralba)

4 Object categorization (image labels: sky, building, flag, wall, banner, bus, cars, bus, face, street lamp)

5 Classical View of Categories
Dates back to Plato & Aristotle
1. Categories are defined by a list of properties shared by all elements in a category
2. Category membership is binary
3. Every member in the category is equal

6 Problems with Classical View
Humans don’t do this!
– People don’t rely on abstract definitions / lists of shared properties (Rosch 1973), e.g. Are curtains furniture?
– Typicality, e.g. Chicken -> bird, but bird -> eagle, pigeon, etc.
– Intransitivity, e.g. car seat is chair, chair is furniture, but …
– Not language-independent, e.g. the “Women, Fire, and Dangerous Things” category in an Australian Aboriginal language (Lakoff 1987)
– Doesn’t work even in human-defined domains, e.g. Is Pluto a planet?

7 Problems with Visual Categories
Chair: A lot of categories are functional
Car: Different views of the same object can be visually dissimilar

8 Categorization in Modern Psychology
Prototype Theory (Rosch 1973)
– One or more summary representations (prototypes) for each category
– Humans compute similarity between input and prototypes
Exemplar Theory (Medin & Schaffer 1978, Nosofsky 1986, Kruschke 1992)
– Categories are represented in terms of remembered objects (exemplars)
– Similarity is measured between input and all exemplars
– Think non-parametric density estimation
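One way to read the exemplar-theory bullets as non-parametric density estimation is a Parzen-window score, where every remembered exemplar contributes a kernel value to its own category. The sketch below is only illustrative: the Gaussian kernel, the `bandwidth` value, the function name `category_scores`, and the toy 2-D points are all assumptions, not details from the talk.

```python
import math
from typing import Dict, Sequence, Tuple

Exemplar = Tuple[Sequence[float], str]  # (feature vector, category name)


def category_scores(
    query: Sequence[float],
    exemplars: Sequence[Exemplar],
    bandwidth: float = 1.0,  # assumed kernel width, chosen arbitrarily
) -> Dict[str, float]:
    """Sum a Gaussian kernel between the query and every stored exemplar."""
    scores: Dict[str, float] = {}
    for point, category in exemplars:
        sq_dist = sum((q - p) ** 2 for q, p in zip(query, point))
        kernel = math.exp(-sq_dist / (2.0 * bandwidth ** 2))
        scores[category] = scores.get(category, 0.0) + kernel
    return scores


# Toy memory of exemplars: two near the origin, one far away.
memory = [((0.0, 0.0), "A"), ((0.1, 0.0), "A"), ((5.0, 5.0), "B")]
scores = category_scores((0.05, 0.0), memory)
best = max(scores, key=scores.get)
print(best, scores)
```

No prototype is ever computed here: the score depends on all stored exemplars, which is the contrast with prototype theory drawn on the slide.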

9 Different way of looking at recognition (figure labels: Input Image; Car, Road, Building)

10 Different way of looking at recognition

11 What is the ultimate goal?
Parsing Images
A “what is it like?” machine
A kind of “visual memex”

12 Recognition as Association
LabelMe Dataset: 12,905 Object Exemplars, 171 unique ‘labels’

13 Our Contributions
Posing Recognition as Association
– Use large number of object exemplars
Learning Object Similarity
– Different distance function per exemplar
Recognition-Based Object Segmentation
– Use multiple segmentation approach

14 Measuring Similarity

15 Exemplar Representation: Segment from LabelMe

16 Shape: Centered Mask, Bounding Box Dimensions, Pixel Area (figure labels: Obj, ~Obj)

17 Texture: Textons on top, bot, left, right boundary; Interior: Bag-of-Words

18 Color: Mean Color, Color std, Color Histogram

19 Location
Shape: Centered Mask, Bounding Box Dimensions, Pixel Area (figure labels: Obj, !Obj)
Texture: Textons on top, bot, left, right boundary; Interior: Bag-of-Words
Color: Mean Color, Color std, Color Histogram
Location: Absolute Position Mask, Top Height, Bot Height

20 Distance “Similarity” Functions
Positive Linear Combinations of Elementary Distances Computed Over 14 Features
(figure labels: Building e, Distance Function)
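The "positive linear combination of elementary distances" on this slide can be sketched as a weighted sum with non-negative weights. The function name `exemplar_distance`, the three-feature toy vectors, and the specific weight values are hypothetical; the talk uses 14 features and learns the weights per exemplar.

```python
from typing import Sequence


def exemplar_distance(weights: Sequence[float], elementary: Sequence[float]) -> float:
    """Distance from one exemplar to a candidate: sum_k w_k * d_k with w_k >= 0."""
    if len(weights) != len(elementary):
        raise ValueError("one weight per elementary distance is required")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    return sum(w * d for w, d in zip(weights, elementary))


# Toy example with 3 elementary distances instead of the 14 used in the talk.
w_building = [0.5, 0.0, 2.0]       # hypothetical weights of one exemplar
d_to_candidate = [0.4, 0.9, 0.1]   # hypothetical elementary distances
print(exemplar_distance(w_building, d_to_candidate))
```

A zero weight means the exemplar ignores that feature entirely, which is how each exemplar can emphasize different cues.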

21 Learning Object Similarity
Learn a different distance function for each exemplar in the training set
Formulation is similar to Frome et al. [1,2]
[1] Andrea Frome, Yoram Singer, Jitendra Malik. "Image Retrieval and Recognition Using Local Distance Functions." In NIPS.
[2] Andrea Frome, Yoram Singer, Fei Sha, Jitendra Malik. "Learning Globally-Consistent Local Distance Functions for Shape-Based Image Retrieval and Classification." In ICCV, 2007.

22-24 Non-parametric density estimation (figure axes: Color Dimension, Shape Dimension; classes: Class 1, Class 2, Class 3)

25 Learning Distance Functions (figure: Focal Exemplar; axes: Dshape, Dcolor)

26 Learning Distance Functions (figure: Focal Exemplar; axes: Dshape, Dcolor; Decision Boundary separating the “similar” side from the “dissimilar” side; Don’t Care region)
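To make the decision-boundary picture concrete, here is a minimal sketch that learns non-negative weights for one focal exemplar so that "similar" examples fall below D = 1 and "dissimilar" ones above it. It uses a simple projected subgradient update on a hinge loss, chosen for brevity; it is not claimed to be the optimization used in the talk or in Frome et al. The function name, learning rate, margin, and toy (Dshape, Dcolor) values are assumptions, and "don't care" examples are simply left out of the training list.

```python
from typing import List, Sequence, Tuple

# Each training item: (elementary distances to the focal exemplar, label)
# label +1 = should land on the "similar" side (D < 1), -1 = "dissimilar" side (D > 1).
Example = Tuple[Sequence[float], int]


def learn_weights(
    examples: Sequence[Example],
    n_features: int,
    lr: float = 0.1,       # assumed step size
    epochs: int = 200,     # assumed number of passes
    margin: float = 0.1,   # assumed margin around the D = 1 boundary
) -> List[float]:
    w = [1.0 / n_features] * n_features
    for _ in range(epochs):
        for d, y in examples:
            dist = sum(wi * di for wi, di in zip(w, d))
            if y * (1.0 - dist) < margin:  # wrong side, or too close to the boundary
                # Subgradient step, then clip to keep the combination positive.
                w = [max(0.0, wi - lr * y * di) for wi, di in zip(w, d)]
    return w


# Toy data: (Dshape, Dcolor). Similar examples have small shape distance.
train = [
    ((0.10, 0.9), +1), ((0.20, 0.1), +1), ((0.15, 0.5), +1),
    ((0.90, 0.1), -1), ((0.80, 0.7), -1), ((0.95, 0.3), -1),
]
w = learn_weights(train, n_features=2)
for d, y in train:
    print(y, round(sum(wi * di for wi, di in zip(w, d)), 3))
```

On this toy set the shape weight ends up larger than the color weight, mirroring the idea that each exemplar decides which cues matter for it.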

27 Visualizing Distance Functions (Training Set)
Columns: Query; Top Neighbors with Tex-Hist Dist; Top Neighbors with Learned Dist

28 Visualizing Distance Functions (Training Set)

29

30

31 Visualizing Distance Functions (Training Set) (label shown: person)

32 Visualizing Distance Functions (Training Set)
(labels shown: person standing person woman person)
Different Label on “similar” side of distance function

33 Labels Crossing Boundary

34 “Conventional” Recognition in Test Set
Compute the similarity between an input and all exemplars
All exemplars with D < 1 are “associated” with the input
The most frequently occurring label among the associations is propagated onto the input
Association confidence score favors more associations and smaller distances
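The association steps on this slide can be sketched as a threshold followed by a vote. The D < 1 rule and the majority label come from the slide; the function names and the particular `confidence` formula (sum of 1 - D over associations) are assumptions that merely satisfy "more associations and smaller distances score higher", since the talk does not give the formula here.

```python
from collections import Counter
from typing import List, Optional, Sequence, Tuple

Association = Tuple[str, float]  # (exemplar label, distance to the input)


def associate(
    distances: Sequence[float], labels: Sequence[str], threshold: float = 1.0
) -> List[Association]:
    """Keep every exemplar whose learned distance to the input is below the threshold."""
    return [(lab, d) for lab, d in zip(labels, distances) if d < threshold]


def propagate_label(associations: Sequence[Association]) -> Optional[str]:
    """Return the most frequent label among the associations, or None if there are none."""
    if not associations:
        return None
    return Counter(lab for lab, _ in associations).most_common(1)[0][0]


def confidence(associations: Sequence[Association], threshold: float = 1.0) -> float:
    """Hypothetical score: grows with the number of associations and with smaller distances."""
    return sum(threshold - d for _, d in associations)


# Toy input: distances from four exemplars, each computed with that exemplar's own weights.
assocs = associate([0.4, 1.3, 0.8, 0.95], ["car", "road", "car", "bus"])
print(propagate_label(assocs), confidence(assocs))
```

An input with no associations gets no label at all, rather than being forced into the nearest class.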

35 Performance on labeling perfect segments (test set)

36 Object Segmentation via Recognition
Generate Multiple Segmentations (Hoiem 2005, Russell 2006, Malisiewicz 2007)
– Mean-Shift and Normalized Cuts
– Use pairs and triplets of adjacent segments
– Generate about 10,000 segments per image
Enhance training with bad segments
Apply learned distance functions to bottom-up segments
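The "pairs and triplets of adjacent segments" step can be sketched as enumerating connected groups of up to three segments from an adjacency map. How adjacency is computed from the Mean-Shift and Normalized Cuts outputs is not stated on the slide, so the `adjacency` dictionary, the function name, and the brute-force enumeration are assumptions for illustration only.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set


def candidate_regions(adjacency: Dict[int, Set[int]]) -> List[FrozenSet[int]]:
    """Return single segments plus connected pairs and triplets of segments."""

    def adjacent(a: int, b: int) -> bool:
        return b in adjacency.get(a, set()) or a in adjacency.get(b, set())

    nodes = sorted(set(adjacency) | {n for nbrs in adjacency.values() for n in nbrs})
    singles = [frozenset([n]) for n in nodes]
    pairs = [frozenset(p) for p in combinations(nodes, 2) if adjacent(*p)]
    # Three segments form one connected region if at least two of the three
    # possible pairs share a boundary.
    triplets = [
        frozenset(t)
        for t in combinations(nodes, 3)
        if sum(adjacent(a, b) for a, b in combinations(t, 2)) >= 2
    ]
    return singles + pairs + triplets


# Toy adjacency: four segments in a chain 0 - 1 - 2 - 3.
chain = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
regions = candidate_regions(chain)
print(len(regions), regions)
```

The brute-force triple loop is cubic in the number of segments; a real pipeline producing thousands of candidates per image would more likely grow groups from each segment's neighbors.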

37 Example Associations (figure label: Bottom-Up Segments)

38 Quantitative Evaluation
Object hypothesis is correct if labels match and OS > .5
OS(A,B) = Overlap Score = intersection(A,B) / union(A,B)
*We do not penalize for multiple correct overlapping associations
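The overlap score and the correctness test above can be written down directly. In this sketch a segment is represented as a set of (row, col) pixel coordinates, which is an assumption made to keep the example dependency-free; the function names are likewise hypothetical.

```python
from typing import Set, Tuple

Pixel = Tuple[int, int]


def overlap_score(a: Set[Pixel], b: Set[Pixel]) -> float:
    """OS(A, B) = |A intersect B| / |A union B|; defined as 0 when both are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def is_correct(
    pred_label: str,
    pred_mask: Set[Pixel],
    true_label: str,
    true_mask: Set[Pixel],
    threshold: float = 0.5,
) -> bool:
    """A hypothesis counts as correct if the labels match and OS > threshold."""
    return pred_label == true_label and overlap_score(pred_mask, true_mask) > threshold


# Two 2x2 blocks that share one column of pixels.
seg_a = {(0, 0), (0, 1), (1, 0), (1, 1)}
seg_b = {(0, 1), (1, 1), (0, 2), (1, 2)}
print(overlap_score(seg_a, seg_b))
print(is_correct("car", seg_a, "car", seg_b))
```

Sharing half of each block is not enough here: two pixels overlap out of six in the union, so the hypothesis is rejected even though the labels match.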

39 Toward Image Parsing

40 Conclusion: Main Points
Object Association: defining an object in terms of a set of visually similar objects. Trying to get away from classes.
Learning per-exemplar distances: each object gets to decide on its own distance function. Suddenly, NN distances are meaningful!
Using multiple segmentations: partition the input image into manageable chunks that can then be matched

41 Thank You. Questions?

