Building local part models for category-level recognition. C. Schmid, INRIA Grenoble. Joint work with G. Dorko, S. Lazebnik, J. Ponce.


1 Building local part models for category-level recognition. C. Schmid, INRIA Grenoble. Joint work with G. Dorko, S. Lazebnik, J. Ponce

2 Introduction
Invariant local descriptors => robust recognition of specific objects or scenes
Recognition of textures and object classes => description of intra-class variation, selection of discriminant features, spatial relations
[Figures: texture recognition; car detection]

3 Overview
1. Affine-invariant texture recognition (CVPR’03)
2. A two-layer architecture for texture segmentation and recognition (ICCV’03)
3. Feature selection for object class recognition (ICCV’03)
4. Building affine-invariant part models for recognition

4 Affine-invariant texture recognition
Texture recognition under viewpoint changes and non-rigid transformations
Use of affine-invariant regions
–invariance to viewpoint changes
–spatial selection => more compact representation, reduction of redundancy in texton dictionary
[A sparse texture representation using affine-invariant regions, S. Lazebnik, C. Schmid and J. Ponce, CVPR 2003]

5 Spatial selection [Figure panels: clustering each pixel; clustering selected pixels]

6 Overview of the approach

7 Region extraction [Figure panels: Harris detector; Laplace detector]

8 Descriptors – Spin images

9 Signature and EMD
Hierarchical clustering => Signature: $S = \{ (m_1, w_1), \ldots, (m_k, w_k) \}$
Earth Mover's Distance: $D(S, S') = \dfrac{\sum_{i,j} f_{ij}\, d(m_i, m'_j)}{\sum_{i,j} f_{ij}}$
–robust distance, optimizes the flow between distributions
–can match signatures of different size
–not sensitive to the number of clusters
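The flows f_ij in the distance on slide 9 are themselves the result of an optimization. As a sketch in the slide's notation, and assuming the standard transportation-problem formulation of the Earth Mover's Distance with a second signature S' = {(m'_1, w'_1), ..., (m'_l, w'_l)} (the size l and the explicit constraints are not on the slide), the flows would be chosen as:

```latex
\[
\begin{aligned}
\min_{f}\ & \sum_{i=1}^{k}\sum_{j=1}^{l} f_{ij}\, d(m_i, m'_j) \\
\text{s.t.}\ & f_{ij} \ge 0, \\
& \sum_{j=1}^{l} f_{ij} \le w_i \quad (1 \le i \le k), \\
& \sum_{i=1}^{k} f_{ij} \le w'_j \quad (1 \le j \le l), \\
& \sum_{i=1}^{k}\sum_{j=1}^{l} f_{ij} = \min\Big(\sum_{i=1}^{k} w_i,\ \sum_{j=1}^{l} w'_j\Big).
\end{aligned}
\]
```

Because the total flow is capped by the lighter signature and the distance is normalized by that total, signatures with different numbers of clusters can be compared directly.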

10 Database with viewpoint changes: 20 samples of 10 different textures

11 Results [Figure panels: spin images; Gabor-like filters]

12 Overview
1. Affine-invariant texture recognition (CVPR’03)
2. A two-layer architecture for texture segmentation and recognition (ICCV’03)
3. Feature selection for object class recognition (ICCV’03)
4. Building affine-invariant part models for recognition

13 A two-layer architecture
Texture recognition + segmentation
Classification of individual regions + spatial layout
[A generative architecture for semi-supervised texture recognition, S. Lazebnik, C. Schmid, J. Ponce, ICCV 2003]

14 A two-layer architecture
Modeling:
1. Distribution of the local descriptors (affine invariants): Gaussian mixture model estimation with EM, allows incorporating unsegmented images
2. Co-occurrence statistics of sub-class labels over affinely adapted neighborhoods
Segmentation + Recognition:
1. Generative model for initial class probabilities
2. Co-occurrence statistics + relaxation to improve labels
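Slide 14 names relaxation over co-occurrence statistics as the label-refinement step but does not give the update rule. The following is a minimal Python sketch of one iteration, assuming the classic probabilistic relaxation update (each label probability is scaled by one plus its neighborhood support, then renormalized). The function name, the data layout, and the compatibility values in [-1, 1] are illustrative assumptions, not taken from the slides.

```python
def relaxation_step(probs, neighbors, compat):
    """One probabilistic-relaxation update (illustrative sketch).

    probs[i][c]    : current probability that region i has label c
    neighbors[i]   : list of (j, weight) pairs; weights for region i sum to 1
    compat[c][d]   : compatibility of label c with neighboring label d, in [-1, 1]
                     (assumed here to be derived from co-occurrence statistics)
    """
    n_labels = len(compat)
    updated = []
    for i, p_i in enumerate(probs):
        raw = []
        for c in range(n_labels):
            # Support for label c at region i from its neighbors' current labels.
            support = 0.0
            for j, weight in neighbors[i]:
                support += weight * sum(
                    compat[c][d] * probs[j][d] for d in range(n_labels)
                )
            raw.append(p_i[c] * (1.0 + support))
        total = sum(raw)
        # Renormalize; keep the old distribution if everything was suppressed.
        updated.append([r / total for r in raw] if total > 0 else list(p_i))
    return updated


# Two neighboring regions, two labels; matching labels are compatible.
compat = [[1.0, -1.0], [-1.0, 1.0]]
probs = [[0.6, 0.4], [0.9, 0.1]]
neighbors = [[(1, 1.0)], [(0, 1.0)]]
refined = relaxation_step(probs, neighbors, compat)
```

In this toy setup the uncertain first region is pulled toward the label its confident neighbor prefers, which is the qualitative effect the slide attributes to relaxation.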

15 Texture Dataset – Training Images: T1 (brick), T2 (carpet), T3 (chair), T4 (floor 1), T5 (floor 2), T6 (marble), T7 (wood)

16 Effect of relaxation + co-occurrence
[Figure: original image. Top: before relaxation (individual regions); bottom: after relaxation (co-occurrence)]

17 Recognition + Segmentation Examples

18 Animal Dataset – Training Images
–no manual segmentation, weakly supervised
–10 training images per animal (with background)
–no purely negative images

19 Recognition + Segmentation Examples

20 Overview
1. Affine-invariant texture recognition (CVPR’03)
2. A two-layer architecture for texture segmentation and recognition (ICCV’03)
3. Feature selection for object class recognition (ICCV’03)
4. Building affine-invariant part models for recognition

21 Object class detection/classification Description of intra-class variations of object parts [Selection of scale inv. regions for object class recognition, G. Dorko and C. Schmid, ICCV’03]

22 Object class detection/classification
Description of intra-class variations of object parts
Selection of discriminant features (weakly supervised)

23 Training the model
Training phase 1
–Input: images of the object with background (positive images), no normalization or alignment of the image
–Extraction of local descriptors: Harris-Laplace, Kadir-Brady, SIFT
–Clustering: estimation of Gaussian mixture with EM

24 Training the model
Training phase 1
–Input: images of the object with background (positive images), no normalization or alignment of the image/object
–Extraction of local descriptors: Harris-Laplace, Kadir-Brady, SIFT
–Clustering: estimation of Gaussian mixture with EM

25 Training the model
Training phase 2 (selection)
–Input: verification set, positive and negative images
–Rank each cluster with likelihood (or mutual information)
–MAP classifier with the n top clusters
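The ranking step of training phase 2 can be illustrated with a short Python sketch. The slide does not give the scoring formulas, so this assumes a likelihood ratio between the cluster's frequency in positive and negative verification images, and the mutual information between a binary "descriptor falls in this cluster" indicator and the image class. All function names, the smoothing constant `eps`, and the toy counts are hypothetical.

```python
import math


def likelihood_ratio(pos_count, neg_count, n_pos, n_neg, eps=1e-9):
    """P(cluster | positive) / P(cluster | negative); eps avoids division by zero."""
    return (pos_count / n_pos) / (neg_count / n_neg + eps)


def mutual_information(pos_count, neg_count, n_pos, n_neg):
    """Mutual information between cluster membership (0/1) and class (0/1)."""
    total = n_pos + n_neg
    joint = {
        (1, 1): pos_count / total,
        (1, 0): neg_count / total,
        (0, 1): (n_pos - pos_count) / total,
        (0, 0): (n_neg - neg_count) / total,
    }
    p_member = {1: (pos_count + neg_count) / total}
    p_member[0] = 1.0 - p_member[1]
    p_class = {1: n_pos / total, 0: n_neg / total}
    mi = 0.0
    for (member, cls), p in joint.items():
        if p > 0:  # zero-probability cells contribute nothing
            mi += p * math.log(p / (p_member[member] * p_class[cls]))
    return mi


def rank_clusters(counts, n_pos, n_neg, score):
    """Return cluster indices sorted from best to worst under the given score."""
    scores = [score(pos, neg, n_pos, n_neg) for pos, neg in counts]
    return sorted(range(len(counts)), key=lambda i: scores[i], reverse=True)


# (descriptors from positive images, descriptors from negative images) per cluster
counts = [(50, 1), (30, 30), (5, 0)]
by_likelihood = rank_clusters(counts, 100, 100, likelihood_ratio)
by_mutual_info = rank_clusters(counts, 100, 100, mutual_information)
```

With these toy counts the likelihood ratio puts the small, purely positive third cluster first, while mutual information prefers the frequent and mostly positive first cluster. The n top-ranked clusters would then be kept for the MAP classifier.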

26 Likelihood – mutual information
[Figure panels: rows labeled 5 and 25; columns: Likelihood, Mutual Information]
–likelihood: more discriminant but very specific
–mutual information: discriminant but not too specific

27 Results for test images

Detection                     25 Likelihood               10 Mutual Information
Harris-Laplace, 354 points    49 correct + 37 incorrect   31 correct + 20 incorrect
Harris-Laplace, 277 points    43 correct + 36 incorrect   26 correct + 20 incorrect

28 Relaxation – propagation of probabilities

29 Classification
–Assign each test descriptor to the most probable cluster (MAP)
–Each descriptor assigned to one of the top n clusters is positive
–If the number of positive descriptors is above a threshold p, classify the image as positive
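The decision rule on slide 29 can be sketched in a few lines of Python. This assumes the per-descriptor cluster posteriors have already been computed from the mixture model; the function name and input layout are hypothetical.

```python
def classify_image(posteriors, selected_clusters, p):
    """Classify one image from its descriptors (illustrative sketch).

    posteriors[k][c]  : posterior probability of cluster c for descriptor k
    selected_clusters : set of indices of the top n clusters kept after selection
    p                 : threshold on the number of positive descriptors
    Returns (is_positive, number_of_positive_descriptors).
    """
    positive = 0
    for probs in posteriors:
        # MAP assignment: the cluster with the highest posterior.
        best = max(range(len(probs)), key=probs.__getitem__)
        if best in selected_clusters:
            positive += 1
    # "Above a threshold" is read here as strictly greater than p.
    return positive > p, positive


posteriors = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.2, 0.6],
    [0.9, 0.05, 0.05],
]
decision, n_positive = classify_image(posteriors, {0, 2}, p=2)
```

Here three of the four descriptors fall in selected clusters, so the image is labeled positive for p=2 and negative for p=3.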

30 Classification experiments

                                      Airplanes, Motorbikes                   Wild Cats
Training Phase 1   #Positive images   200                                     25
Training Phase 2   #Positive images   200                                     25
                   #Negative images   450
Testing            #Positive images   400                                     50
                   #Negative images   450
Source                                http://www.robots.ox.ac.uk/~vgg/data    Corel Image Library

[Example images: training, verification, test]

31 Results: Motorbikes [Figure panels: equal error rates as a function of p; receiver operating characteristic for p=6]
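For readers unfamiliar with the measure, here is a minimal Python sketch of reading an ROC equal error rate off a set of measured operating points. It assumes the usual definition (the point where the true positive rate equals one minus the false positive rate) and simply picks the closest measured point without interpolation. The function name and the sample points are hypothetical and are not the values plotted on the slide.

```python
def roc_equal_error_rate(points):
    """Pick the ROC operating point closest to TPR = 1 - FPR.

    points: iterable of (false_positive_rate, true_positive_rate) pairs,
            obtained by sweeping a classifier parameter.
    Returns the selected (fpr, tpr) pair; tpr is the rate usually reported.
    """
    return min(points, key=lambda pt: abs(pt[1] - (1.0 - pt[0])))


# Hypothetical operating points, not data from the presentation.
sample_points = [(0.0, 0.5), (0.05, 0.95), (0.2, 0.99), (1.0, 1.0)]
fpr, tpr = roc_equal_error_rate(sample_points)
```

A finer sweep or linear interpolation between neighboring points would give a more precise value when no measured point lies exactly on the diagonal.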

32 Classification results: ROC equal error rates

                        Best           Estimated p     p=6      Fergus
                        p     %        p     %         %        %
Airplanes    Harris     8     97.5     5     97        97.25    -
             Kadir      18    97       30    96.5      96       94
Motorbikes   Harris     9     99       5     98        98.25    -
             Kadir      19    98.75    32    98.25     98       96
Wild Cats    Harris     31    94       34    92        72       -
             Kadir      17    86       45    82        84       90

Highlighted values: 97.5, 99, 94

33 Overview
1. Affine-invariant texture recognition (CVPR’03)
2. A two-layer architecture for texture segmentation and recognition (ICCV’03)
3. Feature selection for object class recognition (ICCV’03)
4. Building affine-invariant part models for recognition

34 Affine-invariant part models
Matching collections of local affine-invariant regions that map with an affine transformation A => part
Matching works for unsegmented images
Model = a collection of parts

35 Matching: Faces [Figure annotation: spurious match]

36 Matching: 3D Objects [Figure annotation: closeup]

37 Matching: Finding Repeated Patterns

38 Matching: Finding Symmetry

39 Modeling for Recognition
–Match multiple pairs of training images to produce several candidate parts.
–Use additional validation images to evaluate repeatability of parts and individual patches.
–Retain a fixed number of parts having the best repeatability score as class model.
–No background model

40 The Butterfly Dataset
–16 training images (8 pairs) per class
–10 validation images per class
–437 test images
–619 images total

41 Butterfly Models Top two rows: pairs of images used for modeling. Bottom two rows: closeup views of some of the parts making up the models of the seven butterfly classes.

42 Recognition Top 10 models per class used for recognition Multi-class classification results: total model size (smallest/largest)

43 Classification Rate vs. Number of Parts [Plot; x-axis: number of parts]

44 Successful Detection Examples
[Figure panels: model part (yellow: detected in test image; blue: occluded in test image); test image, all ellipses; test image, matched ellipses]
Note: only one of the two training images is shown

45 Successful Detection Examples (cont.)

46 Detection of Multiple Instances

47 Detection Failures

48 Future Work
Spatial relation
–non-rigid models
–relations between clusters and affine-invariant parts
Feature selection: dimensionality reduction
Shape information: appropriate descriptors
Rapid search: structuring of the data

