# Multi-Local Feature Manifolds for Object Detection Oscar Danielsson Stefan Carlsson Josephine Sullivan

Multi-Local Feature Manifolds for Object Detection

Oscar Danielsson (osda02@csc.kth.se), Stefan Carlsson (stefanc@csc.kth.se), Josephine Sullivan (sullivan@csc.kth.se)

DICTA08

The Problem

- Object categories are often modeled by collections (bag-of-features) or constellations (pictorial structures) of local features.
- Many simple, shape-based objects don’t have any discriminative local appearance features.

The Multi-Local Feature

- A specific spatial constellation of oriented edgels (or other local content)
- Captures global shape properties
- “Weak” detector of shape-based object categories
- Described by a coordinate vector: $(x_1, \ldots, x_{12})$

Modeling Intra-Class Variation

1. Generate coordinate vectors by clicking corresponding edgels in a (small) number of training images.
2. Align the coordinate vectors with respect to a similarity transform.
3. Extend the coordinate vectors into their convex hull.
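The alignment in step 2 can be illustrated with a least-squares similarity (Procrustes) fit. This is a minimal sketch, not the authors' implementation: it assumes each coordinate vector holds only the 2D positions of its edgels (orientations are ignored), that one training example is chosen as the reference, and the function name `align_to_reference` is hypothetical.

```python
import numpy as np

def align_to_reference(points, reference):
    """Align a (k, 2) array of edgel positions to a (k, 2) reference using the
    least-squares similarity transform (translation, rotation, uniform scale).

    Assumption: rows of `points` and `reference` correspond to the same edgels.
    """
    # Treat each 2D point as a complex number so that rotation plus scale
    # becomes a single complex multiplication.
    z = points[:, 0] + 1j * points[:, 1]
    w = reference[:, 0] + 1j * reference[:, 1]

    # Remove translation by centering both point sets.
    z = z - z.mean()
    w_mean = w.mean()
    w_centered = w - w_mean

    # Least-squares complex factor a minimizing ||a * z - w_centered||^2.
    # np.vdot conjugates its first argument.
    a = np.vdot(z, w_centered) / np.vdot(z, z)

    aligned = a * z + w_mean
    return np.column_stack([aligned.real, aligned.imag])


# Usage: a rotated, scaled, and shifted copy of the reference is mapped back onto it.
reference = np.array([[0.0, 0.0], [2.0, 0.0], [2.0, 1.0], [0.0, 3.0]])
c, s = np.cos(0.7), np.sin(0.7)
moved = 1.8 * reference @ np.array([[c, -s], [s, c]]).T + np.array([5.0, -2.0])
aligned = align_to_reference(moved, reference)
coordinate_vector = aligned.ravel()  # flattened (x, y, x, y, ...) vector
```

Flattening the aligned positions gives one point per training image in a common space, which is the form the convex hull in step 3 operates on.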

Detection

- For each occurrence $x_1$ of $c_1$:
  - For each consistent occurrence $x_2$ of $c_2$:
    - Sample from $p(x_4, x_3 \mid x_2, x_1)$ to hypothesize image locations of $c_3$ and $c_4$
    - Sample image edgels
    - Compute the normalized distance to the convex hull of the training features
    - If the distance is below a threshold, an instance was detected
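The distance-to-convex-hull step can be sketched as a small optimization over convex weights. The slides do not say how the distance is computed or normalized, so the following is an assumption-laden illustration: it uses the Frank-Wolfe method with exact line search, returns the plain (unnormalized) Euclidean distance, and the names `distance_to_convex_hull`, `max_iter`, and `tol` are hypothetical.

```python
import numpy as np

def distance_to_convex_hull(train, query, max_iter=1000, tol=1e-10):
    """Approximate Euclidean distance from `query` (d,) to the convex hull of
    the rows of `train` (n, d), using Frank-Wolfe iterations.

    Assumption: no normalization is applied; the slides mention a normalized
    distance but do not define it.
    """
    train = np.asarray(train, dtype=float)
    query = np.asarray(query, dtype=float)

    p = train.mean(axis=0)  # the centroid is always inside the hull
    for _ in range(max_iter):
        g = p - query                    # gradient of 0.5 * ||p - query||^2
        s = train[np.argmin(train @ g)]  # training vector most aligned with descent
        gap = g @ (p - s)                # Frank-Wolfe gap, zero at the optimum
        if gap <= tol:
            break
        # Exact line search along the segment from p to s, clipped to stay in the hull.
        step = min(1.0, gap / ((p - s) @ (p - s)))
        p = p + step * (s - p)
    return float(np.linalg.norm(p - query))


# Usage with a toy "training set": the corners of the unit square.
square = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
d_inside = distance_to_convex_hull(square, [0.3, 0.6])
d_outside = distance_to_convex_hull(square, [2.0, 0.5])
is_detection = d_inside < 0.05  # 0.05 is an arbitrary illustrative threshold
```

Any solver for this small quadratic program would serve equally well; Frank-Wolfe is used here only because it needs nothing beyond NumPy.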

Experiments

Detection performance was evaluated on a standard database (ETHZ Shape Classes) to investigate two questions:

- Is a multi-local feature a good weak detector?
- How many local features should be used?

Mugs - Training

- 25 training images were downloaded from Google Images.
- 14 edgels constituting a multi-local feature were marked in each training image (numbered 1 to 14 in the figure).

Mugs - Results

- Performance decreases when adding more than 9 local features.
- Values annotated on the plot: 0.4 and 60.6 %.

Bottles - Training

- 25 training images were downloaded from Google Images.
- 12 edgels constituting a multi-local feature were marked in each training image (numbered 1 to 12 in the figure).

Bottles - Results

- Values annotated on the plot: 0.4 and 72.7 %.

Apple logos - Training

- 20 training images were downloaded from Google Images.
- 12 edgels constituting a multi-local feature were marked in each training image.

Apple logos - Results

- Performance decreases when adding more than 11 local features.
- Values annotated on the plot: 0.4 and 77.3 %.

Conclusions

- A multi-local feature is a good weak detector of shape-based object categories.
- The best performance is achieved with multi-local features that have a moderate number of local features.
- Convex combinations of valid exemplars are in general also valid exemplars (we can extend a few training examples into their convex hull).

Future Work

- Automatic learning of multi-local features
- Building combinations of multi-local feature detectors into an object detection system

Related Work

- Pictorial Structures
  - E.g., Felzenszwalb, Huttenlocher. Pictorial Structures for Object Recognition, IJCV No. 1, January 2005.
- Constellation Models
  - E.g., Fergus, Perona, Zisserman. Object class recognition by unsupervised scale-invariant learning, CVPR03.

Differences

- Different detection methods
- Use rich local features

Thanks!

Representation

The multi-local feature manifold consists of all convex combinations of the training examples.
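Written out as a formula, this is the convex hull of the training coordinate vectors. The symbols below are introduced here for illustration only and do not appear in the slides: $x^{(1)}, \ldots, x^{(n)}$ denote the aligned training coordinate vectors and $\alpha_i$ the convex weights.

```latex
\mathcal{M} = \Bigl\{ \sum_{i=1}^{n} \alpha_i \, x^{(i)} \;:\; \alpha_i \ge 0, \ \sum_{i=1}^{n} \alpha_i = 1 \Bigr\}
```

Under this reading, every training example lies in $\mathcal{M}$ (take one $\alpha_i = 1$), and so does any weighted average of training examples.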
