1 NN^k Networks for browsing and clustering image collections Daniel Heesch, Communications and Signal Processing Group, Electrical and Electronic Engineering, Imperial College London. Robotics Research Group Seminar, Oxford, 27-02-06

2 Structure Motivation · Challenges · The NN^k Idea · NN^k Networks · Clustering · Conclusions and Outlook

3 Motivation [figure: image collection sizes ranging from 1k, 10k, 25k, 65k, 850k up to more than 1 million images]

4 Motivation Providing access to large image collections based on visual image content rather than textual annotation. Why are we so keen on content? Extraction can be automated, and text annotation is not exhaustive.

5 Challenges Representation: how can meaningful content be captured in terms of visual attributes? Scalability: how can the O(n) complexity of one-to-one matching be alleviated? Polysemy: which semantic facet of an image is the user interested in, and what is the best feature for representing it?

6 Challenges: Polysemy [figure comparing three features: Tieu's convolution feature, the Colour Structure Descriptor, and a global HSV histogram]

7 Semantic disambiguation Relevance feedback: users label retrieved images according to perceived relevance. 1. Parametrise the distance metric. 2. Find the k-NN with default parameters. 3. Fine-tune the parameters upon relevance feedback.
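
The slide does not spell out how the parameters are fine-tuned. A minimal sketch of the loop, assuming precomputed per-feature distances to the query and a simple inverse-distance reweighting heuristic (the heuristic is an assumption, not necessarily the scheme used here):

```python
import numpy as np

def retrieve(dist_per_feature, weights, k):
    """Rank images by the weighted combination of per-feature distances
    and return the indices of the k nearest.

    dist_per_feature: (n_features, n_images) distances to the query.
    weights: (n_features,) non-negative weights summing to one.
    """
    combined = weights @ dist_per_feature          # (n_images,)
    return np.argsort(combined)[:k]

def update_weights(dist_per_feature, relevant_ids, eps=1e-6):
    """Hypothetical feedback rule: upweight features under which the
    images marked relevant lie close to the query."""
    mean_dist = dist_per_feature[:, relevant_ids].mean(axis=1) + eps
    w = 1.0 / mean_dist
    return w / w.sum()                             # back onto the simplex
```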

8 The NN^k Idea Motivated by problems with traditional relevance feedback: what weight set should be used for the first retrieval, and there is little scope for exploratory search. 1. Parametrise the distance metric. 2. For every parameter set, record the closest image to some query image (subsequently referred to as the focal image); this set of nearest neighbours defines the NN^k of the query. 3. Let users decide which parameter set is best by choosing among the retrieved images. By considering all possible weighted combinations of features, we are more likely to capture the different semantic facets of the image.

9 The NN^k Idea Distance metric: $d(q, x; \mathbf{w}) = \sum_{i=1}^{k} w_i \, d_i(q, x)$, where the k components of w are non-negative and sum to one. The weight space is a bounded and continuous set. To find the NN^k of an image, we sample the weight space at a finite number of points.
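
A minimal sketch of this sampling step, assuming per-feature distances to the query are precomputed and using a regular grid over the simplex (the grid resolution `steps` is an assumed parameter; for many features one would sample randomly instead, since the grid grows as O(steps^k)):

```python
import itertools
import numpy as np

def sample_weight_simplex(k, steps):
    """Enumerate weight vectors with non-negative components summing to
    one: every way of splitting `steps` grid units among k components."""
    for parts in itertools.product(range(steps + 1), repeat=k):
        if sum(parts) == steps:
            yield np.array(parts, dtype=float) / steps

def nnk(dist_per_feature, steps=10):
    """The NN^k of the query: every image that is the nearest neighbour
    under at least one sampled weight vector.

    dist_per_feature: (k, n_images) per-feature distances to the query,
    with the query itself excluded (or its distances set to infinity).
    """
    k = dist_per_feature.shape[0]
    return {int(np.argmin(w @ dist_per_feature))
            for w in sample_weight_simplex(k, steps)}
```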

10 The NN^k Idea [figure: the weight space for k = 2 and k = 3]

11 The NN^k Idea Sampling the weight space

12–14 [image-only slides; no transcript text]

15 NN^k Networks Construct a directed graph by linking each image of a collection to all its NN^k. Arc weight (x -> y): the proportion of weight vectors w for which y is an NN^k of x, i.e. the relative size of the supporting weight space.
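
Reusing `sample_weight_simplex` from the sketch above, the arcs out of one focal image and their weights can be estimated by counting which image comes out nearest for each sampled w (a sketch under the same assumptions):

```python
from collections import Counter
import numpy as np

def nnk_arcs(dist_per_feature, steps=10):
    """Map each NN^k neighbour of one focal image to its arc weight: the
    fraction of sampled weight vectors for which it is the nearest image."""
    counts = Counter()
    total = 0
    for w in sample_weight_simplex(dist_per_feature.shape[0], steps):
        counts[int(np.argmin(w @ dist_per_feature))] += 1
        total += 1
    return {y: c / total for y, c in counts.items()}
```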

16 NN^k Networks: Properties Distances: approximately 4 links between any two images (100,000 images), with a logarithmic increase with collection size. Clustering coefficient (Watts & Strogatz, 1998): a vertex's neighbours are likely to be neighbours themselves. The network is precomputed, hence interaction is in real time. Construction takes O(n^2) time (14 hours for 50,000 images).
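
These small-world statistics can be checked with networkx on a toy collection, assuming a hypothetical list `all_arcs` where `all_arcs[x]` holds the output of `nnk_arcs` for image x (the path-length call also assumes the graph is connected):

```python
import networkx as nx

G = nx.DiGraph()
for x, arcs in enumerate(all_arcs):            # all_arcs[x] = {y: weight}
    for y, w in arcs.items():
        G.add_edge(x, y, weight=w)

# Statistics computed on the undirected version for simplicity.
U = G.to_undirected()
print("average path length:", nx.average_shortest_path_length(U))
print("clustering coefficient:", nx.average_clustering(U))
```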

17 NN^k Networks: Clustering The NN^k represent a range of semantic facets users may be interested in. Once we know the semantic facet, we would like to present users with more images of the same kind: cluster the graph and display the cluster that contains the focal image and the chosen NN^k. Problem: a partitioning of the vertices allows the focal image to belong to only one cluster, whereas we want a soft clustering. A soft clustering can be achieved by partitioning not the vertex set but the edge set of the graph: a vertex can now associate with as many clusters as it has edges to other vertices. Compare with "Google Sets" and Zoubin Ghahramani's talk two weeks ago.

18 NN^k Networks: Clustering MCL: Markov Clustering (van Dongen, 2000). Use the normalised adjacency matrix as the transition matrix A of a random walk. 1. Simulate a random walk (expansion): $A_{t+1} = A_t \cdot A_t$. 2. Amplify large transition probabilities (inflation): raise each element of A to some power. 3. Normalise the resulting matrix and go to step 1. A converges to a sparse matrix; interpreted as an adjacency matrix, it corresponds to a partitioning of the graph. MCL is fast and suitable for directed graphs: graphs with 100,000 vertices and 2,500,000 edges can be clustered in 1 hour (3 GHz).
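
A dense-matrix sketch of MCL along these lines (the inflation exponent and iteration count are assumed values; real implementations keep the matrix sparse and test for convergence):

```python
import numpy as np

def mcl(A, inflation=2.0, n_iter=20, tol=1e-6):
    """Markov Clustering (van Dongen, 2000) on a non-negative (n, n)
    adjacency matrix A; directed graphs are fine."""
    A = A + np.eye(A.shape[0])                 # self-loops, standard in MCL
    M = A / A.sum(axis=0, keepdims=True)       # column-stochastic transitions
    for _ in range(n_iter):
        M = M @ M                              # expansion: a random-walk step
        M = M ** inflation                     # inflation: amplify large entries
        M = M / M.sum(axis=0, keepdims=True)   # renormalise the columns
    # Simplified read-out: each surviving (attractor) row spans one cluster.
    return [set(np.nonzero(row > tol)[0]) for row in M if row.max() > tol]
```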

19 Dualising an NN^k Network [figure: the arc (v, w) becomes the dual vertex vw, linked to the dual vertices wx, wy, wz for the arcs out of w] To speed up computation, consider only the r neighbours of vertex w with the greatest weights. The dual of a graph with n vertices has at most rn vertices and r^2 n edges.
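
A sketch of the dual construction with top-r pruning, taking arcs in the `{x: {y: weight}}` form produced by `nnk_arcs` above:

```python
def dual_graph(arcs, r=5):
    """Edge dual: each kept arc (x, y) becomes a dual vertex, joined to
    the dual vertex (y, z) for every kept arc out of y. With n vertices
    this yields at most r*n dual vertices and r^2*n dual edges."""
    # Keep only the r heaviest arcs out of each vertex.
    kept = {x: sorted(nbrs, key=nbrs.get, reverse=True)[:r]
            for x, nbrs in arcs.items()}
    return {(x, y): [(y, z) for z in kept.get(y, [])]
            for x, ys in kept.items() for y in ys}
```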

20 NN^k Networks: Clustering 32,000 images, r = 5, I = 20, 8 features

21 [image-only slide; no transcript text]

22 Cluster Validation With known classes, we may compare the distribution of objects across clusters and classes. Here, objects are assigned to several clusters and we do not know the classes. However, for the Corel dataset rich image annotation is available. Assumption: good clusters tend to have a smaller joint vocabulary (description size). We can compare the observed description size with what we would expect if images were randomly assigned to clusters. [figure: contingency table of classes against clusters]

23 Cluster Validation What is the sampling distribution under the null hypothesis that the images in a cluster are a random subset of the collection? If the number of terms is large and the clusters are small, the description size is a sum of i.i.d. variates; by the Central Limit Theorem, we expect the sampling distribution to look Gaussian.

24 Cluster Validation For each cluster size, we generate a sampling distribution by recording the description size of 10,000 random subsets of that size. The observed description size of a cluster can then be compared with this sampling distribution. p-value: the probability that, given the null hypothesis is true, the description size is at least as low as the observed one.
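
A sketch of this Monte Carlo test, assuming a hypothetical `annotations` dict mapping each image id to its set of annotation terms:

```python
import numpy as np

def description_size(image_ids, annotations):
    """Size of the joint vocabulary of a set of images."""
    vocab = set()
    for i in image_ids:
        vocab |= annotations[i]
    return len(vocab)

def p_value(cluster, annotations, all_ids, n_samples=10_000, seed=None):
    """Probability, under the null hypothesis that the cluster is a random
    subset of the collection, of observing a description size at least as
    low as the cluster's own."""
    rng = np.random.default_rng(seed)
    observed = description_size(cluster, annotations)
    hits = sum(
        description_size(rng.choice(all_ids, size=len(cluster),
                                    replace=False), annotations) <= observed
        for _ in range(n_samples))
    return hits / n_samples
```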

25–28 [image-only slides; no transcript text]

29 Conclusions NN^k Networks as a representation of semantic relationships between images. NN^k Networks support fast interactive search but take O(n^2) time to construct. Structure can be extracted by partitioning the dual of the graph: even though each vertex's neighbours form a diverse set, the resulting subgraphs tend to be very uniform.

30 Future Directions Scalability: Google provides access to > 1.3 billion images. Update: can we add an image without having to reconstruct the whole network? Global features work well; greater discrimination could be achieved through domain-specific detectors (e.g. faces, background). Clustering: how does MCL fare in comparison with other techniques, e.g. spectral graph clustering?

