ICIP 2004, Singapore, October 25-27 A Comparison of Continuous vs. Discrete Image Models for Probabilistic Image and Video Retrieval Arjen P. de Vries.

1 ICIP 2004, Singapore, October 25-27 A Comparison of Continuous vs. Discrete Image Models for Probabilistic Image and Video Retrieval Arjen P. de Vries and Thijs Westerveld

2 Theory

3 Generative Models…
A statistical model for generating data
–Probability distribution over samples in a given ‘language’
$P(x_1 x_2 \mid M) = P(x_1 \mid M)\, P(x_2 \mid M, x_1)$ (the sample symbols were images on the slide; $x_1$ and $x_2$ stand in for them)
aka ‘Language Modelling’
© Victor Lavrenko, Aug. 2002

4 … in Information Retrieval
Basic question:
–What is the likelihood that this document is relevant to this query?
$P(\mathrm{rel} \mid I, Q) = P(I, Q \mid \mathrm{rel})\, P(\mathrm{rel}) / P(I, Q)$
$P(I, Q \mid \mathrm{rel}) = P(Q \mid I, \mathrm{rel})\, P(I \mid \mathrm{rel})$

5 Retrieval (Query generation) Models
[Figure: one query scored against each document model, $P(Q \mid M_1)$, $P(Q \mid M_2)$, $P(Q \mid M_3)$, $P(Q \mid M_4)$; labels: Query, Docs]

6 ‘Language Modeling’
Not just ‘English’
But also, the language of
–author
–newspaper
–text document
–image
Shakespeare or Dickens? Indeed the short and the long. Marry, ‘tis a noble Lepidus.

7 ‘Language Modeling’
Guardian or Times?
Not just ‘English’
But also, the language of
–author
–newspaper
–text document
–image

8 ‘Language Modeling’
[image] or [image]?
Not just English!
But also, the language of
–author
–newspaper
–text document
–image

9 The Fundamental Problem
Usually, we don’t know the model M
–But have a sample representative of that model
First estimate a model from a sample
Then compute the observation probability $P(\text{observation} \mid M(\text{sample}))$
© Victor Lavrenko, Aug. 2002

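10 Urn metaphor
Unigram Language Models
$P(x_1 x_2 x_3 x_4 \mid \text{urn}) \sim P(x_1 \mid \text{urn})\, P(x_2 \mid \text{urn})\, P(x_3 \mid \text{urn})\, P(x_4 \mid \text{urn}) = 4/9 \cdot 2/9 \cdot 4/9 \cdot 3/9$ (the drawn items were images on the slide; $x_1 \ldots x_4$ stand in for them)
© Victor Lavrenko, Aug. 2002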
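
The urn arithmetic on this slide can be reproduced with a small sketch. The colour names are hypothetical placeholders (the slide's images are not in the transcript); only the counts 4, 2 and 3 out of 9 are taken from the factors shown. `unigram_model` and `sequence_probability` are illustrative names, not part of the presentation.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical urn contents: colour names are placeholders, the counts
# (4, 2 and 3 out of 9) follow the factors 4/9, 2/9 and 3/9 on the slide.
urn = ["red"] * 4 + ["green"] * 2 + ["blue"] * 3


def unigram_model(sample):
    """Maximum-likelihood unigram model: relative frequency of each item."""
    counts = Counter(sample)
    total = len(sample)
    return {item: Fraction(n, total) for item, n in counts.items()}


def sequence_probability(sequence, model):
    """Probability of a sequence when items are drawn independently."""
    prob = Fraction(1)
    for item in sequence:
        prob *= model.get(item, Fraction(0))  # unseen items get probability 0
    return prob


model = unigram_model(urn)
observation = ["red", "green", "red", "blue"]  # factors 4/9, 2/9, 4/9, 3/9
likelihood = sequence_probability(observation, model)
print(likelihood)  # 32/2187, i.e. 96/6561 in lowest terms
```

Exact fractions are used so the product matches the slide's arithmetic without rounding.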

11 The Zero-frequency Problem
Suppose some event not in our example
–Model may assign zero probability to that event
–And to any set of events involving the unseen event

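12 Smoothing
Idea: shift part of probability mass to unseen events
Interpolation with background model
–Reflects expected frequency of events
–Plays role of IDF (inverse document freq.)
–$\lambda\, P(x \mid D) + (1-\lambda)\, P(x \mid C)$ (symbols were lost in the transcript; here $\lambda$ is the interpolation weight, $D$ the document model, $C$ the background model)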
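
A minimal sketch of the zero-frequency problem and of interpolation smoothing, under stated assumptions: the weight name `lam`, the toy "documents" and their visual-word labels are all hypothetical, and the background model is simply estimated from the pooled documents.

```python
from collections import Counter


def ml_model(sample):
    """Maximum-likelihood model: relative frequency of each item."""
    counts = Counter(sample)
    total = len(sample)
    return {item: n / total for item, n in counts.items()}


def smoothed_prob(item, doc_model, background_model, lam=0.5):
    """lam * P(item | document) + (1 - lam) * P(item | background)."""
    return lam * doc_model.get(item, 0.0) + (1.0 - lam) * background_model.get(item, 0.0)


def query_likelihood(query, doc_model, background_model, lam=0.5):
    """Product of smoothed probabilities of the query items."""
    prob = 1.0
    for item in query:
        prob *= smoothed_prob(item, doc_model, background_model, lam)
    return prob


# Toy collection (hypothetical labels standing in for image samples).
docs = {
    "d1": ["sky", "sky", "sea", "sand"],
    "d2": ["sky", "grass", "grass", "tree"],
}
background = ml_model([item for doc in docs.values() for item in doc])
models = {name: ml_model(doc) for name, doc in docs.items()}

query = ["sky", "tree"]
# lam = 1.0 switches smoothing off: d1 never saw "tree", so its score is zero.
unsmoothed = {n: query_likelihood(query, m, background, lam=1.0) for n, m in models.items()}
# lam = 0.5 moves probability mass to unseen events via the background model.
smoothed = {n: query_likelihood(query, m, background, lam=0.5) for n, m in models.items()}
print(unsmoothed)
print(smoothed)
```

With smoothing, `d1` keeps a non-zero score while `d2`, which actually contains both query items, still ranks higher.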

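13 The IDF Role of Smoothing
$\dfrac{\lambda\, P(x \mid D) + (1-\lambda)\, P(x \mid C)}{(1-\lambda)\, P(x \mid C)} = \dfrac{\lambda\, P(x \mid D)}{(1-\lambda)\, P(x \mid C)} + 1$ (symbols were lost in the transcript; here $\lambda$ is the interpolation weight, $D$ the document model, $C$ the background model)
–Ranking independent of [symbol lost in the transcript]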
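
A worked version of this step for a whole query, as a sketch in assumed notation ($\lambda$ for the interpolation weight, $D$ for the document model, $C$ for the background model, $Q$ for the set of query samples; none of these symbols survive in the transcript). Dividing the smoothed query likelihood by a factor that does not depend on the document leaves the ranking unchanged, and the remaining ratio weights each sample by the inverse of its background probability, which is the IDF-like effect.

```latex
\begin{align*}
P(Q \mid D) &= \prod_{x \in Q} \bigl[ \lambda\, P(x \mid D) + (1-\lambda)\, P(x \mid C) \bigr] \\
\frac{P(Q \mid D)}{\prod_{x \in Q} (1-\lambda)\, P(x \mid C)}
  &= \prod_{x \in Q} \left[ \frac{\lambda\, P(x \mid D)}{(1-\lambda)\, P(x \mid C)} + 1 \right]
\end{align*}
```

The denominator on the left is the same for every document, so sorting documents by either side gives the same order.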

14 Practice

15 Image Retrieval
Pixel level: no semantics
Pixel blocks/regions

16 Modelling Images
Compute local features
–E.g., blueness and yellowness
0.2567 0.3294 0.1334 0.1664 0.3125 0.3714 0.3288 0.4624 0.1854 0.2308...

18 Discrete Model
[Figure: axes yellow, blue]

19 Discrete Model

20 Modelling Images
[Figure: axes blue, yellow]
Histogram also models empty regions in the feature space
Boundaries are hard
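
A minimal sketch of a histogram (discrete) model over two features, illustrating both points on this slide. Assumptions: the ten numbers on the feature slide are read as five (blue, yellow) pairs, features lie in [0, 1], and a 4 x 4 equal-width grid is used; the presentation does not state the actual partitioning.

```python
from collections import Counter


def bin_index(value, bins):
    """Map a value in [0, 1] to one of `bins` equal-width cells."""
    return min(int(value * bins), bins - 1)


def histogram_model(samples, bins=4):
    """Discrete model: relative frequency of each occupied (blue, yellow) cell."""
    counts = Counter((bin_index(b, bins), bin_index(y, bins)) for b, y in samples)
    total = len(samples)
    return {cell: n / total for cell, n in counts.items()}


# Assumption: the feature values are interpreted as (blue, yellow) pairs.
samples = [
    (0.2567, 0.3294),
    (0.1334, 0.1664),
    (0.3125, 0.3714),
    (0.3288, 0.4624),
    (0.1854, 0.2308),
]
model = histogram_model(samples, bins=4)
empty_cells = 4 * 4 - len(model)
print(model)        # {(1, 1): 0.6, (0, 0): 0.4}
print(empty_cells)  # 14
```

Fourteen of the sixteen cells stay empty, and the samples with blue values 0.1854 and 0.2567 end up in different cells despite being close, which is the hard-boundary effect.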

21 Continuous Model
Build Gaussian Mixture model using expectation maximisation (EM)
2 Components
–Centers, covariance
–Random initialisation
[Figure: axes blue, yellow]
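
A pure-Python sketch of fitting a two-component Gaussian mixture with EM. It is not the authors' implementation: it simplifies to diagonal covariances, uses a fixed iteration count and a variance floor (`min_var`), and runs on synthetic two-cluster data; all function and parameter names are illustrative.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of an axis-aligned (diagonal-covariance) Gaussian at point x."""
    density = 1.0
    for xi, mi, vi in zip(x, mean, var):
        density *= math.exp(-0.5 * (xi - mi) ** 2 / vi) / math.sqrt(2.0 * math.pi * vi)
    return density


def fit_gmm(points, k=2, iterations=100, seed=0, min_var=1e-4):
    """EM for a k-component mixture; returns (weights, means, variances)."""
    rng = random.Random(seed)
    dims = len(points[0])
    means = [list(p) for p in rng.sample(points, k)]  # random initialisation
    variances = [[0.05] * dims for _ in range(k)]
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: responsibility of each component for each point.
        resp = []
        for p in points:
            joint = [w * gaussian_pdf(p, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(joint) or 1e-300  # guard against numerical underflow
            resp.append([j / total for j in joint])
        # M-step: re-estimate weight, centre and variance of each component.
        for c in range(k):
            mass = sum(r[c] for r in resp) or 1e-300
            weights[c] = mass / len(points)
            for d in range(dims):
                means[c][d] = sum(r[c] * p[d] for r, p in zip(resp, points)) / mass
                var = sum(r[c] * (p[d] - means[c][d]) ** 2 for r, p in zip(resp, points)) / mass
                variances[c][d] = max(var, min_var)  # floor avoids collapse
    return weights, means, variances


def mixture_density(x, weights, means, variances):
    """Mixture density at x: weighted sum of the component densities."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances))


# Synthetic (blue, yellow) samples from two clusters, for illustration only.
data_rng = random.Random(1)
points = [(data_rng.gauss(0.2, 0.03), data_rng.gauss(0.3, 0.03)) for _ in range(40)]
points += [(data_rng.gauss(0.7, 0.03), data_rng.gauss(0.6, 0.03)) for _ in range(40)]

weights, means, variances = fit_gmm(points)
print(weights, means)
```

Unlike a histogram, the fitted density is smooth, so a sample just outside the observed cluster still receives non-zero probability.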

22 Continuous Model

23 Discrete vs. Continuous
Discrete Model
–Low indexing cost (binning)
–Low retrieval cost (inverted file)
–But… how to partition the indexing space?
Continuous Model
–High indexing cost (EM algorithm)
–High retrieval cost (access all data)
–But… less overfitting → better generalisation
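
The retrieval-cost contrast can be illustrated with a sketch of an inverted file over histogram cells: only images that share an occupied cell with the query are ever touched, whereas a continuous model has to evaluate every image's mixture. The image names and cell probabilities below are hypothetical, and this is one possible layout rather than the system described in the talk.

```python
from collections import defaultdict


def build_inverted_file(doc_models):
    """Map each occupied histogram cell to the documents that contain it."""
    index = defaultdict(dict)
    for doc_id, model in doc_models.items():
        for cell, prob in model.items():
            index[cell][doc_id] = prob
    return index


def candidates(query_cells, index):
    """Documents sharing at least one cell with the query; others are skipped."""
    found = set()
    for cell in query_cells:
        found.update(index.get(cell, {}))
    return found


# Hypothetical discrete models: cell -> probability, per image.
doc_models = {
    "img1": {(1, 1): 0.6, (0, 0): 0.4},
    "img2": {(3, 2): 1.0},
    "img3": {(0, 0): 0.5, (3, 3): 0.5},
}
index = build_inverted_file(doc_models)
print(sorted(candidates([(0, 0)], index)))  # ['img1', 'img3']
```

With smoothing, images outside the candidate set all receive the same background-only score, so they need no per-image computation.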

24 Experiments
TRECVID2003 search task
–Discrete vs. Continuous
–Regions vs. full
Query examples
–All examples vs. designated only
Mean average precision
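
For reference, mean average precision can be computed as in the sketch below, which uses the common non-interpolated definition. The ranked lists and relevance sets are made up for illustration and are not TRECVID data; the official evaluation tooling may differ in details such as ranking cut-offs.

```python
def average_precision(ranked, relevant):
    """Mean of the precision values at the ranks where relevant items appear."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    # Relevant items that were never retrieved contribute zero precision.
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """runs: list of (ranked_list, relevant_set) pairs, one per query."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Made-up results for two queries.
runs = [
    (["a", "b", "c", "d"], {"a", "c"}),  # AP = (1/1 + 2/3) / 2 = 5/6
    (["x", "y", "z"], {"z"}),            # AP = (1/3) / 1 = 1/3
]
map_score = mean_average_precision(runs)
print(round(map_score, 4))  # 0.5833
```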

25 Results
Continuous Model significantly better on almost all queries
However, Discrete Model significantly better for a small number of highly focused queries (e.g., flames, airplane taking off)
–More analysis needed though

26 Conclusions
Language modelling approach to IR also applicable to retrieval of other media
Discrete vs. Continuous Model
–Continuous Model almost always better
–Unfortunately, Discrete Model far easier to implement efficiently

27 Future Work
Improve Sampling Process
–Better texture representation?
–Overlapping, multi-scale image patches?
Improve Discrete Model
–Partitioning of feature space in grid cells
Compare the performance of the two models in interactive setting with relevance feedback
–Higher quality per iteration vs. many iterations

