ICME 2004 Tzvetanka I. Ianeva Arjen P. de Vries Thijs Westerveld A Dynamic Probabilistic Multimedia Retrieval Model.


1 ICME 2004 Tzvetanka I. Ianeva Arjen P. de Vries Thijs Westerveld A Dynamic Probabilistic Multimedia Retrieval Model

2 ICME 2004 Introduction Video representation schemes used for retrieval: –Static –Spatio-temporal Video is a temporal medium, so a ‘good’ model should overcome the limitations of keyframe-based shot representation

3 ICME 2004 Spatio-temporal grouping Spatial priority and tracking of regions from frame to frame Joint spatial and temporal segmentation –Human vision finds salient structures jointly in space and time (Gepshtein and Kubovy, 2000)

4 ICME 2004 Motivation Pursue video retrieval instead of image (keyframe) retrieval Extension of the Static Probabilistic Multimedia Retrieval model (2003) GMM in DCT-space-time domain –Diagonal covariance

5 ICME 2004 Static Model [figure: Docs → Models] Indexing - Estimate Gaussian Mixture Models from images using EM - Based on feature vector with colour, texture and position information from pixel blocks - Fixed number of components

6 ICME 2004 Static Model Indexing –Estimate a Gaussian Mixture Model from each keyframe (using EM) –Fixed number of components (C=8) –Feature vectors contain colour, texture, and position information from pixel blocks:
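The indexing step on this slide can be illustrated with a small EM fit of a diagonal-covariance Gaussian mixture. This is only a sketch under assumptions, not the authors' implementation: the function names, the random initialization from data points, the iteration count, and the variance floor `var_floor` are all illustrative choices, and the feature vectors are treated as plain lists of floats (in the talk they hold colour, texture, and position values from pixel blocks).

```python
import math
import random

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def logsumexp(values):
    """Numerically stable log(sum(exp(v)))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def fit_gmm(samples, n_components=8, n_iter=25, var_floor=1e-3, seed=0):
    """Fit a diagonal GMM with EM; returns (weights, means, variances).

    var_floor is an assumed safeguard that keeps variances away from zero.
    """
    rng = random.Random(seed)
    dim = len(samples[0])
    # Assumed initialization: pick random samples as starting means.
    means = [list(s) for s in rng.sample(samples, n_components)]
    variances = [[1.0] * dim for _ in range(n_components)]
    weights = [1.0 / n_components] * n_components
    for _ in range(n_iter):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in samples:
            logs = [math.log(w) + log_gauss_diag(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            norm = logsumexp(logs)
            resp.append([math.exp(l - norm) for l in logs])
        # M-step: re-estimate weights, means, and floored variances.
        totals = [sum(r[c] for r in resp) + 1e-12 for c in range(n_components)]
        grand = sum(totals)
        for c in range(n_components):
            weights[c] = totals[c] / grand
            means[c] = [sum(r[c] * x[d] for r, x in zip(resp, samples)) / totals[c]
                        for d in range(dim)]
            variances[c] = [
                max(var_floor,
                    sum(r[c] * (x[d] - means[c][d]) ** 2
                        for r, x in zip(resp, samples)) / totals[c])
                for d in range(dim)
            ]
    return weights, means, variances
```

Calling `fit_gmm(block_features, n_components=8)` on the block feature vectors of one keyframe would give one model per document, matching the fixed C=8 on the slide.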

7 ICME 2004 Static Model Retrieval –Calculate conditional probabilities of query samples given models in collection: P(Q|M_1), P(Q|M_2), P(Q|M_3), P(Q|M_4) [figure: query compared against models M_1 to M_4]
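The retrieval step, scoring query samples against every model in the collection, might look like the sketch below. It assumes the query samples are treated as independent given a model, so their log-likelihoods add up; the names `query_log_likelihood` and `rank_models` and the `(weights, means, variances)` model layout are hypothetical.

```python
import math

def component_log_density(x, mean, var):
    """Log density of one diagonal-covariance Gaussian component."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def query_log_likelihood(query_samples, model):
    """log P(Q|M), assuming query samples are independent given the model."""
    weights, means, variances = model
    total = 0.0
    for x in query_samples:
        logs = [math.log(w) + component_log_density(x, m, v)
                for w, m, v in zip(weights, means, variances)]
        top = max(logs)  # stable log-sum-exp over mixture components
        total += top + math.log(sum(math.exp(l - top) for l in logs))
    return total

def rank_models(query_samples, models):
    """Return model names sorted from most to least likely for the query."""
    scores = {name: query_log_likelihood(query_samples, model)
              for name, model in models.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

Working in log space avoids underflow when a query contributes many samples.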

8 ICME 2004 Dynamic Model Selecting frames – 1 second sequence around the keyframe – Entire video shot as sequence of frames sampled at regular intervals Features

9 ICME 2004 Dynamic Model Indexing: GMM of multiple frames around keyframe Feature vectors extended with time-stamp normalized in [0,1] [figure: time axis marked at 0.5 and 1]
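Extending the block features with a normalized time-stamp could be sketched as follows. The function name `add_timestamps` and the input layout (a list of frames, each a list of block feature vectors) are assumptions for illustration; mapping a single frame to 0.5 is likewise an assumed convention.

```python
def add_timestamps(frames):
    """Append a time feature in [0, 1] to every block vector of every frame.

    frames: list of frames in temporal order; each frame is a list of
    block feature vectors (lists of floats).
    """
    n = len(frames)
    extended = []
    for i, blocks in enumerate(frames):
        # First frame gets 0.0, last frame gets 1.0; a lone frame gets 0.5.
        t = 0.5 if n == 1 else i / (n - 1)
        extended.extend(list(vec) + [t] for vec in blocks)
    return extended
```

The pooled, time-stamped vectors from all sampled frames would then be used to estimate a single GMM for the shot.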

10 ICME 2004 Dynamic Model

11 ICME 2004 Query example: A single image Artificial sequence of 29 images as the single query example where the time is normalized between 0 and 1 Extend the query example image’s features with a fixed temporal feature value of 0.5 – Better results and lower computational cost
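The two ways of turning a single query image into a space-time query can be contrasted in a short sketch. Both helper names are hypothetical, and the input is assumed to be the list of block feature vectors of the query image.

```python
def query_as_sequence(blocks, n_copies=29):
    """Repeat the image n_copies times with time running from 0 to 1."""
    out = []
    for i in range(n_copies):
        t = i / (n_copies - 1)
        out.extend(list(vec) + [t] for vec in blocks)
    return out

def query_with_fixed_time(blocks, t=0.5):
    """Use the image once, with a fixed temporal feature value."""
    return [list(vec) + [t] for vec in blocks]
```

The artificial sequence produces 29 times as many query samples to score, which is consistent with the slide's remark that the fixed value of 0.5 has the lower computational cost.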

12 ICME 2004 Dynamic Model Advantages More training data for models –Less sensitive to random initialization Reduced dependency upon selecting appropriate keyframe Some spatio-temporal aspects of shot are captured –(Dis-)appearance of objects

13 ICME 2004 Dynamic Model

14 ICME 2004 Dynamic Model

15 ICME 2004 Dynamic Model

16 ICME 2004 Retrieval Framework Smoothing Building dynamic GMMs Likelihood goes to infinity ???

17 ICME 2004 Experimental Set-up Build models for each shot –Static, Dynamic, Language Build Queries from topics –Construct simple keyword text query –Select visual example –Rescale and compress example images to match video size and quality

18 ICME 2004 Combining Modalities Independence assumption textual/visual –P(Q_t, Q_v | Shot) = P(Q_t | LM) * P(Q_v | GMM) Combination works if both runs useful [CWI:TREC:2002] Dynamic run more useful than static run
Run            MAP
ASR only       .130
Static only    .022
Static+ASR     .105
Dynamic only   .022
Dynamic+ASR    .132
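Under the independence assumption on this slide, the joint score of a shot is the product of the textual and visual probabilities, which becomes a sum in log space. The sketch below assumes each run is available as a dictionary from shot identifier to log-probability; the function name and that data layout are illustrative, not taken from the talk.

```python
def combine_modalities(text_logp, visual_logp):
    """Rank shots by log P(Q_t|LM) + log P(Q_v|GMM).

    Only shots scored by both runs are ranked (an assumed policy).
    """
    shots = text_logp.keys() & visual_logp.keys()
    combined = {s: text_logp[s] + visual_logp[s] for s in shots}
    return sorted(combined, key=combined.get, reverse=True)
```

For example, a shot with a mediocre ASR score can still rise to the top if its visual score is much better than that of the other shots.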

19 ICME 2004 Combining Modalities Dynamic: Higher Initial Precision

20 ICME 2004 Dynamic: Higher initial precision Static run Dynamic run

21 ICME 2004 Dow Jones Topic (120)

22 ICME 2004 Dow Jones Topic (120) “Dow Jones Industrial Average rise day points” + =

23 ICME 2004 Conclusions Dynamic model captures visual similarity better –Spatio-temporal aspects –More training data –Appropriate keyframe less critical –Less sensitive to the random initialization ASR + dynamic better than either alone

24 ICME 2004 Future work More data needs more computational effort – optimizations? Avoid the singular solutions Dynamic number of components? Full covariance in space-time Integration of audio

25 ICME 2004 Thanks !!!

26 ICME 2004 Merging Run Results Combining (conflicting) examples difficult [CWI:TREC:2002] Single example → miss relevant shots Round-Robin Merging [figure: two ranked lists, 1 to 10 each, interleaved into a combined list]

27 ICME 2004 Merging Run Results

28 ICME 2004 Merging Run Results Combining (conflicting) examples difficult [CWI:TREC:2002] Single example → miss relevant shots Round-Robin Merging [figure: two ranked lists, 1 to 10 each, interleaved into a combined list]
Examples   Visual only   +ASR
Single     .022          .132
All        .031          .149
Selected   .039          .151
Best       .050          .155
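Round-robin merging of the ranked lists obtained for several query examples can be sketched as below. Skipping shots that already appear in the merged list and the optional `limit` cut-off are assumptions; the slide does not say how duplicates are handled.

```python
def round_robin_merge(ranked_lists, limit=None):
    """Interleave ranked lists: rank 1 of each run, then rank 2, and so on.

    Shots already merged are skipped (assumed duplicate policy).
    """
    merged, seen = [], set()
    longest = max((len(run) for run in ranked_lists), default=0)
    for rank in range(longest):
        for run in ranked_lists:
            if rank < len(run) and run[rank] not in seen:
                seen.add(run[rank])
                merged.append(run[rank])
                if limit is not None and len(merged) >= limit:
                    return merged
    return merged
```

Each input list would be the ranking for one query example, for instance one already combined with the ASR run.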

29 ICME 2004 Conclusions Visual aspects of an information need are best captured by using multiple examples Combining results for multiple (good) examples in round-robin fashion, each ranked on both modalities, gives near-best performance for almost all topics

