1 PSEUDO-RELEVANCE FEEDBACK FOR MULTIMEDIA RETRIEVAL 2011-11709 Seo Seok Jun

2 Abstract
Video information retrieval
◦ Finding information relevant to a query
Approach
◦ Pseudo-relevance feedback (PRF)
◦ Negative PRF

3 Questions
How does this paper approach content-based video retrieval?
What is the advantage of negative PRF?
What does this paper do to remove extreme outliers?

4 Introduction
Content-based access to video information
CBVR
◦ Allows users to query and retrieve based on audio and video content
◦ Limitation
 Captures fairly low-level physical features
 Color, texture, shape, …
 Difficult to determine similarity metrics
 Different query scenarios -> different similarity metrics
 Animals -> by shape
 Sky, water -> by color

5 Introduction
◦ Making the similarity metric adaptive
Adapting the similarity metric
◦ Automatically discover the discriminating feature subspace
◦ How?
 Cast retrieval as a classification problem
 Margin-based classifiers
 SVMs, AdaBoost
 High performance
 Learn the maximal-margin hyperplane
 A user's query provides only a small amount of positive data, with no explicit negative data at all

6 Introduction
◦ Thus, to use these classifiers, more training data is needed
 Negative examples
 Random sampling
 Since the number of positive items in a collection is very small
 Risk: positive examples might be included as negatives
 In standard relevance feedback
 Ask the user to label examples
 Tedious!
 Automatic retrieval is essential!

7 Introduction
 Automatic relevance feedback
 Based on a generic metric, not tailored to specific queries
 Negative feedback -> sample the bottom-ranked examples (see the sketch below)
 Ex) car -> bottom-ranked examples differ from the query images in “shape”
 Feed back the negative data -> re-weight
 Refine the discriminating feature subspace
 The learned metric should outperform a universal similarity metric (one used for all queries)
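
The slides describe this bottom-ranked sampling only in words; the following is a minimal sketch of the idea, assuming base_scores holds the base-metric similarity of every item in the collection to the query and k is a hypothetical sample-size parameter (not specified in the slides):

```python
import numpy as np

def sample_pseudo_negatives(base_scores, k):
    """Pick the k items ranked lowest by the base similarity metric
    and treat them as negative feedback, with no user labeling."""
    order = np.argsort(base_scores)  # ascending: least similar first
    return order[:k]                 # indices of the pseudo-negative items
```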

8 Introduction
Learning process
◦ Purpose
 Discover a better similarity metric
 Find the most discriminating subspace between positive and negative examples
◦ Cannot produce a fully accurate classification
 The training data is too small
◦ Negative distribution -> not reliable!
◦ Risk! -> feedback from an incorrect estimate
◦ Solution: combine with a generic similarity metric!

9 Related work
Briefly discusses some features of the complete system
◦ The Informedia Digital Video Library
◦ Relevance and Pseudo-Relevance Feedback

10 Pseudo-Relevance Feedback
Similar to relevance feedback
◦ Both originated in document retrieval
◦ Works without any user intervention
◦ Few studies in multimedia retrieval yet
 Can no longer assume the top-ranked results are always relevant
 Relatively poor performance of visual retrieval

11 Pseudo-Relevance Feedback
Positive-example-based learning
◦ Partially supervised learning
◦ Begins with a small number of positive examples
◦ No negative examples
◦ Goal: associate every example in the collection with one of the given categories
 Our goal?
 Producing a ranked list of the examples

12 Pseudo-Relevance Feedback
Semi-supervised learning
◦ Two classifiers
◦ A training set of labeled data
◦ A working set of unlabeled data
Transductive learning
◦ A paradigm for utilizing the information in unlabeled data
◦ Successful in image retrieval
◦ Computation is too expensive
 Multimedia -> large collections

13 Pseudo-Relevance Feedback
Query: text + audio + image/video
Retrieving a set of relevant video shots
◦ A permutation of the video shots
◦ Sorted by their similarity
 Difference between two video segments -> similarity metric
◦ Video features
 Multiple perspectives
 Speech transcript, audio, camera motion, video frames

14 Pseudo-Relevance Feedback
Retrieval as a classification problem
◦ The data collection can be separated into positive/negative
◦ Mean average precision (MAP)
 Precision and recall are the common measures
 But they do not take the rank into consideration
 MAP: the area under an ideal recall/precision curve (computed below)
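
The slides define MAP only verbally; as a concrete reference, here is a minimal sketch of the standard computation over ranked lists of 1/0 relevance flags (the exact evaluation tooling used in the paper is not shown in the slides):

```python
import numpy as np

def average_precision(ranked_relevance):
    """AP for one query: mean of precision@rank over the relevant hits."""
    rel = np.asarray(ranked_relevance, dtype=float)
    if rel.sum() == 0:
        return 0.0
    precision_at_rank = np.cumsum(rel) / (np.arange(len(rel)) + 1)
    return float((precision_at_rank * rel).sum() / rel.sum())

def mean_average_precision(per_query_relevance):
    """MAP: average precision, averaged over all queries."""
    return float(np.mean([average_precision(r) for r in per_query_relevance]))

# Example: two queries with relevance flags in ranked order.
print(mean_average_precision([[1, 0, 1, 0], [0, 1, 0, 0]]))  # ~0.667
```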

15 Pseudo-Relevance Feedback
PRF
◦ Replaces users' relevance judgments with the output of a base similarity metric
◦ f_b: base similarity metric
◦ p: sampling strategy
◦ f_l: learning algorithm
◦ g: combination strategy (the components are wired together in the sketch below)
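
The deck names the four components without code; a minimal sketch of how they could be wired together, assuming features is an items-by-dimensions matrix, query_idx indexes the query examples, and n_neg is a hypothetical negative sample size (the combination step g is sketched separately after slide 25):

```python
import numpy as np
from sklearn.svm import SVC

def prf_scores(features, query_idx, base_scores, n_neg=50):
    """One round of pseudo-relevance feedback.

    f_b: base_scores, the base metric's similarity of each item to the query
    p:   query examples as positives, bottom-ranked items as pseudo-negatives
    f_l: a margin-based classifier (an SVM here) on the pseudo-labeled data
    """
    neg_idx = np.argsort(base_scores)[:n_neg]               # p: pseudo-negatives
    X = np.vstack([features[query_idx], features[neg_idx]])
    y = np.concatenate([np.ones(len(query_idx)), np.zeros(n_neg)])
    clf = SVC(kernel="rbf").fit(X, y)                       # f_l: margin-based learner
    return clf.decision_function(features)                  # learned score for every item
```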

16 Pseudo-Relevance Feedback

17 Algorithm Details

20 Pseudo-Relevance Feedback
Positive examples
◦ The query examples
Negative examples
◦ The strongest negative examples
Feedback only one time
◦ Computational issue
Automatically feed back the training data based on the generic similarity metric
◦ To learn an adaptive similarity metric
◦ Generalizes the discriminating subspace across various queries

21 Pseudo-Relevance Feedback
Why does it work?
◦ The good generalization ability of margin-based learning algorithms
Assuming an isotropic data distribution -> invalid
◦ The discriminating directions vary with different queries and topics
 Sky -> color
 Car -> shape
◦ In this case, PRF provides a better similarity metric than a generic one.

22 Pseudo-Relevance Feedback
Two test cases
◦ Positive data
 Along the edge of the data collection
 At the center of the data collection
◦ In both cases
 PRF is superior
 The base similarity metric is a generic metric
 It cannot be modified across queries

23 Pseudo-Relevance Feedback

24 The PRF metric can be adapted based on the global data distribution and the training data
◦ By feeding back the negative examples
◦ Near-optimal decision boundary
Associates a higher score with examples
◦ Farther away from the negative data
◦ Good when the positive data are near the margin
 Common in high-dimensional spaces

25 Pseudo-Relevance Feedback
Downside
◦ Some negative outliers are assigned a higher score than any positive data -> more false alarms
◦ Solution (see the combination sketch below)
 Combine the base metric and the PRF metric
 Smooths out most of the outliers
 Just a simple linear combination (1:1)
 A reasonable trade-off between local classification behavior and global discriminating ability
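
The slides specify only that the combination g is an equal-weight (1:1) linear combination; a minimal sketch, with min-max normalization added as an assumption so the two score ranges are comparable:

```python
import numpy as np

def normalize(scores):
    """Min-max scale raw scores to [0, 1] so the two metrics are comparable."""
    lo, hi = scores.min(), scores.max()
    return (scores - lo) / (hi - lo + 1e-12)

def combined_scores(base_scores, prf_scores):
    """g: 1:1 linear combination; the base-metric term smooths out
    negative outliers that the learned PRF metric scores too high."""
    return 0.5 * normalize(base_scores) + 0.5 * normalize(prf_scores)
```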

26 Experiment
Video: TREC Video Retrieval Track (NIST)
◦ 40 hours of MPEG-1 video
Audio: split from the video
◦ Down-sampled to 16 kHz, 16-bit samples
Speech recognition system
◦ Broadcast news transcripts
Image processing side
◦ Low-level image features: color and texture
◦ Queries given as XML

27 Experiment

28 Results

29 Results

30 Results

31 Results

32 Results

33 Conclusion
Retrieval cast as a classification task
Machine learning theory applied to video retrieval
SVMs learn to weight the discriminating features
Negative PRF
◦ Separates the means of the distributions of the negative and positive examples
Smoothing via combination with the base metric

