PSEUDO-RELEVANCE FEEDBACK FOR MULTIMEDIA RETRIEVAL 2011-11709 Seo Seok Jun.

Similar presentations
Relevance Feedback A relevance feedback mechanism for content- based image retrieval G. Ciocca, R. Schettini 1999.

Pseudo-Relevance Feedback For Multimedia Retrieval By Rong Yan, Alexander G. and Rong Jin Mwangi S. Kariuki
ECG Signal processing (2)
Imbalanced data David Kauchak CS 451 – Fall 2013.
Evaluation of Decision Forests on Text Categorization
Query Chains: Learning to Rank from Implicit Feedback Paper Authors: Filip Radlinski Thorsten Joachims Presented By: Steven Carr.
Learning Techniques for Video Shot Detection Under the guidance of Prof. Sharat Chandran by M. Nithya.
Search Engines Information Retrieval in Practice All slides ©Addison Wesley, 2008.
Machine learning continued Image source:
Relevance Feedback Content-Based Image Retrieval Using Query Distribution Estimation Based on Maximum Entropy Principle Irwin King and Zhong Jin Nov
Content-based Video Indexing, Classification & Retrieval Presented by HOI, Chu Hong Nov. 27, 2002.
Robust Moving Object Detection & Categorization using self- improving classifiers Omar Javed, Saad Ali & Mubarak Shah.
Evaluating Search Engine
Chapter 11 Beyond Bag of Words. Question Answering n Providing answers instead of ranked lists of documents n Older QA systems generated answers n Current.
1 CS 430: Information Discovery Lecture 22 Non-Textual Materials 2.
Image Search Presented by: Samantha Mahindrakar Diti Gandhi.
CS335 Principles of Multimedia Systems Content Based Media Retrieval Hao Jiang Computer Science Department Boston College Dec. 4, 2007.
Presentation Outline  Project Aims  Introduction of Digital Video Library  Introduction of Our Work  Considerations and Approach  Design and Implementation.
1 Integrating User Feedback Log into Relevance Feedback by Coupled SVM for Content-Based Image Retrieval 9-April, 2005 Steven C. H. Hoi *, Michael R. Lyu.
1 LM Approaches to Filtering Richard Schwartz, BBN LM/IR ARDA 2002 September 11-12, 2002 UMASS.
Presentation in IJCNN 2004 Biased Support Vector Machine for Relevance Feedback in Image Retrieval Hoi, Chu-Hong Steven Department of Computer Science.
Dept. of Computer Science & Engineering, CUHK Pseudo Relevance Feedback with Biased Support Vector Machine in Multimedia Retrieval Steven C.H. Hoi 14-Oct,
Presented by Zeehasham Rasheed
ICME 2004 Tzvetanka I. Ianeva Arjen P. de Vries Thijs Westerveld A Dynamic Probabilistic Multimedia Retrieval Model.
Video Search Engines and Content-Based Retrieval Steven C.H. Hoi CUHK, CSE 18-Sept, 2006.
An investigation of query expansion terms Gheorghe Muresan Rutgers University, School of Communication, Information and Library Science 4 Huntington St.,
The Relevance Model  A distribution over terms, given information need I, (Lavrenko and Croft 2001). For term r, P(I) can be dropped w/o affecting the.
Optimizing Learning with SVM Constraint for Content-based Image Retrieval* Steven C.H. Hoi 1th March, 2004 *Note: The copyright of the presentation material.
CS Machine Learning. What is Machine Learning? Adapt to / learn from data  To optimize a performance function Can be used to:  Extract knowledge.
Information Retrieval in Practice
Content-Based Video Retrieval System Presented by: Edmund Liang CSE 8337: Information Retrieval.
July 11, 2001Daniel Whiteson Support Vector Machines: Get more Higgs out of your data Daniel Whiteson UC Berkeley.
Active Learning for Class Imbalance Problem
Text- and Content-based Approaches to Image Retrieval for the ImageCLEF 2009 Medical Retrieval Track Matthew Simpson, Md Mahmudur Rahman, Dina Demner-Fushman,
Multimedia Databases (MMDB)
Smart RSS Aggregator A text classification problem Alban Scholer & Markus Kirsten 2005.
©2008 Srikanth Kallurkar, Quantum Leap Innovations, Inc. All rights reserved. Apollo – Automated Content Management System Srikanth Kallurkar Quantum Leap.
Content-Based Image Retrieval
A Simple Unsupervised Query Categorizer for Web Search Engines Prashant Ullegaddi and Vasudeva Varma Search and Information Extraction Lab Language Technologies.
Improving Web Spam Classification using Rank-time Features September 25, 2008 TaeSeob,Yun KAIST DATABASE & MULTIMEDIA LAB.
Finding Better Answers in Video Using Pseudo Relevance Feedback Informedia Project Carnegie Mellon University Carnegie Mellon Question Answering from Errorful.
Classifying Images with Visual/Textual Cues By Steven Kappes and Yan Cao.
Processing of large document collections Part 3 (Evaluation of text classifiers, term selection) Helena Ahonen-Myka Spring 2006.
1 CS 430: Information Discovery Lecture 22 Non-Textual Materials: Informedia.
Greedy is not Enough: An Efficient Batch Mode Active Learning Algorithm Chen, Yi-wen( 陳憶文 ) Graduate Institute of Computer Science & Information Engineering.
LANGUAGE MODELS FOR RELEVANCE FEEDBACK Lee Won Hee.
Chapter 4: Pattern Recognition. Classification is a process that assigns a label to an object according to some representation of the object’s properties.
CSSE463: Image Recognition Day 11 Lab 4 (shape) tomorrow: feel free to start in advance Lab 4 (shape) tomorrow: feel free to start in advance Test Monday.
Chapter 8 Evaluating Search Engine. Evaluation n Evaluation is key to building effective and efficient search engines  Measurement usually carried out.
Visual Categorization With Bags of Keypoints Original Authors: G. Csurka, C.R. Dance, L. Fan, J. Willamowski, C. Bray ECCV Workshop on Statistical Learning.
Probabilistic Latent Query Analysis for Combining Multiple Retrieval Sources Rong Yan Alexander G. Hauptmann School of Computer Science Carnegie Mellon.
Bing LiuCS Department, UIC1 Chapter 8: Semi-supervised learning.
1 Applications of video-content analysis and retrieval IEEE Multimedia Magazine 2002 JUL-SEP Reporter: 林浩棟.
CSSE463: Image Recognition Day 14 Lab due Weds, 3:25. Lab due Weds, 3:25. My solutions assume that you don't threshold the shapes.ppt image. My solutions.
Jen-Tzung Chien, Meng-Sung Wu Minimum Rank Error Language Modeling.
1 Adaptive Subjective Triggers for Opinionated Document Retrieval (WSDM 09’) Kazuhiro Seki, Kuniaki Uehara Date: 11/02/09 Speaker: Hsu, Yu-Wen Advisor:
26/01/20161Gianluca Demartini Ranking Categories for Faceted Search Gianluca Demartini L3S Research Seminars Hannover, 09 June 2006.
1 CS 430 / INFO 430 Information Retrieval Lecture 17 Metadata 4.
Carnegie Mellon School of Computer Science Language Technologies Institute CMU Team-1 in TDT 2004 Workshop 1 CMU TEAM-A in TDT 2004 Topic Tracking Yiming.
11 A Classification-based Approach to Question Routing in Community Question Answering Tom Chao Zhou 1, Michael R. Lyu 1, Irwin King 1,2 1 The Chinese.
Virtual Examples for Text Classification with Support Vector Machines Manabu Sassano Proceedings of the 2003 Conference on Emprical Methods in Natural.
Identifying “Best Bet” Web Search Results by Mining Past User Behavior Author: Eugene Agichtein, Zijian Zheng (Microsoft Research) Source: KDD2006 Reporter:
Relevance Feedback in Image Retrieval System: A Survey Tao Huang Lin Luo Chengcui Zhang.
Max-Confidence Boosting With Uncertainty for Visual tracking WEN GUO, LIANGLIANG CAO, TONY X. HAN, SHUICHENG YAN AND CHANGSHENG XU IEEE TRANSACTIONS ON.
Adaboost (Adaptive boosting) Jo Yeong-Jun Schapire, Robert E., and Yoram Singer. "Improved boosting algorithms using confidence- rated predictions."
University Of Seoul Ubiquitous Sensor Network Lab Query Dependent Pseudo-Relevance Feedback based on Wikipedia Dept. of Electrical and Computer Engineering, USN Lab G
Visual Information Retrieval
Large-Scale Content-Based Audio Retrieval from Text Queries
Image Segmentation Techniques
Retrieval Performance Evaluation - Measures
Presentation transcript:

PSEUDO-RELEVANCE FEEDBACK FOR MULTIMEDIA RETRIEVAL Seo Seok Jun

Abstract
Video information retrieval
◦ Finding information relevant to a query
Approach
◦ Pseudo-relevance feedback (PRF)
◦ Negative PRF

Questions
How does this paper approach content-based video retrieval?
What is the advantage of negative PRF?
What does this paper do to remove extreme outliers?

Introduction
Content-based access to video information
CBVR
◦ Allows users to query and retrieve based on audio and video content
◦ Limitation: captures only fairly low-level physical features
  - Color, texture, shape, …
  - Difficult to determine similarity metrics
  - Different query scenarios -> different similarity metrics
    Animals -> by shape
    Sky, water -> by color
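As a rough illustration of such low-level features (not the paper's actual feature extractor: the bin count, the joint RGB quantization, and the histogram-intersection similarity are all assumptions), a minimal Python sketch:

```python
import numpy as np

def color_histogram(frame, bins=8):
    """Quantize an RGB key frame into a joint color histogram.

    frame: H x W x 3 uint8 array (one key frame of a video shot).
    Returns a normalized vector of length bins**3.
    """
    quantized = (frame.astype(np.int32) * bins) // 256      # per-channel level 0..bins-1
    idx = (quantized[..., 0] * bins + quantized[..., 1]) * bins + quantized[..., 2]
    hist = np.bincount(idx.ravel(), minlength=bins ** 3).astype(float)
    return hist / hist.sum()

def histogram_intersection(h1, h2):
    """A fixed, query-independent similarity between two color histograms."""
    return float(np.minimum(h1, h2).sum())
```

A metric like this is the same for every query, which is exactly the limitation the following slides address.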

Introduction
◦ Making the similarity metric adaptive
Adapting the similarity metric
◦ Automatically discover the discriminating feature subspace
◦ How?
  - Cast retrieval as a classification problem
  - Margin-based classifiers
    SVMs, AdaBoost
    High performance
    Learn the maximal-margin hyperplane
  - A user's query provides only a small amount of positive data, with no explicit negative data at all

Introduction
◦ Thus, to use such classifiers, more training data is needed
  - Negative examples
  - Random sampling
    Since the number of positive items in a collection is very small
    Risk: positive examples might be included among the negatives
  - In standard relevance feedback
    Ask the user to label examples
    Tedious!
    Automatic retrieval is essential!

Introduction
◦ Automatic (pseudo-)relevance feedback
  - Based on an initial ranking that is not tailored to the specific query
  - Negative feedback -> sample the bottom-ranked examples
    Ex) car -> the bottom-ranked items differ from the query images in "shape"
  - Feed back the negative data and re-weight
  - Refine the discriminating feature subspace
  - The learned metric should do better than a universal similarity metric (used for every query)
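A hedged sketch of this bottom-ranked sampling step (the function name and the sample size of 50 are illustrative assumptions, not values from the paper):

```python
import numpy as np

def sample_pseudo_negatives(base_scores, n_neg=50):
    """Pick the lowest-scoring shots under the base similarity metric as
    pseudo-negative feedback, without any user labelling.

    base_scores: array of base-metric scores, one per shot in the collection.
    Returns the indices of the n_neg bottom-ranked shots.
    """
    order = np.argsort(base_scores)   # ascending: worst matches first
    return order[:n_neg]
```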

Introduction
Learning process
◦ Purpose
  - Discover a better similarity metric
  - Find the most discriminating subspace between positive and negative examples
◦ Cannot produce a fully accurate classification
  - The training data is too small
◦ The estimated negative distribution is not reliable!
◦ Risk: feedback from an incorrect estimate
◦ Therefore, combine with the generic similarity metric

Related Work
Briefly discusses some features of the complete system
◦ The Informedia Digital Video Library
◦ Relevance and Pseudo-Relevance Feedback

Pseudo-Relevance Feedback
Similar to relevance feedback
◦ Both originated in document retrieval
◦ But without any user intervention
◦ Few studies in multimedia retrieval so far
  - Can no longer assume that the top-ranked results are always relevant
  - Because of the relatively poor performance of visual retrieval

Pseudo-Relevance Feedback
Positive-example-based learning
◦ Partially supervised learning
◦ Begins with a small number of positive examples
◦ No negative examples
◦ Goal: associate every example in the collection with one of the given categories
  - Our goal?
  - Producing a ranked list of the examples

Pseudo-Relevance Feedback
Semi-supervised learning
◦ Two classifiers
◦ A training set of labeled data
◦ A working set of unlabeled data
Transductive learning
◦ A paradigm for exploiting the information in unlabeled data
◦ Successful in image retrieval
◦ But the computation is too expensive
  - Multimedia -> large collections

Pseudo-Relevance Feedback
Query: text + audio + image/video
Retrieving a set of relevant video shots
◦ A permutation of the video shots
◦ Sorted by their similarity
  - The difference between two video segments defines the similarity metric
◦ Video features
  - Multiple perspectives
  - Speech transcript, audio, camera motion, video frames

Pseudo-Relevance Feedback
Retrieval as a classification problem
◦ The data collection can be separated into positives/negatives
◦ Mean average precision (MAP)
  - Precision and recall are the common measures
  - But they do not take rank into consideration
  - MAP: area under an ideal recall/precision curve
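For reference, a short sketch of the standard (M)AP computation over ranked lists; this is the textbook definition, not code from the paper:

```python
import numpy as np

def average_precision(ranked_relevance, n_relevant=None):
    """Average precision for one query.

    ranked_relevance: 0/1 relevance flags for the retrieved shots, in rank order.
    n_relevant: total number of relevant shots in the collection
                (defaults to the number of relevant shots that were retrieved).
    """
    hits, precision_sum = 0, 0.0
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank   # precision at each relevant rank
    n_relevant = n_relevant or hits
    return precision_sum / n_relevant if n_relevant else 0.0

def mean_average_precision(per_query_relevance):
    """MAP: mean of the per-query average precisions."""
    return float(np.mean([average_precision(r) for r in per_query_relevance]))

# e.g. mean_average_precision([[1, 0, 1, 0], [0, 1, 1]]) -> approximately 0.71
```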

Pseudo-Relevance Feedback
PRF
◦ The user's judgment is replaced by the output of a base similarity metric
◦ f_b: base similarity metric
◦ p: sampling strategy
◦ f_l: learning algorithm
◦ g: combination strategy

Pseudo-Relevance Feedback

Algorithm Details

Pseudo-Relevance Feedback
Positive examples
◦ The query examples
Negative examples
◦ The strongest negative examples (bottom-ranked under the base metric)
Feedback is performed only once
◦ For computational reasons
The training data are fed back automatically, based on the generic similarity metric
◦ To learn an adaptive similarity metric
◦ Generalizes the discriminating subspace across different queries
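Putting the slide's choices together, one plausible reading of a single PRF round in Python (positives = query examples, pseudo-negatives = bottom-ranked shots, margin classifier = an SVM via scikit-learn); the RBF kernel, C value, and sample size are assumptions, not the paper's exact configuration:

```python
import numpy as np
from sklearn.svm import SVC

def prf_round(query_feats, collection_feats, base_similarity, n_neg=50):
    """One round of pseudo-relevance feedback.

    query_feats:      (n_q, d) features of the query examples (positives)
    collection_feats: (n, d) features of all video shots
    base_similarity:  callable f_b(query_feats, collection_feats) -> (n,) scores
    Returns the learned PRF score for every shot in the collection.
    """
    base_scores = base_similarity(query_feats, collection_feats)        # f_b
    neg_idx = np.argsort(base_scores)[:n_neg]                           # p: bottom-ranked shots
    X = np.vstack([query_feats, collection_feats[neg_idx]])
    y = np.concatenate([np.ones(len(query_feats)), -np.ones(n_neg)])
    clf = SVC(kernel="rbf", C=1.0).fit(X, y)                            # f_l: margin classifier
    return clf.decision_function(collection_feats)                      # adapted similarity
```

The scores returned here are then merged with the base metric, as discussed on the "Downside" slide below.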

Pseudo-Relevance Feedback
Why does it work well?
◦ The good generalization ability of margin-based learning algorithms
An isotropic data distribution is an invalid assumption
◦ The discriminating directions vary with different queries and topics
  - Sky -> color
  - Car -> shape
◦ In this case, PRF provides a better similarity metric than a generic one

Pseudo-Relevance Feedback
Two test cases
◦ Positive data
  - Along the edge of the data collection
  - At the center of the data collection
◦ In both cases
  - PRF is superior
  - The base similarity metric is a generic metric
  - It cannot be adapted across queries

Pseudo-Relevance Feedback

The PRF metric can be adapted based on the global data distribution and the training data
◦ By feeding back the negative examples
◦ Near-optimal decision boundary
Higher scores are assigned to examples
◦ Farther away from the negative data
◦ Good when the positive data are near the margin
  - Common in high-dimensional spaces

Pseudo-Relevance Feedback
Downside
◦ Some negative outliers are assigned a higher score than any positive example -> more false alarms
◦ Solution
  - Combine the base metric and the PRF metric
  - Smooths out most of the outliers
  - Just a simple 1:1 linear combination
  - A reasonable trade-off between local classification behavior and global discriminating ability
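A minimal sketch of that combination step, assuming the two score lists are first normalized to a common range (the min-max normalization is my assumption; the slide only specifies a simple 1:1 linear combination):

```python
import numpy as np

def combine_scores(base_scores, prf_scores, weight=0.5):
    """g: merge the base metric and the learned PRF metric.

    With weight=0.5 this is the 1:1 linear combination from the slide.
    A negative outlier that fools the PRF classifier is damped by its
    (low) base-metric score, which smooths out most false alarms.
    """
    def minmax(s):
        s = np.asarray(s, dtype=float)
        span = s.max() - s.min()
        return (s - s.min()) / span if span > 0 else np.zeros_like(s)

    return weight * minmax(base_scores) + (1 - weight) * minmax(prf_scores)

# Shots are then re-ranked by the combined score:
# final_ranking = np.argsort(-combine_scores(base_scores, prf_scores))
```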

Experiment
Video: TREC Video Retrieval Track (NIST)
◦ 40 hours of MPEG-1 video
Audio: split off from the video
◦ Down-sampled to 16 kHz, 16-bit samples
Speech recognition system
◦ Broadcast news transcripts
Image processing side
◦ Low-level image features: color and texture
◦ Queries given as XML

Experiment

Results

Results

Results

Results

Results

Conclusion
Retrieval cast as a classification task
Applies machine-learning theory to video retrieval
SVMs learn to weight the discriminating features
Negative PRF
◦ Separates the means of the distributions of the negative and positive examples
Smoothing via combination with the base metric