Coached Active Learning for Interactive Video Search Xiao-Yong Wei, Zhen-Qun Yang Machine Intelligence Laboratory College of Computer Science Sichuan University,

Presentation transcript:

Coached Active Learning for Interactive Video Search Xiao-Yong Wei, Zhen-Qun Yang Machine Intelligence Laboratory College of Computer Science Sichuan University, China


Search “car on road” with Google

Similar to the Automatic Search task in TRECVID

Relevance Feedback Interactive Search
– Query Modeling, i.e., figuring out what the searcher wants
– Querying Strategy, i.e., how to keep the searcher patient while labeling

Query Modeling is indeed a task of Space Exploration

Search & Space Exploration

Feature Space with Unlabeled Instances

Space Exploration for Query Modeling
– How to explore effectively?
– Querying Strategy: which instances to query next (i.e., present to the searcher for labeling)?
– Figure label: Query Distribution

The Goal of Query Modeling in Terms of Space Exploration
A reasonable querying strategy should:
– Find the Query Distribution (i.e., the distribution of the relevant instances) as soon as possible
– Satisfy searchers with relevant samples as soon as possible

Active Learning with Uncertainty Sampling – A Popular and Effective Strategy

Active Learning with Uncertainty Sampling: The 1st Round – Training Classifier

Active Learning with Uncertainty Sampling: The 1st Round – Query the Searcher

Active Learning with Uncertainty Sampling: The 2nd Round – Training Classifier

Active Learning with Uncertainty Sampling: The 2nd Round – Query the Searcher

Active Learning with Uncertainty Sampling: The 3rd Round

Active Learning with Uncertainty Sampling: The 4th Round

Active Learning with Uncertainty Sampling: The 5th Round

Active Learning with Uncertainty Sampling
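The selection step in each round above can be sketched as follows. This is a minimal illustration, not the presenters' implementation: it assumes the classifier outputs a relevance probability per instance and that "uncertain" means closest to 0.5. The function name `uncertainty_sample` and the example values are hypothetical.

```python
def uncertainty_sample(probabilities, labeled, k):
    """Return indices of the k unlabeled instances whose predicted
    relevance probability is closest to 0.5 (i.e., the most uncertain)."""
    candidates = [i for i in range(len(probabilities)) if i not in labeled]
    # Sort by distance from the assumed decision boundary p = 0.5;
    # ties are broken by index so the result is deterministic.
    candidates.sort(key=lambda i: (abs(probabilities[i] - 0.5), i))
    return candidates[:k]


# Hypothetical classifier outputs for six video shots; shot 1 is already labeled.
probs = [0.95, 0.52, 0.10, 0.45, 0.70, 0.49]
picked = uncertainty_sample(probs, labeled={1}, k=2)
print(picked)
```

Note that the shot with probability 0.95, the one most likely to be relevant, is never shown to the searcher under this rule.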

Active Learning with Uncertainty Sampling may not be a “kindred soul” with Video Search
– The spirit of Active Learning: to reduce labeling effort
– The goal of Query Modeling: to harvest relevant instances
– Also observed by A. G. Hauptmann et al. [MM’06]

The Dilemma of Exploration and Exploitation
– Exploration: explore more of the unknown area to learn more about the Query Distribution and boost future gains
– Exploitation: harvest more relevant instances to obtain immediate rewards

Our Idea – The Query Distribution, even Unknown, is Predictable

The Idea of Coached Active Learning
– Estimate the Query Distribution (the pdf of the underlying Query Distribution) after each round, to avoid the risk of learning on a completely unknown distribution
– Query users with instances picked from both the dense and the uncertain areas of the distribution
– Balance the proportion of the two types of instances

The Idea of Coached Active Learning
– Modeling the Query Distribution
– Balancing the Exploration and Exploitation

Modeling the Query Distribution

L+: an incomplete sampling of the Query Distribution

Modeling the Query Distribution
An indirect way of estimating the pdf of the Query Distribution:
– Find training examples which are statistically from the same distribution as L+
– Use the pdf(s) of those training examples to estimate that of the query

Modeling the Query Distribution
A practical implementation:
– Training: organize training examples into nonexclusive semantic groups
– Testing: check L+ against each group to see whether they are statistically from the same distribution, using the two-sample Hotelling T-Square Test
– Testing: estimate the Query Distribution with a GMM
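The group test above can be sketched with the textbook two-sample Hotelling T-squared statistic (pooled covariance). This is a pure-Python illustration under assumptions: the feature vectors, sample sizes, and function names are hypothetical, and the slides do not state the significance level or features used.

```python
def mean_vector(sample):
    n, p = len(sample), len(sample[0])
    return [sum(row[j] for row in sample) / n for j in range(p)]


def scatter_matrix(sample, mean):
    """Sum of outer products of deviations from the mean."""
    p = len(mean)
    s = [[0.0] * p for _ in range(p)]
    for row in sample:
        d = [row[j] - mean[j] for j in range(p)]
        for a in range(p):
            for b in range(p):
                s[a][b] += d[a] * d[b]
    return s


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("pooled covariance is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def hotelling_t2(sample_a, sample_b):
    """Return (T^2, F) for two samples of p-dimensional points.
    Under the null hypothesis, F follows F(p, n1 + n2 - p - 1)."""
    n1, n2, p = len(sample_a), len(sample_b), len(sample_a[0])
    m1, m2 = mean_vector(sample_a), mean_vector(sample_b)
    s1, s2 = scatter_matrix(sample_a, m1), scatter_matrix(sample_b, m2)
    pooled = [[(s1[a][b] + s2[a][b]) / (n1 + n2 - 2) for b in range(p)]
              for a in range(p)]
    diff = [m1[j] - m2[j] for j in range(p)]
    w = solve(pooled, diff)  # w = pooled^{-1} @ diff
    t2 = (n1 * n2) / (n1 + n2) * sum(diff[j] * w[j] for j in range(p))
    f_stat = (n1 + n2 - p - 1) / (p * (n1 + n2 - 2)) * t2
    return t2, f_stat


# Hypothetical 2-D features: L+ and two candidate semantic groups.
l_plus = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0)]
near_group = [(1.5, 2.0), (2.0, 1.5), (3.0, 3.5), (3.5, 3.0)]
far_group = [(11.0, 12.0), (12.0, 11.0), (13.0, 14.0), (14.0, 13.0)]
t2_near, f_near = hotelling_t2(l_plus, near_group)
t2_far, f_far = hotelling_t2(l_plus, far_group)
print(t2_near, t2_far)
```

A group would pass the test when its F value stays below the critical value of the F distribution at the chosen significance level; a statistics library can supply that value or the corresponding p-value.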

Modeling the Query Distribution
Rationale of the implementation:
– Samples from the same distribution may carry similar semantics
– Semantic groups which passed the test then reflect the searcher’s query intention

Modeling the Query Distribution
– Distribution of the semantic group “road”
– Distribution of the semantic group “car”
– Distribution of the query “car on road”
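One way to read the example above is that the query pdf is a mixture of the pdfs of the groups that passed the test. The sketch below assumes one diagonal-covariance Gaussian per group and equal weights; the slides do not give the component structure or the weights, so all numbers and names here are hypothetical.

```python
import math


def diag_gaussian_pdf(x, mean, var):
    """Density of a Gaussian with diagonal covariance, evaluated at x."""
    density = 1.0
    for xi, mi, vi in zip(x, mean, var):
        density *= math.exp(-0.5 * (xi - mi) ** 2 / vi) / math.sqrt(2.0 * math.pi * vi)
    return density


def mixture_pdf(x, components):
    """components: list of (weight, mean, var); weights are assumed to sum to 1."""
    return sum(w * diag_gaussian_pdf(x, m, v) for w, m, v in components)


# Hypothetical 2-D models of the two semantic groups, equally weighted.
road = (0.5, (0.0, 0.0), (1.0, 1.0))
car = (0.5, (3.0, 0.0), (1.0, 1.0))
car_on_road = [road, car]

dense = mixture_pdf((0.0, 0.0), car_on_road)    # at the centre of "road"
sparse = mixture_pdf((10.0, 10.0), car_on_road)  # far from both groups
print(dense, sparse)
```

Instances in dense regions of the estimated query pdf are natural candidates for harvesting, which is how the estimate can guide the querying strategy.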

Balancing the Exploration and Exploitation

The priority of selecting an instance x to query next (i.e., present to the searcher for labeling) combines three terms:
– Exploitative Priority
– Explorative Priority
– Balancing Factor
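The exact formula on this slide did not survive in the transcript. A common way to combine the three terms is a convex combination, Priority(x) = λ·Harvest(x) + (1 − λ)·Explore(x); the sketch below assumes that form. The scores and names are hypothetical.

```python
def priority(harvest, explore, lam):
    """Assumed form: lam * Harvest(x) + (1 - lam) * Explore(x), with lam in [0, 1]."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * harvest + (1.0 - lam) * explore


def rank_candidates(scores, lam):
    """scores: dict mapping instance id -> (harvest, explore).
    Returns ids sorted from highest to lowest priority."""
    return sorted(scores, key=lambda i: priority(*scores[i], lam), reverse=True)


# Hypothetical scores for three unlabeled shots.
scores = {"a": (0.9, 0.1), "b": (0.2, 0.8), "c": (0.5, 0.5)}
exploit_order = rank_candidates(scores, lam=0.8)
explore_order = rank_candidates(scores, lam=0.2)
print(exploit_order, explore_order)
```

With a large λ the ranking favors instances likely to be relevant; with a small λ it favors instances that are informative about the distribution.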

Harvest(x): Exploitative Priority
– How likely it is that exploitation will be boosted when x is selected to query next
– Figure label: Current Decision Boundary

Explore(x): Explorative Priority
– How likely it is that exploration will be improved when x is selected to query next
– Formula terms: entropy of the prior distribution, entropy of the posterior distribution
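The transcript keeps only the two entropy labels, not how they are combined. One plausible reading, assumed here, is that Explore(x) rewards the reduction in entropy from the prior to the posterior distribution. The sketch shows just those two terms on discrete distributions; the probabilities are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def entropy_reduction(prior, posterior):
    """Assumed exploration signal: entropy(prior) - entropy(posterior)."""
    return entropy(prior) - entropy(posterior)


# Hypothetical relevant / irrelevant beliefs before and after labeling x.
prior = [0.5, 0.5]
posterior = [0.9, 0.1]
gain = entropy_reduction(prior, posterior)
print(gain)
```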

λ: Balancing Factor
– Updated by investigating the Harvest History, i.e., the numbers of relevant instances found in previous rounds
– Formula term: the expected harvest
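The update rule itself is not recoverable from the transcript. As an illustration only, the sketch below assumes λ is the ratio of the most recent harvest to the expected harvest, clipped to [0, 1], so that a falling harvest shifts emphasis toward exploration. The rule, the `initial` default, and the numbers are all assumptions.

```python
def update_lambda(harvest_history, expected_harvest, initial=1.0):
    """Assumed rule: latest harvest / expected harvest, clipped to [0, 1].
    harvest_history holds the number of relevant instances found per round."""
    if expected_harvest <= 0:
        raise ValueError("expected_harvest must be positive")
    if not harvest_history:
        return initial  # no history yet: fall back to the assumed starting value
    ratio = harvest_history[-1] / expected_harvest
    return max(0.0, min(1.0, ratio))


# Hypothetical history: the harvest shrinks as relevant shots run out.
lam_early = update_lambda([12], expected_harvest=10)
lam_late = update_lambda([12, 9, 3], expected_harvest=10)
print(lam_early, lam_late)
```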

Experimental Results

Experiments: learning the pdf(s) of the semantic groups
– Large Scale Concept Ontology for Multimedia (LSCOM): 449 concepts labeled on the TRECVID 2005 development dataset
– Concept combinations are used to create groups
– Results: 23,064 groups and their pdfs (

Experiments: Comparison with Uncertainty Sampling (US)
– 12 volunteers (aged 19 to 26; 3 female, 9 male)
– TRECVID 2005 test dataset (45,765 video shots in English/Chinese/Arabic)
– 24 TRECVID queries and ground truth
– Mean Average Precision (MAP)
– Concept-based search for the initial round
– Interface
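For reference, the MAP metric listed above can be computed as below. This is the standard non-interpolated definition; any evaluation depth cut-off used in the actual experiments is not stated in the slides. The shot ids are hypothetical.

```python
def average_precision(ranked_ids, relevant_ids):
    """Non-interpolated average precision of one ranked result list."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank  # precision at this relevant item's rank
    return total / len(relevant)


def mean_average_precision(runs):
    """runs: list of (ranked_ids, relevant_ids) pairs, one per query."""
    return sum(average_precision(r, g) for r, g in runs) / len(runs)


# Two hypothetical queries with made-up shot ids.
runs = [
    (["s3", "s1", "s7", "s2"], {"s3", "s2"}),
    (["s5", "s9"], {"s9"}),
]
map_score = mean_average_precision(runs)
print(map_score)
```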

Experiments: Comparison with Uncertainty Sampling (US)
– Harvest the shots found by the initial search
– Harvest the shots in the densest area
– Drops a little to encourage exploration, then remains stable to harvest the newly found relevant shots
– Most relevant shots have been harvested; more and more emphasis on exploration
– Purely explorative US causes early give-up
– Figure label: “difficult” queries

Experiments Study of user patience on labeling

Experiments: Can GMM simulate the query distribution well?
Query: “the road with one or more cars”

Experiments: Comparison with the state of the art
– CAL (with virtual searchers) vs. the best results reported in TRECVID 06-09

Conclusions

Advantages of CAL
– Predictable Query Distribution
– Fast Convergence
– Balance between Exploitation and Exploration in a principled way
Toys and demos available at

Conclusions: Future work
– Replace Harvest() and Explore() with more advanced ones
– Try it on larger datasets or web videos

Advantages of CAL
– Predictable Query Distribution
– Fast Convergence
– Figure labels: Uncertainty Sampling; CAL, the 1st round

Thanks!