1
PicHunter: A Bayesian Image Retrieval System
Ingemar Cox (1,2,3,4) T. Conway (3) Joumana Ghosn (2,3) Matt Miller (1,2,3,4) Thomas Minka (3,4) Steve Omohundro (1) Thomas Papathomas (2,3) Peter N. Yianilos (1,2,3,4)
2
Project overview
Target Testing and the PicHunter Bayesian Multimedia Retrieval System, I. J. Cox, M. L. Miller, S. M. Omohundro, P. N. Yianilos, Proceedings of the Forum on Research & Technology Advances in Digital Libraries, pp. 66-75, 1996.
PicHunter: Bayesian Relevance Feedback for Image Retrieval, I. J. Cox, M. L. Miller, S. M. Omohundro, P. N. Yianilos, 13th International Conference on Pattern Recognition, Vol. III, Track C, pp. , August 1996.
Introduces PicHunter, the Bayesian framework, and describes a working system including measured user performance.
Hidden Annotation in Content Based Image Retrieval, I. J. Cox, J. Ghosn, M. L. Miller, T. V. Papathomas, P. N. Yianilos, IEEE Workshop on Content-Based Access of Image & Video Libraries, pp. 76-81, June 1997.
Introduces the idea of "hidden annotation" and reports results demonstrating that it improves performance.
3
Project overview
An Optimized Interaction Strategy for Bayesian Relevance Feedback, I. J. Cox, M. L. Miller, T. Minka, P. N. Yianilos, IEEE Conference on Computer Vision and Pattern Recognition (CVPR '98), Santa Barbara, CA, pp. , 1998.
Introduces an improved stochastic image display strategy allowing the system to "ask better questions."
Psychophysical Studies of the Performance of an Image Database Retrieval System, T. V. Papathomas, T. Conway, I. J. Cox, J. Ghosn, M. L. Miller, T. Minka, P. N. Yianilos, Proceedings of Human Vision and Electronic Imaging III, San Jose, CA, Vol. 3299, pp. , January 1998.
Describes psychophysical studies of the system in a controlled environment.
4
Project summary
The Bayesian Image Retrieval System, PicHunter: Theory, Implementation, and Psychophysical Experiments, I. J. Cox, M. L. Miller, T. P. Minka, T. V. Papathomas, P. N. Yianilos, IEEE Transactions on Image Processing, Vol. 9, No. 1, pp. 20-37, 2000.
5
Introduction
A search consists of an initial query followed by repeated relevance feedback.
To date, the emphasis has been on the query phase (better representations), while relevance feedback is crude or non-existent.
There is a lack of quantitative measures for comparing the performance of search algorithms.
6
The main ideas
Bayesian relevance feedback: learn from human interactions; model the user's actions, not his/her query.
Quantifiable testing: target testing and baseline testing.
Optimize the image display.
7
User interface
8
Target testing
The user is shown an image from the database; his/her task is to use the system to find it. We measure the number of iterations required, which is easily compared against a simple linear search. This is not a perfect model for all intended uses, but it is something we can measure and use for comparisons.
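As a rough illustration of the protocol, here is a minimal target-testing harness in Python. The `search_system` interface and the simulated user are assumptions for illustration; in the actual experiments the selections are made by human subjects.

```python
# Minimal sketch of a target-testing harness (illustrative, not PicHunter code).
# `search_system` is a hypothetical object with two methods:
#   display()          -> list of images currently shown
#   feedback(selected) -> incorporate the user's selections
# The simulated "user" picks the displayed image closest to the target under a
# feature distance d(); the real experiments use human subjects instead.

def target_test(search_system, target, d, max_iters=1000):
    """Return the number of iterations until the target appears in the display."""
    for t in range(1, max_iters + 1):
        display = search_system.display()
        if target in display:
            return t
        selected = min(display, key=lambda x: d(x, target))
        search_system.feedback([selected])
    return max_iters  # did not find the target within the budget
```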
9
Features: pictorial features
Originally 18 global features:
Percentage of pixels that are one of 11 colors
Mean color saturation
Median intensity of the image
Image width and height
A measure of global contrast
Two measures of the number of edgels, computed at different thresholds
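For concreteness, here is a sketch of how a few of these global features could be computed with NumPy. The exact definitions, color palette, and edge thresholds used by PicHunter are given in the papers; the values below are illustrative assumptions.

```python
# Illustrative sketch of a few of the 18 global pictorial features.
# Thresholds and feature definitions here are assumptions for illustration.
import numpy as np

def global_features(rgb):
    """rgb: H x W x 3 array with values in [0, 1]."""
    intensity = rgb.mean(axis=2)
    features = {
        "height": rgb.shape[0],
        "width": rgb.shape[1],
        "median_intensity": float(np.median(intensity)),
        # Simple stand-in for "a measure of global contrast".
        "contrast": float(intensity.std()),
        # Mean saturation, using the HSV-style definition (max - min) / max.
        "mean_saturation": float(
            np.mean((rgb.max(axis=2) - rgb.min(axis=2)) / (rgb.max(axis=2) + 1e-8))
        ),
    }
    # Edgel counts at two gradient-magnitude thresholds (thresholds assumed).
    gy, gx = np.gradient(intensity)
    grad = np.hypot(gx, gy)
    features["edgels_low"] = int((grad > 0.1).sum())
    features["edgels_high"] = int((grad > 0.3).sum())
    return features
```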
10
Features: hidden annotation
Provides semantic labels: 147 attributes per image.
Each image is annotated with a Boolean vector; annotations are compared with the normalized Hamming distance.
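A minimal sketch of this annotation distance, assuming the 147 attributes are stored as a 0/1 vector per image:

```python
# Normalized Hamming distance between two Boolean annotation vectors
# (e.g., the 147 semantic attributes), assuming NumPy arrays of 0/1 values.
import numpy as np

def normalized_hamming(a, b):
    """Fraction of attribute positions on which the two annotations disagree."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    return float(np.mean(a != b))

# Example: two images disagreeing on 2 of 4 attributes -> distance 0.5
print(normalized_hamming([1, 0, 1, 1], [1, 1, 0, 1]))
```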
11
Bayesian relevance feedback
A_t denotes the current user action, D_t the current display, and H_t the session history up to and including the current images displayed. Thus H_t = {D_1, A_1, D_2, A_2, …, D_t, A_t}. T denotes the target image.
12
Bayesian relevance feedback
We build a predictive user model P(A | T, H). Then, from Bayes' rule, the posterior over targets is updated after each user action.
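Written out in the notation above, the Bayes-rule update of the target posterior after user action A_t takes the following form (a sketch consistent with the model described here):

```latex
P(T = T_i \mid H_t)
  = \frac{P(A_t \mid T = T_i,\, D_t,\, H_{t-1})\; P(T = T_i \mid H_{t-1})}
         {\sum_{j} P(A_t \mid T = T_j,\, D_t,\, H_{t-1})\; P(T = T_j \mid H_{t-1})}
```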
13
Bayesian relevance feedback
Assume the user model is time-invariant and the same for all users, i.e. P(A_t | T, D_t, H_{t-1}) reduces to a single model P(A | T, D) shared across iterations and users.
14
Absolute-distance model
Only one image, X_q, in the display D_t can be selected at each iteration. The probability of each candidate T_i increases or decreases depending on the distance d(T_i, X_q):
P(T = T_i) ← P(T = T_i) · G(d(T_i, X_q)), followed by renormalization.
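A minimal sketch of this update, assuming for illustration that G(d) = exp(-d/σ), so that candidates close to the selected image gain probability mass:

```python
# Sketch of the absolute-distance update. The specific form of G is an
# assumption; the slide only requires that closer candidates gain probability.
import numpy as np

def absolute_distance_update(p, dists, sigma=1.0):
    """
    p:     current target probabilities P(T = T_i), one per database image
    dists: distances d(T_i, X_q) to the single selected image X_q
    """
    p = p * np.exp(-dists / sigma)   # re-weight: nearby candidates increase
    return p / p.sum()               # renormalize to a proper distribution
```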
15
Relative-distance model
Let Q = {X_q1, X_q2, …, X_qC} denote the set of selected images in display D_t and N = {X_n1, X_n2, …, X_nL} the set of unselected images. For each candidate T_i we compute the distance difference d(T_i, X_qk) − d(T_i, X_nm) for all pairs {X_qk, X_nm}. The probabilities of candidates that are closer to the selected images X_qk are increased, while those closer to the unselected images X_nm are decreased.
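A sketch of one way to implement this update; using a sigmoid of the distance difference is an illustrative choice, and the exact likelihood used by PicHunter may differ.

```python
# Sketch of a relative-distance update. Each (selected, unselected) pair
# contributes a factor that is large when the candidate is closer to the
# selected image than to the unselected one.
import numpy as np

def relative_distance_update(p, d_sel, d_unsel, temperature=1.0):
    """
    p:       current target probabilities, shape (num_images,)
    d_sel:   distances to selected images,   shape (num_images, C)
    d_unsel: distances to unselected images, shape (num_images, L)
    """
    for k in range(d_sel.shape[1]):
        for m in range(d_unsel.shape[1]):
            diff = d_sel[:, k] - d_unsel[:, m]          # d(Ti, Xqk) - d(Ti, Xnm)
            p = p / (1.0 + np.exp(diff / temperature))  # closer to Xqk -> boost
    return p / p.sum()
```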
16
Display updating algorithm
Most probable display
Most informative display (max. mutual information)
Sampling
Query by example
17
Experimental setup
Database of 4522 images, 1500 of them annotated.
Conditions compared (M/N, A/R, P/S/B):
Memory / no memory (relevance-feedback history)
Absolute / relative distance
Pictorial / semantic / both features
18
Experimental notation
MRB – memory, relative distance, pictorial and semantic features
MAB – memory, absolute distance, pictorial and semantic features
NRB – no memory, …
NAB
MRS – memory, relative, semantic features
MRP – memory, relative, pictorial features
19
Experimental results: memory, metric, and features
                MRB    MAB    NRB    NAB    MRS    MRP
6 naïve users   25.4   35.8   45.5   33.2   15.6   35.1
2 exp. users    13.1   31.6   28.4   22.2    8.8   18.9
20
Baseline testing: similarity testing
How many images are examined before the user sees a similar image? Compare to the number needed when randomly searching the database.
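As a point of reference, if k of the N database images would be judged similar to the target, a random search without replacement examines on average

```latex
\mathbb{E}[\text{images examined}] = \frac{N + 1}{k + 1}
```

images before the first similar one appears; this is a standard expectation, used here only to make the random-search baseline concrete.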
21
Target versus category search
              MRB/T   MRS/T   MRB/C   RAND/C
Naïve users   25.4    15.6    12.2    19.7
Exp. users    13.1     8.8     8.9    20.1
22
Improved pictorial features
HSV 64-element histogram
HSV 256-element autocorrelogram
RGB 128-element color coherence vector
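A sketch of the 64-element HSV histogram; the 4×4×4 quantization of hue, saturation, and value is an assumed binning for illustration, and the paper describes the quantization actually used.

```python
# Sketch of a 64-element HSV color histogram feature.
import numpy as np
from matplotlib.colors import rgb_to_hsv

def hsv_histogram(rgb, bins=(4, 4, 4)):
    """rgb: H x W x 3 array in [0, 1]; returns a normalized 64-element vector."""
    hsv = rgb_to_hsv(rgb).reshape(-1, 3)
    hist, _ = np.histogramdd(hsv, bins=bins, range=[(0, 1)] * 3)
    hist = hist.flatten()
    return hist / hist.sum()
```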
23
Experimental results (User learning)
                          Before explanation   After explanation
Pictorial only                   17.1                13.2
Pictorial and semantic           11.7                 9.5
24
Display updating algorithms
Most probable display
Most informative display (max. mutual information)
Sampling
Query by example
25
Most probable display: performs quite well.
However, this greedy strategy suffers from "over-learning": PicHunter "gets stuck" in a local maximum, showing display after display of "lions", say.
26
Most-Informative Display
Try to minimize the total number of iterations required in a search.
Try to elicit as much information from the user as possible.
Information theory suggests entropy as an estimate of the number of questions one needs to ask to resolve the ambiguity.
27
Most-informative display
Consider the ideal (deterministic) case, in which the display consists of two images and the user reliably selects the one closer to the target: the best display then splits the remaining probability mass in half, so the search behaves like binary search.
28
Most-informative display
Generalization to the non-deterministic case
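In the general (non-deterministic) case, the next display can be chosen to minimize the expected entropy of the target posterior after the user's response, which is equivalent to maximizing the mutual information between the user's action and the target. A sketch of the criterion in the notation above:

```latex
D_t^{*} = \arg\min_{D} \sum_{a} P(A_t = a \mid D,\, H_{t-1})\;
          H\!\left( P(T \mid A_t = a,\, D,\, H_{t-1}) \right)
```

where H(·) denotes Shannon entropy.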
29
Most informative display
Performing this minimization exactly is non-trivial, so we use Monte Carlo simulation: draw random displays {X_1, X_2, …, X_ND} from the distribution P(T = T_i), evaluate the criterion on each, and keep the best. Sampling is the special case of the most-informative method in which only one Monte Carlo sample is drawn.
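A sketch of this Monte Carlo selection step. Here `expected_entropy` is a hypothetical helper standing in for the user-model computation, and `nd` (images per display) and `n_samples` (candidate displays) are illustrative parameters.

```python
# Sketch of Monte Carlo display selection: sample candidate displays in
# proportion to the current posterior P(T = T_i), estimate the expected
# post-feedback entropy of each, and keep the best candidate.
import numpy as np

def choose_display(p, expected_entropy, nd=4, n_samples=100, rng=np.random):
    """p: posterior over database images; returns indices of the chosen display."""
    best_display, best_score = None, np.inf
    for _ in range(n_samples):
        display = rng.choice(len(p), size=nd, replace=False, p=p)
        score = expected_entropy(p, display)  # expected entropy after feedback
        if score < best_score:
            best_display, best_score = display, score
    return best_display

# With n_samples = 1 this reduces to the "sampling" display strategy.
```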
30
Simulation results: deterministic
31
Simulation results: deterministic
32
Simulation results: non-deterministic
33
Simulation results: non-deterministic
34
Experimental results: Display strategies
Display strategies compared: EB’ EP’ ES RB’ MRB’ AB’ NAB’ RS MRS RP MRP
Naïve users: 11.3 25.8 16.0 12.0 20.4 11.8 29.6
Exp. users: 6.8 10.2 8.3 8.65 11.5
35
Future directions
More efficient algorithms
Automatic detection of hidden features
Explore slightly richer user interfaces
Explore increased use of online learning