Lei Wu *, Steven C.H. Hoi *, Rong Jin #, Jianke Zhu, Nenghai Yu * Nanyang Technological University, University of Sci. & Tech. of China, # Michigan State.

1 Lei Wu *, Steven C.H. Hoi *, Rong Jin #, Jianke Zhu, Nenghai Yu * Nanyang Technological University, University of Sci. & Tech. of China, # Michigan State University, ETH Zurich

2 Annotation/tagging is essential to making images accessible to Web users. Billions of images on the Web lack proper annotations/tags. Automatic image annotation has been actively studied in the multimedia community.

3 Social media data on social websites carry rich tagging information provided by Web users. Can we resolve the challenge of automatic photo annotation by leveraging the huge and growing amount of rich social media data?

4 Annotation by Search from Social Images [Figure: social images with example tag sets: Sun, Bird, Sky, Blue, …; Bird, Fly, White, Cloud, …; Sun, Cloud, Hawk, Fly, …; Hawk, Bird, Sky, Eagle, …]

5 Annotation by Search: find similar images in the social image DB; annotate the image with the tags of high frequency. Research Challenges: visual feature representation; tag data mining; scalable search & indexing; distance/similarity measure (Distance Metric Learning).

6 Related Work on Automated Photo Tagging: Built a large collection of web images with ground-truth labels to support object recognition research (Russell et al. 2008); a fast search-based approach to image annotation using an efficient hashing technique (Wang et al. 2006); used visual and text modalities simultaneously in clustering images (Rege et al. 2008); efficient image search and scene matching techniques for exploring a large-scale web image repository (Torralba et al. 2008); a learning-based method for improving the efficiency of manual image annotation (Yan et al. 2008). These approaches adopt Hamming or Euclidean distance.

7 Distance Measure: Hamming distance; Euclidean distance; Mahalanobis distance. Distance Metric Learning: learning to optimize the metric M. Side Information (a.k.a. Pairwise Constraints): Similar pairs S(x1, x2): x1 and x2 belong to the same category. Dissimilar pairs D(x1, x2): x1 and x2 belong to different categories.
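
The three distance measures named on this slide can be illustrated with a short NumPy sketch. The function names and the toy vectors are illustrative assumptions, not taken from the slides; the only fact relied on is the standard definition of the Mahalanobis distance, d_M(x1, x2) = sqrt((x1 - x2)^T M (x1 - x2)), which reduces to the Euclidean distance when M is the identity.

```python
import numpy as np

def mahalanobis_distance(x1, x2, M):
    """d_M(x1, x2) = sqrt((x1 - x2)^T M (x1 - x2)) for a positive semidefinite M."""
    diff = np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float)
    # Clamp at zero to guard against tiny negative values from rounding.
    return float(np.sqrt(max(diff @ M @ diff, 0.0)))

def hamming_distance(b1, b2):
    """Number of positions at which two equal-length code vectors differ."""
    return int(np.count_nonzero(np.asarray(b1) != np.asarray(b2)))

# Toy feature vectors (hypothetical values).
x1 = np.array([1.0, 2.0])
x2 = np.array([4.0, 6.0])

# With M = I the Mahalanobis distance is the plain Euclidean distance.
d_euclid = mahalanobis_distance(x1, x2, np.eye(2))

# A learned metric would replace this hand-picked diagonal M, which
# stretches the first feature and shrinks the second.
M = np.diag([4.0, 0.25])
d_weighted = mahalanobis_distance(x1, x2, M)
```

Distance metric learning, in this framing, is the problem of choosing M from data rather than by hand.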

8 Related Work on Distance Metric Learning: Probabilistic Global Distance Metric Learning (PGDM) (Xing et al. 2002); Neighbourhood Components Analysis (NCA) (Goldberger et al. 2005); Relevance Component Analysis (RCA) (Bar-Hillel et al. 2005); Discriminative Component Analysis (DCA) (Hoi et al. 2006); Large Margin Nearest Neighbor (LMNN) (Weinberger et al. 2006); Regularized Distance Metric Learning (RDML) (Si et al. 2006); Information-Theoretic Metric Learning (ITML) (Davis et al. 2007). In all of these, clean side information is given explicitly.

9 Annotation by Search from Social Media: NO explicit pairwise side information is available, but rich information is available with social images. Ideas of our research: to discover implicit pairwise relationships between social images via a probabilistic approach; to learn effective distance metrics from uncertain side information that is discovered from social images implicitly.

10 Overview of Our Approaches Discovery of probabilistic side information A Graphical Model Approach Learning distance metrics from probabilistic side information A Probabilistic RCA Method Automated photo tagging by applying the optimized metric in visual similarity search


12 Problem Formulation. Latent Chunklets: i.e., the hidden topics. Assumption: both visual images and associated textual metadata are generated from the hidden topics. Calculation: multi-modal hidden topic analysis.

13 Text Modality Visual Modality Hidden Topic

14 Generation Process Inference Probabilistic Side Info., as Prior Prob. Matrix

15 Problem Definition and Notations. Probabilistic Side Info.: centers/means for the latent chunklets; membership probability. Given the estimation of latent chunklets P_0, how to formulate the DML problem to find the optimal metric M? Propose an extension of RCA with Prob. Side Info.
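
One way to read "probabilistic side information" is as a soft replacement for the similar/dissimilar pairs used by classical DML. The sketch below is an assumption made for illustration, not a formula from the slides: it treats each row of a membership matrix P0 (images by latent chunklets) as an independent distribution over chunklets and computes the probability that two images fall in the same chunklet. The name `same_chunklet_probability` and the toy values are hypothetical.

```python
import numpy as np

def same_chunklet_probability(P0):
    """S[i, j] = sum_k P0[i, k] * P0[j, k].

    Under the (assumed) independence of the two images' memberships,
    this is the probability that images i and j share a latent chunklet.
    """
    P0 = np.asarray(P0, dtype=float)
    return P0 @ P0.T

# Hypothetical membership probabilities for 3 images over 2 chunklets.
P0 = np.array([[0.9, 0.1],
               [0.8, 0.2],
               [0.1, 0.9]])
S = same_chunklet_probability(P0)
```

Entries of S near 1 play the role of soft "similar" pairs and entries near 0 the role of soft "dissimilar" pairs.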

16 The objective function of pRCA: minimize the sum of squared distances of examples from their chunklet centers (means), with a regularization term preventing the trivial solution. Corollary 1. When fixing the means of chunklets μ and the matrix of probability assignments P (assuming hard assignments of 0 and 1), the Probabilistic Relevance Component Analysis (pRCA) formulation reduces to regular RCA learning.
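
The formula itself did not survive in this transcript. The following is a hedged sketch of one objective that matches the verbal description on the slide, not the authors' exact formulation. The symbols x_i (feature vector of image i), p_{ik} (probability that image i belongs to chunklet k), K, n, and λ are notational assumptions, as is the log-determinant form of the regularizer; any term tying P to the prior P_0 is omitted because the transcript does not describe it.

```latex
\[
\min_{M \succeq 0,\; \mu,\; P}\;
\sum_{i=1}^{n} \sum_{k=1}^{K} p_{ik}\,
(x_i - \mu_k)^{\top} M \,(x_i - \mu_k)
\;-\; \lambda \log \det M
\]
```

Under this sketch, Corollary 1 is easy to see: with hard assignments p_{ik} ∈ {0, 1} and μ_k fixed, the first term becomes the sum of squared Mahalanobis distances of each example to the center of its own chunklet, which is the quantity regular RCA minimizes, while the second term rules out the trivial solution M = 0.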

17 Iterative algorithm Fixing P and μ to optimize M: Fixing M and μ to optimize P: Fixing P and M to find μ:
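
The update formulas for the three steps are missing from this transcript, so the sketch below only illustrates the alternating structure. It is not the authors' pRCA algorithm. The μ step (probability-weighted means) follows from minimizing a weighted sum of squared distances; the M step uses the RCA-style inverse of the weighted within-chunklet scatter; the P step (prior times a Gaussian-like likelihood, renormalized) is purely an assumption. All function names are hypothetical.

```python
import numpy as np

def weighted_within_scatter(X, P, mu):
    """(1/n) * sum_i sum_k P[i,k] (x_i - mu_k)(x_i - mu_k)^T."""
    n, d = X.shape
    C = np.zeros((d, d))
    for k in range(mu.shape[0]):
        diff = X - mu[k]                      # shape (n, d)
        C += (P[:, k, None] * diff).T @ diff  # weighted outer products
    return C / n

def alternating_sketch(X, P0, n_iter=10, eps=1e-6):
    """Alternate over mu, M, and P; requires n_iter >= 1."""
    n, d = X.shape
    P = P0.astype(float).copy()
    M = np.eye(d)
    for _ in range(n_iter):
        # Fix P and M, update mu: probability-weighted chunklet means.
        mu = (P.T @ X) / (P.sum(axis=0)[:, None] + eps)
        # Fix P and mu, update M: RCA-style inverse of the scatter matrix
        # (eps * I keeps the inverse well defined).
        C = weighted_within_scatter(X, P, mu)
        M = np.linalg.inv(C + eps * np.eye(d))
        # Fix M and mu, update P (ASSUMED rule): prior P0 times
        # exp(-0.5 * squared distance), normalized per image.
        diff = X[:, None, :] - mu[None, :, :]            # (n, K, d)
        d2 = np.einsum('nkd,de,nke->nk', diff, M, diff)  # squared distances
        logits = np.log(P0 + eps) - 0.5 * d2
        logits -= logits.max(axis=1, keepdims=True)      # numerical stability
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)
    return M, mu, P
```

With hard 0/1 assignments and the P step skipped, the M step above is the regular RCA computation up to scale, which mirrors the reduction stated in Corollary 1.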

18 pRCA Algorithm

19 Steps of Auto Photo Tagging via Search. Query image, then Distance/Similarity Measure: to retrieve a set of visually similar social photos, either the set of k-nearest-neighbor images or the set of images with distance less than some threshold.

20 Annotating the query photo with the relevant tags associated with the set of similar images. A tag is preferred if it has a higher frequency among the set of similar social images. A tag is preferred if its associated social images are visually more similar to the query photo. Our tagging approach combines: the frequency of tag w among the retrieved social images; the average distance from the query photo to the tag's associated social images.
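
The retrieval and tagging steps can be sketched as follows. The slide names the two ingredients of the tag score (frequency and average distance) but the transcript does not give the formula that combines them, so the ratio `frequency / (1 + average distance)` used here is a hypothetical choice, as are the function name `recommend_tags` and the toy database.

```python
import numpy as np
from collections import defaultdict

def recommend_tags(query, db_features, db_tags, M, k=30, t=10):
    """Rank tags of the k nearest database images under the metric M."""
    diff = db_features - query
    # Mahalanobis distance from the query to every database image.
    d2 = np.einsum('nd,de,ne->n', diff, M, diff)
    dist = np.sqrt(np.maximum(d2, 0.0))
    nearest = np.argsort(dist)[:k]

    freq = defaultdict(int)        # how many neighbors carry each tag
    dist_sum = defaultdict(float)  # summed distance of those neighbors
    for i in nearest:
        for tag in set(db_tags[i]):
            freq[tag] += 1
            dist_sum[tag] += dist[i]

    # ASSUMED scoring rule: frequent tags on close images score highest.
    scores = {tag: freq[tag] / (1.0 + dist_sum[tag] / freq[tag]) for tag in freq}
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda tag: (-scores[tag], tag))[:t]

# Toy database (hypothetical features and tags).
db_features = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [0.0, 0.2]])
db_tags = [["sky", "bird"], ["sky"], ["car"], ["bird", "sky"]]
top_tags = recommend_tags(np.array([0.0, 0.0]), db_features, db_tags,
                          M=np.eye(2), k=3, t=2)
```

Passing a learned metric as `M` instead of the identity is the only change needed to compare metrics under the same tagging procedure.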

21 Experimental Testbed: 205,442 photos from Flickr in total. Distance Metric Learning: 16,588 photos + tags. Knowledge Database: 186,854 photos + tags. Query Images: 2,000 random photos. Compared Schemes: Relevance Component Analysis (RCA); Discriminative Component Analysis (DCA); Information-Theoretic Metric Learning (ITML); Large Margin Nearest Neighbor (LMNN); Neighbourhood Components Analysis (NCA); Regularized Distance Metric Learning (RDML).

22 Settings: 500 latent chunklets; 1,000 visual words; 10,000 tags; learning rate γ=0.5; top k nearest photos, k=30; top t relevant tags for annotation, t=1,…,10.

23 Fixed the number of nearest neighbors k to 30 for all compared methods



26 DML techniques are beneficial and critical to retrieval-based photo tagging tasks. In general, the pRCA algorithm considerably outperformed the other approaches in most cases. In some cases, some DML methods did not perform well and could be even worse than the Euclidean method. This points to the noisy (uncertain) side information issue: robustness is important to DML.

27 Examine the annotation performance of pRCA by varying the value of k from 10 to 50

28 The number of nearest neighbors k can influence the annotation performance. In our case, when k equals 30, the resulting performance is generally better than with other values. If k is too large, many noisy tags may be included, as there may not be many relevant images in the database. If k is too small, some relevant tags may not appear, which again may degrade the performance.

29 To evaluate the time efficiency of the proposed DML algorithm on the same dataset. Findings: the most efficient method is the regular RCA approach; the most time-consuming one is NCA; pRCA is quite competitive: it is slower than RCA, DCA, and RDML, but considerably faster than ITML, LMNN, and NCA.

30 Query Photo / Top Recommended Tags. Photo 1: autumn, fall, forest, trees, nature, tree, wood, germany, path, creative. Photo 2: sunset, clouds, sky, sea, beach, abigfave, sun, water, landscape, ocean. Photo 3: tiger, zoo, specanimal, impressedbeauty, abigfave, nature, animal, cat, animals, aplusphoto. Photo 4: garden, flowers, yellow, nature, hdr, nikon, spring, festival, impressed beauty.

31 Query Photo / Top Recommended Tags. Photo 1: macro, nikon, bokeh, nature, flower, canon, storm, eos, plane, flickrsbest. Photo 2: nikon, street, water, sport, blue, bike, lebanon, kids, eric mckenna, krissy mckenna. Photo 3: winter, photography, art, beach, usa, fashion, portrait, travel, party, snow. Photo 4: park, river, travel, trees, lake, hiking, winter, green, vacation, water.

32 Contributions: study DML from uncertain side information that exploits probabilistic side information; propose a two-step probabilistic distance metric learning (PDML) framework; present an effective probabilistic RCA (pRCA) algorithm; apply the algorithm to the auto photo annotation by search task. Encouraging results showed that our technique is effective and promising.

33 To improve visual feature representation, especially for annotating objects; to expand the scale of the database; to improve large-scale search & indexing; to filter spam and irrelevant tags; to adopt user feedback to improve automated tagging performance on APT.

34 More information is available: Http:// Online demo of Auto Photo Tagging (APT) is available: Http:// Contact: WU Lei Steven CH Hoi School of Computer Engineering Nanyang Technological University Singapore 639798 Email: Tel: (+65) 6513-8040 Fax: (+65) 6792-6559 Http://


36 Inference Joint probability on documents and topics Conditional probability on tags, visual words and topics Gibbs sampling estimation

