Collections for Automatic Image Annotation and Photo Tag Recommendation Philip McParlane, Yashar Moshfeghi and Joemon M. Jose University of Glasgow, UK.

Presentation transcript:

Outline: Motivation for annotating images; Problems with existing automatic image annotation collections; Problems with existing photo tag recommendation collections; Flickr-AIA (introduced in this work); Flickr-PTR (introduced in this work); Conclusions.

With the amount of multimedia data rapidly increasing, it becomes important to organize this content effectively. Social image sharing websites depend on manual annotation of their images; this, however, carries a large human cost. Moreover, humans often tag with irrelevant tags (e.g. girl) or opinionated tags (e.g. cool).

Therefore, research has focused on the automatic annotation of images. Automatic image annotation (AIA) considers the pixels; photo tag recommendation (PTR) considers those tags already added. AIA collections: many public collections are used. The 20 most cited AIA papers on CiteSeerX revealed that at least 15 collections had been used, evaluated on Corel5k, Corel30k, ESP Game, IAPR, Google Images, LabelMe, Washington Collection, Caltech, TrecVid 2007, Pascal 2007, MiAlbum and 4 other small collections. PTR collections: mostly non-public; the most popular photo tag recommendation works use their own collections.

Automatic Image Annotation: In this work we consider 3 popular AIA evaluation collections used by recent work [4]: Corel5k [1], ESP Game [2] and IAPR TC-12 [3]. [1] Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary. P. Duygulu et al. ECCV '02. [2] Labeling images with a computer game. L. von Ahn and L. Dabbish. CHI '04. [3] The IAPR TC-12 Benchmark: A New Evaluation Resource. M. Grubinger et al. Visual Information Systems. [4] Baselines for Image Annotation. A. Makadia, V. Pavlovic and S. Kumar. IJCV 2010.

What are the problems with previous automatic image annotation collections? Too many collections: there needs to be a single, openly available collection to reproduce experiments. Beyond that:
1. Annotation ambiguity: collections use many synonyms in the annotation of images, e.g. usa/america.
2. Unnormalized: models are able to exploit popular tags by promoting them, inflating performance measures.
3. Low image quality: models are often tested on small, low-quality image collections.
4. Lack of meta-data: despite the increase in research considering time, location etc., these collections don't include such meta-data.
5. Lack of diversity: collections often contain images taken with the same camera, at the same place, by the same user.
6. Location tags: locations, such as usa, are impossible to identify from pixels alone; nevertheless, these tags are often included in ground truths.
7. Copyright: Corel is bound by copyright, making distribution difficult.

1. Problems: Annotation Ambiguity. All three collection ground truths contain synonyms (e.g. america/usa) and visually identical classes (e.g. sea/ocean). To demonstrate this, we cluster tags which share a common WordNet synonym set (removing irrelevant matches manually).

Ambiguous tag clusters found: Corel: polar/arctic, ocean/sea, ice/frost (36 of 374 tags). ESP: baby/child, child/kid, home/house (37 of 291 tags). IAPR: woman/adult, bush/shrub, rock/stone (26 of 269 tags).
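The clustering step above can be sketched as a simple union-find over tag pairs that share a synonym set. A small hand-made pair list stands in here for the WordNet synonym sets used in the paper (in practice these would come from a WordNet API such as NLTK's); the vocabulary and pairs are illustrative, echoing the clusters on this slide.

```python
from collections import defaultdict

def cluster_tags(tags, synonym_pairs):
    """Group tags into clusters; each pair says two tags share a synset."""
    parent = {t: t for t in tags}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path compression
            t = parent[t]
        return t

    for a, b in synonym_pairs:
        if a in parent and b in parent:
            parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for t in tags:
        clusters[find(t)].add(t)
    return sorted(map(sorted, clusters.values()))

vocab = ["ocean", "sea", "polar", "arctic", "sky", "chair"]
pairs = [("ocean", "sea"), ("polar", "arctic")]  # stand-in for WordNet synsets
print(cluster_tags(vocab, pairs))
# [['arctic', 'polar'], ['chair'], ['ocean', 'sea'], ['sky']]
```

The manual filtering mentioned on the slide would correspond to pruning spurious pairs from `synonym_pairs` before merging.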

31% of photos in the Corel collection contain at least 1 ambiguous tag. 25% of photos in the ESP collection contain at least 1 ambiguous tag. 63% of photos in the IAPR collection contain at least 1 ambiguous tag.

Test image #1 annotations: sea, usa, sky, chair. Annotation Model #1 suggests: sea, usa, blue, water, red (precision 0.4). Annotation Model #2 suggests: ocean, america, blue, water, red (precision 0.0). It is impossible to tell from an image's pixels whether it is of the sea or of the ocean, so why do we penalize a system which treats these concepts differently?
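The penalty can be made concrete with a toy precision@5 computation on the slide's example, scored on raw tags versus after mapping synonyms to one canonical form. The two-entry synonym map is just the slide's example, not a full resource.

```python
SYNONYM_CANON = {"ocean": "sea", "america": "usa"}  # illustrative synonym map

def canon(tag):
    return SYNONYM_CANON.get(tag, tag)

def precision_at_k(suggested, ground_truth, merge_synonyms=False):
    """Fraction of suggested tags present in the ground truth."""
    if merge_synonyms:
        suggested = [canon(t) for t in suggested]
        ground_truth = {canon(t) for t in ground_truth}
    truth = set(ground_truth)
    return sum(1 for t in suggested if t in truth) / len(suggested)

truth = {"sea", "usa", "sky", "chair"}
model2 = ["ocean", "america", "blue", "water", "red"]

print(precision_at_k(model2, truth))                       # 0.0 on raw tags
print(precision_at_k(model2, truth, merge_synonyms=True))  # 0.4 after merging
```

With synonym merging, Model #2 scores the same 0.4 as Model #1, which is the behaviour the slide argues for.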

2. Problems: Unnormalised Collections. By nature, the classes used in image collections follow a long-tail distribution, i.e. there exist a few popular tags and many unpopular tags. This causes problems: 1. Selection bias: popular tags exist in more training and test images, therefore annotation models are more likely to be tested on popular classes. 2. Prediction bias: popular tags occur in more test images, therefore annotation models can potentially cheat by promoting only popular tags, instead of making predictions based purely on the pixels.

To demonstrate this prediction bias (i.e. where annotation models can cheat by promoting popular tags), we annotate each collection using the annotation model described in [6]. We split the vocabulary into 3 subsets of popular, medium-frequency and unpopular tags; for each subset, we suggest only the tags it contains. [6] Baselines for Image Annotation. A. Makadia, V. Pavlovic and S. Kumar. IJCV 2010.
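The three-way vocabulary split can be sketched as a frequency ranking cut into equal bands. The tag counts below are invented for illustration; the paper's actual cut points are not specified here.

```python
from collections import Counter

def split_vocabulary(tag_counts, bands=3):
    """Return `bands` lists of tags, most popular band first."""
    ranked = [t for t, _ in tag_counts.most_common()]
    size = -(-len(ranked) // bands)  # ceiling division
    return [ranked[i:i + size] for i in range(0, len(ranked), size)]

# Hypothetical image counts per tag, following a long-tail shape.
counts = Counter({"sky": 900, "water": 700, "tree": 400,
                  "chair": 120, "frost": 30, "shrub": 5})
popular, medium, unpopular = split_vocabulary(counts)
print(popular, medium, unpopular)
# ['sky', 'water'] ['tree', 'chair'] ['frost', 'shrub']
```

A biased model restricted to the `popular` band would then be compared against one restricted to `unpopular`, exposing how much of its score comes from tag frequency alone.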

Ultimately, higher annotation accuracy can be achieved by suggesting only popular tags.

3. Problems: Image Quality/Size. Despite the increase in Hadoop clusters and computational power, many works still test on small collections of low-quality images. Collection sizes (average dimension, number of images): Corel: 160px, 5,000 images. ESP: 156px, 22,000 images. IAPR: 417px, 20,000 images.

4. Problems: Lack of Meta-data. Many recent works have focused on the exploitation of various meta-data [7,8], e.g. time, location, camera, user; however, the Corel and ESP collections include neither time nor location meta-data. [7] On contextual photo tag recommendation. P. McParlane, Y. Moshfeghi, J. Jose. SIGIR 2013. [8] Beyond co-occurrence: discovering and visualizing tag relationships from geo-spatial and temporal similarities. H. Zhang et al. WSDM 2012.

5. Problems: Lack of Diversity. Images in each collection are often taken by the same user, in the same place, of the same scene/object, using the same camera. This leads to natural clustering in image collections, making annotation easier due to high intra-cluster visual similarity. Further, there are duplicate images across the test and train sets, also making annotation easier.
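The train/test duplicate problem can at least be checked with an exact-duplicate scan: hash each image's bytes and flag any hash that appears in both splits. This is only a sketch with made-up in-memory "images" (real near-duplicates would need perceptual hashing, not covered here).

```python
import hashlib

def content_hashes(images):
    """images: dict of name -> raw bytes; returns hash -> name."""
    return {hashlib.sha256(data).hexdigest(): name
            for name, data in images.items()}

def overlap(train, test):
    """Pairs of (train image, test image) with byte-identical content."""
    th, eh = content_hashes(train), content_hashes(test)
    return sorted((th[h], eh[h]) for h in th.keys() & eh.keys())

# Hypothetical byte payloads standing in for JPEG files.
train = {"tr_001.jpg": b"\xff\xd8beach...", "tr_002.jpg": b"\xff\xd8forest..."}
test  = {"te_104.jpg": b"\xff\xd8beach...", "te_105.jpg": b"\xff\xd8city..."}
print(overlap(train, test))  # [('tr_001.jpg', 'te_104.jpg')]
```

Any non-empty result means the test split leaks training images, which inflates annotation scores exactly as the slide describes.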

6. Problems: Identifying Location. Identifying location (even at a high level) within an image is often difficult, or sometimes impossible. Despite this, two of the three collections contain images annotated with locations (e.g. usa). Given this image, would you know where it was taken? Annotations: sea, usa, sky, chair. If not, how can we expect an annotation model to predict the annotation usa?

7. Problems: Copyright. An evaluation collection should at the very least be free and distributable. Unfortunately, the Corel collection is commercial and bound by copyright.

Flickr-AIA contains 312,000 images from Flickr, built with AIA evaluation in mind. Openly available: uses Flickr images under the Creative Commons license. Meta-data: includes extensive location, user and time meta-data. Diverse image set: we search for images from 2,000 WordNet categories and limit the number of images per user. High quality: the dimension of each image is 719px on average. No location tags: we use WordNet to remove location tags (e.g. scotland) from image ground truths. Resolved ambiguity: tags which are synonyms (e.g. usa/america) are merged based on WordNet synonym sets. Normalized: alongside the normal ground truth, we include a normalised ground truth containing only medium-frequency tags.
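The location-tag removal step can be sketched as a walk up a hypernym chain: drop any tag whose chain reaches a geographic concept. A tiny hand-made hypernym map stands in for WordNet here; all entries (and the `location` sentinel) are illustrative assumptions, not the paper's actual lexicon.

```python
HYPERNYM = {  # child -> parent; toy stand-in for WordNet hypernym links
    "scotland": "country", "usa": "country", "country": "location",
    "glasgow": "city", "city": "location",
    "chair": "furniture", "sea": "body_of_water",
}

def is_location(tag, max_depth=10):
    """Walk the hypernym chain; True if it reaches 'location'."""
    for _ in range(max_depth):
        if tag == "location":
            return True
        tag = HYPERNYM.get(tag)
        if tag is None:
            return False
    return False

def strip_location_tags(tags):
    return [t for t in tags if not is_location(t)]

print(strip_location_tags(["sea", "usa", "sky", "chair", "scotland"]))
# ['sea', 'sky', 'chair']
```

The same chain walk, pointed at different root concepts, could implement the synonym merging and category filtering the slide lists.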

Photo Tag Recommendation: In this work we consider the collections used in 2 popular photo tag recommendation works: Sigurbjornsson [5] and Garg [6]. [5] Flickr Tag Recommendation based on Collective Knowledge. B. Sigurbjornsson and R. van Zwol. WWW '08. [6] Personalized, Interactive Tag Recommendation for Flickr. N. Garg and I. Weber. ACM RecSys '08.

1. Problems: Ground Truth. Sigurbjornsson [5] uses a small collection of images whose ground truths are crowdsourced. For photo tag recommendation, however, many aspects that users would tag are often not visually explicit (e.g. locations, dates), so these annotations are missed when crowdsourcing.

Comparing annotations: crowdsourced vs. tags added by the user.
Crowdsourced worker: football, red, team, blue, england, grass, saturday.
User tags: football, red, team, blue, scotland, hamilton accies, artificial grass, dunfermline, sunday, new douglas park.

2. Problems: Synonymous Tags. One problem with using user tags, however, is that users often employ many synonyms when annotating images.

Test image #1 annotations: newyork, ny, nyc, newyorkcity, york, timessquare. Annotation Model #1 suggests: ny, nyc, newyork, york, city (precision 0.8). Annotation Model #2 suggests: ny, timessquare, people, cab, empire (precision 0.4). Model #1 achieves the higher precision simply by suggesting several names for the same concept.

3. Problems: Free Distribution. Existing collections [5,6] for photo tag recommendation were never released, making comparable experiments difficult. [5] Flickr Tag Recommendation based on Collective Knowledge. B. Sigurbjornsson and R. van Zwol. WWW '08. [6] Personalized, Interactive Tag Recommendation for Flickr. N. Garg and I. Weber. ACM RecSys '08.

Flickr-PTR contains details of 2,000,000 images from Flickr, built with PTR evaluation in mind. Openly available: uses Flickr images under the Creative Commons license. Clustered user tags: a crowdsourced experiment asked users to group related tags, overcoming the problem of synonyms.

To overcome synonyms in image annotations, we carried out a crowdsourced experiment in which workers grouped synonyms, i.e. tags which refer to the same aspect of an image.
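One way such tag clusters can be used at evaluation time is to count hits per cluster rather than per tag, so several synonyms of one concept yield at most one hit. This is a sketch under that assumption, reusing the New York example from earlier; the clusters and model suggestions are illustrative, not taken from the collection.

```python
CLUSTERS = [  # hypothetical crowdsourced groups of tags for one test image
    {"newyork", "ny", "nyc", "newyorkcity", "york"},
    {"timessquare"},
]

def cluster_of(tag):
    """Index of the ground-truth cluster containing `tag`, else None."""
    for i, cluster in enumerate(CLUSTERS):
        if tag in cluster:
            return i
    return None

def clustered_hits(suggested):
    """Number of distinct ground-truth clusters covered by the suggestions."""
    return len({cluster_of(t) for t in suggested} - {None})

model1 = ["ny", "nyc", "newyork", "york", "city"]
model2 = ["ny", "timessquare", "people", "cab", "empire"]
print(clustered_hits(model1), clustered_hits(model2))  # 1 2
```

Under this clustered scoring, Model #2 covers more distinct aspects of the image than Model #1, reversing the raw-tag precision ranking shown earlier.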

Conclusions. This work highlighted: 7 problems with existing AIA collections (Corel, ESP, IAPR) and 3 problems with existing PTR collections (Sigurbjornsson, Garg). With this in mind, we introduce two new, freely available image collections: Flickr-AIA (312,000 Flickr images, for automatic image annotation evaluation) and Flickr-PTR (2,000,000 Flickr images, for photo tag recommendation evaluation).

These collections are available at: Thanks for listening!
[1] Object Recognition as Machine Translation: Learning a Lexicon for a Fixed Image Vocabulary. P. Duygulu et al. ECCV '02.
[2] Labeling images with a computer game. L. von Ahn and L. Dabbish. CHI '04.
[3] The IAPR TC-12 Benchmark: A New Evaluation Resource. M. Grubinger et al. Visual Information Systems.
[4] Flickr Tag Recommendation based on Collective Knowledge. B. Sigurbjornsson and R. van Zwol. WWW '08.
[5] Personalized, Interactive Tag Recommendation for Flickr. N. Garg and I. Weber. ACM RecSys '08.
[6] Baselines for Image Annotation. A. Makadia, V. Pavlovic and S. Kumar. IJCV 2010.
[7] On contextual photo tag recommendation. P. McParlane, Y. Moshfeghi, J. Jose. SIGIR 2013.
[8] Beyond co-occurrence: discovering and visualizing tag relationships from geo-spatial and temporal similarities. H. Zhang et al. WSDM 2012.