By: Garima Indurkhya, Jay Parikh, Shraddha Herlekar, Vikrant Naik


Paper 1: The Structure of Collaborative Tagging Systems
Authors: Golder, S. and Huberman, B., 2005.

Contents
- What is tagging?
- Tagging & Taxonomy
- Aspects of Classification
- Kinds of Tags
- Case Study: Del.icio.us

What is Tagging?
- Marking content with descriptive terms
- Examples:
  - Catalog indexing by a librarian
  - Keywords to describe a blog entry or a photo on the web
- Collaborative tagging: the practice of allowing anyone to freely attach keywords or tags to content
- Social bookmark managers: Del.icio.us, Flickr, CiteULike, Cloudalicious

Tagging & Taxonomy
Tagging
- Non-hierarchical
- Tags describe the information held within the tagged items
- A tag-based search returns a great variety of things simultaneously
- For example: the tags for an article about cats in Africa could be cats, africa, animals, cheetahs, etc.
Taxonomy
- Hierarchical
- For example: the taxonomy for the article about cats in Africa could be

Aspects of Classification
Problems to be considered while classifying:
- Semantic
  - Polysemy
  - Synonymy
- Cognitive
  - Basic level variation
  - Sense making

Kinds of Tags
Tags perform several kinds of functions for bookmarks:
- Identifying what (or who) it is about
- Identifying what it is
- Identifying who owns it
- Identifying qualities or characteristics
- Self reference
- Task organizing

Case Study: Del.icio.us
Del.icio.us
- Collaborative tagging system for the web
- Social bookmark manager
- Storage of personal bookmarks
- Public nature of bookmarks

Paper 2: On the Selection of Tags for Tag Clouds
Authors: P. Venetis et al., WSDM, 2011.

Contents
- Tag Cloud
- System Model
- Properties of Tag Cloud
- Algorithms to generate Tag Clouds
- User Models for Tag Clouds
- Experimental Evaluation of algorithms
- Evaluation of User Models
- Conclusion

Tag Cloud
Definition: a visual representation of social tags, organized into a paragraph-style layout, usually in alphabetical order, where the relative size and weight of the font for each tag corresponds to the relative frequency of its use.
- Compact
- Three dimensions at a time:
  - alphabetical order
  - size indicating importance
  - the tags themselves
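The definition above can be illustrated with a small sketch. It assumes a linear mapping from tag frequency to font size; the function name `tag_cloud_layout`, the point sizes, and the counts are all made up for illustration and do not come from the paper.

```python
def tag_cloud_layout(tag_counts, min_pt=10, max_pt=32):
    """Return (tag, font size) pairs in alphabetical order.

    Font size grows linearly with the tag's frequency (an assumed mapping).
    """
    lo, hi = min(tag_counts.values()), max(tag_counts.values())
    span = (hi - lo) or 1  # avoid division by zero when all counts are equal
    layout = []
    for tag in sorted(tag_counts):  # alphabetical order
        scale = (tag_counts[tag] - lo) / span
        layout.append((tag, round(min_pt + scale * (max_pt - min_pt))))
    return layout


# Hypothetical frequencies for the "cats in africa" example.
counts = {"cats": 12, "africa": 9, "animals": 5, "cheetahs": 2}
layout = tag_cloud_layout(counts)
for tag, size in layout:
    print(f"{tag}: {size}pt")
```

The most frequent tag gets the largest font and the least frequent the smallest, while the ordering stays alphabetical, so the three dimensions are visible at once.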

Tag Cloud
Tag cloud for our example “cats in africa”

Tag Cloud
Uses of tag clouds:
- Summarizing web search results
- Summarizing results over biomedical databases
- Summarizing results of structured queries

Tag Cloud
Example of a tag cloud for summarizing web search results

System Model
Terminology
- $C$ = set of objects (e.g. web pages, articles)
- $T$ = set of tags
- $C_q$ = set of objects for query $q$
- $|C_q|$ = number of objects in $C_q$
- $T_q$ = set of tags for query $q$
- $A_q(t)$ = association set of tag $t$: for every tag $t \in T_q$, the objects $c \in C_q$ associated with $t$
- $S \subseteq T_q$ = set of tags in the tag cloud
- $|S|$ = number of tags in the tag cloud
- Partial (scoring) function $s(t, c) : T \times C \to [0, 1]$
- Similarity function $\mathrm{sim}(\cdot, \cdot) : C \times C \to [0, 1]$
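As a rough sketch of how this notation could map onto data structures, the snippet below builds the association sets $A_q(t)$ from a toy result set. The object identifiers, the `tags_of` mapping, and the function name are hypothetical and chosen only for illustration.

```python
# Toy data; every name and value here is illustrative.
C_q = {"c1", "c2", "c3", "c4"}   # objects returned for query q
tags_of = {                       # tags attached to each object
    "c1": {"cats", "africa"},
    "c2": {"cats"},
    "c3": {"africa", "animals"},
    "c4": set(),                  # an object without any tag
}


def association_sets(C_q, tags_of):
    """Map each tag t in T_q to A_q(t), the objects in C_q that carry t."""
    A_q = {}
    for c in C_q:
        for t in tags_of.get(c, ()):
            A_q.setdefault(t, set()).add(c)
    return A_q


A_q = association_sets(C_q, tags_of)
T_q = set(A_q)                    # tags that occur on at least one result
print(sorted(T_q))
```

A tag cloud $S$ is then any subset of `T_q`, and the properties of a tag cloud can be computed from `A_q` alone plus the scoring and similarity functions.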

Properties of Tag Cloud
- Extent of $S$: the cardinality of $S$, $\mathrm{ext}(S) = |S|$
- Coverage of $S$: the scored size of the objects associated with $S$, where $|C_q|_{s,q}$ = sum of scores for every $c \in C_q$
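A minimal sketch of these two properties follows. The coverage formula itself is not shown in the transcript, so the code assumes coverage is the scored size of the union of the association sets of $S$ divided by the scored size of $C_q$; the per-object `score` values and all identifiers are hypothetical.

```python
def extent(S):
    """ext(S) = |S|, the number of tags in the cloud."""
    return len(S)


def scored_size(objects, score):
    """Sum of the scores of the given objects."""
    return sum(score[c] for c in objects)


def coverage(S, A_q, C_q, score):
    """Assumed definition: scored size of covered objects over scored size of C_q."""
    covered = set().union(*(A_q[t] for t in S))
    return scored_size(covered, score) / scored_size(C_q, score)


# Toy data, made up for illustration.
C_q = {"c1", "c2", "c3", "c4"}
A_q = {"cats": {"c1", "c2"}, "africa": {"c1", "c3"}, "animals": {"c3"}}
score = {"c1": 1.0, "c2": 0.5, "c3": 0.5, "c4": 1.0}

S = {"cats", "animals"}
ext_S = extent(S)
cov_S = coverage(S, A_q, C_q, score)
print(ext_S, cov_S)
```

Object `c4` carries no tag, so no tag cloud can reach full coverage on this toy data.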

Properties of Tag Cloud
- Overlap of $S$: the extent of redundancy
- Cohesiveness of $S$: how closely related the objects in each association set of $S$ are
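The slide gives no formulas for these two properties, so the sketch below is only one plausible formalization. It assumes overlap is the average, over tag pairs, of the shared objects relative to the smaller association set, and cohesiveness is the average pairwise similarity inside each association set. The Jaccard similarity over word sets and all data are stand-ins invented for illustration.

```python
from itertools import combinations


def overlap(S, A_q):
    """Assumed: mean over tag pairs of |A(t1) & A(t2)| / min(|A(t1)|, |A(t2)|)."""
    pairs = list(combinations(sorted(S), 2))
    if not pairs:
        return 0.0
    total = 0.0
    for t1, t2 in pairs:
        a, b = A_q[t1], A_q[t2]
        total += len(a & b) / min(len(a), len(b))
    return total / len(pairs)


def cohesiveness(S, A_q, sim):
    """Assumed: mean over tags of the average pairwise similarity in A_q(t)."""
    per_tag = []
    for t in S:
        pairs = list(combinations(sorted(A_q[t]), 2))
        if pairs:  # single-object association sets are skipped
            per_tag.append(sum(sim(c1, c2) for c1, c2 in pairs) / len(pairs))
    return sum(per_tag) / len(per_tag) if per_tag else 0.0


# Toy data: each object is described by a small, made-up set of words.
words = {
    "c1": {"lion", "savanna"},
    "c2": {"lion", "pet"},
    "c3": {"savanna", "safari"},
}


def jaccard(c1, c2):
    """Hypothetical stand-in for the similarity function sim(., .)."""
    return len(words[c1] & words[c2]) / len(words[c1] | words[c2])


A_q = {"cats": {"c1", "c2"}, "africa": {"c1", "c3"}, "animals": {"c3"}}
S = {"cats", "africa"}
ovl_S = overlap(S, A_q)
coh_S = cohesiveness(S, A_q, jaccard)
print(ovl_S, coh_S)
```

Here the two tags share one of their two objects, so the assumed overlap is 0.5; a cloud with lower overlap wastes less space on redundant tags.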

Properties of Tag Cloud
- Relevance of $S$: the relevance between the tags in $S$ and the original query $q$
- Popularity of $S$: a tag is more popular if it is associated with many objects in $C_q$

Properties of Tag Cloud
- Independence of $S$: tags are independent if they refer to dissimilar objects
- Balance of $S$: the ratio of the smallest association-set size to the largest association-set size among the tags in the tag cloud $S$
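Balance is simple enough to sketch directly from its description as a min/max ratio of association-set sizes. The function name and the toy association sets are illustrative assumptions.

```python
def balance(S, A_q):
    """Ratio of the smallest to the largest association-set size among tags in S."""
    sizes = [len(A_q[t]) for t in S]
    return min(sizes) / max(sizes)


# Toy association sets, made up for illustration.
A_q = {"cats": {"c1", "c2"}, "africa": {"c1", "c3"}, "animals": {"c3"}}

bal_even = balance({"cats", "africa"}, A_q)     # both tags cover 2 objects
bal_skewed = balance({"cats", "animals"}, A_q)  # 1 object versus 2 objects
print(bal_even, bal_skewed)
```

A value of 1 means every tag in the cloud leads to the same number of objects; values near 0 mean a few tags dominate.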

Algorithms to generate Tag Clouds
- Single- vs multi-objective tag selection, e.g. achieving high popularity, getting more coverage, being more cohesive
- Incorporating relevance
- Input to algorithms: $C_q$, $T_q$ and $S \subseteq T_q$

Algorithms to generate Tag Clouds
Popularity algorithm (POP)
- The most common algorithm in social information sharing
- A tag is more popular if it is associated with many objects in $C_q$
- It allows the user to see what other people are mostly interested in sharing
- For query $q$ and parameter $k$, the algorithm returns the top $k$ tags in $T_q$ according to their $|A_q(t)|$
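The POP rule described above amounts to a top-k selection by association-set size. The sketch below assumes ties are broken alphabetically, which the slides do not specify; the data is invented for illustration.

```python
def pop(A_q, k):
    """Return the k tags with the largest association sets.

    Ties are broken alphabetically (an assumption, not stated in the slides).
    """
    ranked = sorted(A_q, key=lambda t: (-len(A_q[t]), t))
    return ranked[:k]


# Toy association sets, made up for illustration.
A_q = {
    "cats": {"c1", "c2", "c3"},
    "africa": {"c1", "c3"},
    "animals": {"c3"},
    "cheetahs": {"c2", "c4"},
}

top2 = pop(A_q, 2)
print(top2)
```

Note that POP looks at each tag in isolation: it can pick two tags that point at the same objects, which is why it does not necessarily maximize coverage.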

Tf-idf based algorithms (TF, WTF)
- $f(q, t, c) = s(t, c)$ (tf-idf method, TF)
- $f(q, t, c) = s(t, c) \cdot s(q, c)$ (weighted tf-idf method, WTF)
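The two scoring functions can be compared with a short sketch. It assumes each tag's score is the sum of $f(q, t, c)$ over the objects in its association set and that the top $k$ tags are returned; that aggregation step, the score tables `s_tc` and `s_qc`, and the data are assumptions made for illustration.

```python
def top_k_by_score(A_q, k, f):
    """Rank tags by the sum of f(t, c) over c in A_q(t); ties broken alphabetically."""
    totals = {t: sum(f(t, c) for c in objs) for t, objs in A_q.items()}
    return sorted(totals, key=lambda t: (-totals[t], t))[:k]


# Toy data, made up for illustration.
A_q = {"cats": {"c1", "c2"}, "africa": {"c1", "c3"}}
s_tc = {                       # s(t, c): how well tag t describes object c
    ("cats", "c1"): 0.5, ("cats", "c2"): 0.5,
    ("africa", "c1"): 0.75, ("africa", "c3"): 0.5,
}
s_qc = {"c1": 0.5, "c2": 1.0, "c3": 0.25}  # s(q, c): relevance of c to query q


def tf(t, c):
    """TF: f(q, t, c) = s(t, c)."""
    return s_tc[(t, c)]


def wtf(t, c):
    """WTF: f(q, t, c) = s(t, c) * s(q, c)."""
    return s_tc[(t, c)] * s_qc[c]


tf_top = top_k_by_score(A_q, 1, tf)
wtf_top = top_k_by_score(A_q, 1, wtf)
print(tf_top, wtf_top)
```

On this toy data TF prefers `africa`, while WTF prefers `cats` because the objects tagged `cats` are more relevant to the query.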

Maximum Coverage Algorithm (COV)
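The slide gives only the name of this algorithm. A common way to approach maximum coverage is the standard greedy heuristic, sketched below under that assumption: at each step, pick the tag that adds the most not-yet-covered scored objects. Whether this matches the paper's COV in every detail is not confirmed by the slides, and the function name, scores, and data are hypothetical.

```python
def greedy_cov(A_q, k, score):
    """Greedy maximum-coverage heuristic (assumed stand-in for COV).

    Repeatedly picks the tag whose association set adds the largest
    scored size of objects not covered so far.
    """
    chosen, covered = [], set()
    candidates = set(A_q)
    while candidates and len(chosen) < k:
        # sorted() makes tie-breaking deterministic (alphabetical).
        best = max(
            sorted(candidates),
            key=lambda t: sum(score[c] for c in A_q[t] - covered),
        )
        chosen.append(best)
        covered |= A_q[best]
        candidates.remove(best)
    return chosen


# Toy data, made up for illustration.
A_q = {
    "cats": {"c1", "c2", "c3"},
    "africa": {"c1", "c3"},
    "animals": {"c3"},
    "cheetahs": {"c2", "c4"},
}
score = {"c1": 1.0, "c2": 1.0, "c3": 1.0, "c4": 1.0}

cov_tags = greedy_cov(A_q, 2, score)
print(cov_tags)
```

After `cats` is chosen, `africa` adds nothing new, so the heuristic picks `cheetahs` to reach `c4`; a purely popularity-based ranking on the same data would prefer `africa`.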

User Models for Tag Clouds
- Build an ideal user satisfaction model
- Use this model to compare the tag clouds
Base model: Coverage
- The probability that an object is of the user’s interest is r·p, while the probability that an object is of the user’s interest is p.

User Models for Tag Clouds
Incorporating Relevance
- For an object the probability that it is of the user’s interest is and for every object the probability that it is of the user’s interest is p.
Incorporating Cohesiveness
- For an object the probability that it is of the user’s interest is and for every object the probability that it is of the user’s interest is p.

User Models for Tag Clouds
Incorporating Overlap
- For an object c that is contained by and no other association sets the probability that it is of the user’s interest is the one that can be seen in and for every object the probability that it is of the user’s interest is p.
Taking into account Scores
Closing Comment

Experimental Evaluation
Datasets:
- CourseRank
- Del.icio.us

Experimental Evaluation of algorithms: CourseRank
- Most metrics are not correlated
- Only coverage and popularity are correlated
- High coverage might not be highly relevant
- Algorithms impact metrics differently

Experimental Evaluation of algorithms: Del.icio.us
- Similar, but the overall range of values for the coverage metric is around , much lower than for the CourseRank dataset

Impact on failure probability
- Algorithms impact failure probability differently

Evaluation of User Models
- 80% predicted correctly, even when the failure probability is small
- 100% for difference, so if there is agreement, we get the best tag cloud!

Conclusion
- Metrics are generally not correlated, so they cover different important aspects of a tag cloud
- COV is the best algorithm for finding a tag cloud, followed by POP
- POP works well with relevance and cohesiveness
- The user model is a useful tool for identifying the tag clouds preferred by users

Future Work
- Extend the model to capture the balance metric
- Construct an algorithm to minimize failure probability for a dataset and a given extent
- Take into account items with unassigned and spam tags

Thank you!