Center for E-Business Technology Seoul National University Seoul, Korea Social Ranking: Uncovering Relevant Content Using Tag-based Recommender Systems.

Presentation transcript:

Center for E-Business Technology, Seoul National University, Seoul, Korea
Social Ranking: Uncovering Relevant Content Using Tag-based Recommender Systems
Valentina Zanardi, Licia Capra
Dept. of Computer Science, University College London
2nd ACM International Conference on Recommender Systems, October 23-25, 2008, Lausanne, Switzerland
Summarized & presented by Babar Tareen, IDS Lab., Seoul National University

Copyright © 2008 by CEBT

Introduction
• Taxonomies
  – Hierarchical classification
  – Standardized
  – Expert opinions
• Social (or folksonomic) tagging enhances content by enabling users to
  – Describe
  – Categorize
  – Search
  – Discover
  – Navigate (tag clouds)

Introduction (2)
• At times, the use of tagging may lower search efficiency
• Downsides of social tagging
  – Informally defined
  – Dynamically changing
  – Ungoverned
  – Heterogeneity of users
  – Heterogeneity of context
• Language-related problems
  – Synonyms: words with similar meanings, e.g. Book (Schedule, Reserve, Record)
  – Homonyms: words with the same pronunciation but different meanings, e.g. Berry (fruit), Bury (put under ground)
  – Polysemy: a single word having several meanings, e.g. Foot (length, body part), Left (direction, act of leaving a place)

Social Ranking (In a Nutshell)
• Aims to efficiently find content that is relevant to a user's query
  – Assumptions
    – Typical Web 2.0 content
    – Content is arbitrarily tagged by users
• Answers queries by exploiting recommender system techniques
  – User similarity is based on past tag activity
  – Tag relationships are based on association to content
  – Ranking by
    – Inferred distance of the query to the tags associated with such content
    – Weighted by the similarity of the querying user to the users who created those tags

Dataset Analysis
• CiteULike dataset (social bookmarking site for researchers)
  – Article, User, Tag
  – 820,000 articles (papers)
  – 28,000 users
  – 240,000 tags
• Pre-processing
  – Removed bookmarks and tags used by only one user
  – 100,000 articles (papers)
  – 28,000 users
  – 55,000 tags
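The pre-processing step above can be sketched in a few lines of Python. This is only an illustration: the slide does not say how the filter was implemented, so the `(user, article, tag)` triple layout, the function name `drop_single_user_items`, and the reading that an item is kept when at least two distinct users used it are all assumptions. The filter is applied in a single pass, not repeated until nothing changes.

```python
from collections import defaultdict

def drop_single_user_items(bookmarks):
    """Keep only triples whose article AND tag were each used by 2+ distinct users.

    `bookmarks` is an iterable of (user, article, tag) triples (assumed layout).
    """
    bookmarks = list(bookmarks)
    users_per_article = defaultdict(set)
    users_per_tag = defaultdict(set)
    for user, article, tag in bookmarks:
        users_per_article[article].add(user)
        users_per_tag[tag].add(user)
    return [
        (user, article, tag)
        for user, article, tag in bookmarks
        if len(users_per_article[article]) > 1 and len(users_per_tag[tag]) > 1
    ]
```

Calling `drop_single_user_items(triples)` on the raw triples returns the reduced list; counts of distinct articles, users, and tags can then be read off with `set` comprehensions.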

Long Tails
• Long tail of tags
  – 70% of the tags used by 20 users
  – On average 5 tags per paper (max. 10)
  – This suggests that standard keyword search will likely fail
• Long tail of papers
  – 85% of the papers tagged by fewer than 5 users
  – This suggests that standard recommender system techniques would likely perform poorly in terms of accuracy and coverage
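A long-tail statistic of this kind can be computed directly from the bookmark triples. The sketch below is an assumption-laden illustration: it reads the figure as "share of tags used by at most N distinct users", which the slide does not state explicitly, and the function name and `(user, paper, tag)` layout are hypothetical.

```python
from collections import defaultdict

def share_of_tags_with_at_most(bookmarks, max_users):
    """Fraction of distinct tags that were used by at most `max_users` distinct users."""
    users_per_tag = defaultdict(set)
    for user, _paper, tag in bookmarks:
        users_per_tag[tag].add(user)
    if not users_per_tag:
        return 0.0
    small = sum(1 for users in users_per_tag.values() if len(users) <= max_users)
    return small / len(users_per_tag)
```

The same pattern, keyed on papers instead of tags, would give the share of papers tagged by fewer than a given number of users.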

Ranking (Basic Model)
• The more query tags are associated with a resource, the higher its ranking (accuracy)
• The more users u_i tagged the resource using (some of) the query tags, the higher its ranking
• Works fine for popular content
• Fails to address queries that look for the long tail of medium-to-low popularity content (accuracy problem)
• If the user running the query also uses tags that belong to the long tail of tags, chances are that relevant content is not found (coverage problem)
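One way to read the basic model is as a count of (user, query tag) pairs attached to each resource, which grows both with the number of matching tags and with the number of users who applied them. The following is a minimal sketch under that assumption; the slide gives no formula, and `basic_rank` and the triple layout are hypothetical names chosen for illustration.

```python
from collections import Counter

def basic_rank(bookmarks, query_tags):
    """Rank papers by how many distinct (user, query tag) pairs are attached to them."""
    query = set(query_tags)
    scores = Counter()
    # set() removes duplicate triples so each (user, paper, tag) counts once
    for _user, paper, tag in set(bookmarks):
        if tag in query:
            scores[paper] += 1
    # Highest score first; ties broken by paper id for a stable order
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

Papers that carry none of the query tags never appear in the result, which is the coverage problem named on the slide: a query phrased with rare tags returns little or nothing.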

Social Ranking
• Based on the following observations
  – Clustering of users for improved accuracy
    – Most active users bookmark a tiny portion of the whole paper set
    – Users have clearly defined interests
    – Each user masters a small subset of the whole folksonomy
    – Users sharing parts of the folksonomy form fairly small clusters
  – Clustering of tags for improved coverage
    – Each paper was described by just a handful of tags
    – This suggests that there is a core of shared knowledge about tags within communities

Social Ranking (2)
• Identify the users with interests similar to the querying user
  – Based on users' tag activity
• Identify tags similar to the query tags
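Both similarity steps can be illustrated with cosine similarity over sparse count vectors. This is a sketch, not the presented method: the slide names no similarity measure, so cosine similarity is an assumption here, as are the helper names `user_profiles`, `tag_profiles`, and `cosine`. Users are described by how often they used each tag, and tags by the papers they were attached to.

```python
from collections import Counter, defaultdict
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b[key] for key, weight in a.items() if key in b)
    norm = sqrt(sum(w * w for w in a.values())) * sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0

def user_profiles(bookmarks):
    """Map each user to a Counter of the tags they used (tag activity)."""
    profiles = defaultdict(Counter)
    for user, _paper, tag in bookmarks:
        profiles[user][tag] += 1
    return profiles

def tag_profiles(bookmarks):
    """Map each tag to a Counter of the papers it was attached to."""
    profiles = defaultdict(Counter)
    for _user, paper, tag in bookmarks:
        profiles[tag][paper] += 1
    return profiles
```

For example, `cosine(user_profiles(data)["u1"], user_profiles(data)["u2"])` compares two users by tag activity, and the same call on `tag_profiles(data)` compares two tags by the content they share.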

Two Step Query Model
• Query
• Query expansion
• Ranking
  – Papers with tags from the original query should rank higher than papers matching only extra tags from the expanded query
  – Papers shared by similar users should be ranked higher
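The two steps can be sketched as follows, taking precomputed similarities as input. The weighting scheme is an assumption made for illustration: original query tags get weight 1.0, added tags get their similarity (expected to be below 1), and each matching bookmark is scaled by `1 + similarity` of its author to the querying user. The names `expand_query`, `social_rank`, `tag_sim`, and `user_sim` are hypothetical.

```python
def expand_query(query_tags, tag_sim, threshold=0.0):
    """Step 1: return {tag: weight} for the original tags plus related tags.

    `tag_sim` maps a tag to {related_tag: similarity} (assumed precomputed).
    """
    original = set(query_tags)
    weights = {tag: 1.0 for tag in original}
    for query_tag in original:
        for tag, sim in tag_sim.get(query_tag, {}).items():
            if tag not in original and sim > threshold:
                weights[tag] = max(weights.get(tag, 0.0), sim)
    return weights

def social_rank(bookmarks, expanded, user_sim):
    """Step 2: score papers by matched tag weights, boosted for similar users.

    `user_sim` maps a user to their similarity with the querying user.
    """
    scores = {}
    for user, paper, tag in set(bookmarks):
        if tag in expanded:
            boost = 1.0 + user_sim.get(user, 0.0)
            scores[paper] = scores.get(paper, 0.0) + expanded[tag] * boost
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

With these weights, a paper matched only through an expanded tag scores lower than one matched through an original query tag by an equally similar user, and bookmarks from similar users count for more, which mirrors the two ranking requirements on the slide.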

Evaluation
• Dataset
  – Considered only those tags
    – Which are used on at least 15 different papers
    – By at least 20 different users
  – Users: 12,000
  – Papers: 83,000
  – Tags: 16,000
• Long tails

Simulation Setup

Results (without query expansion)

Results (2)

Results (3)

Discussion
• The paper assumes that users have fixed interests
  – Works for CiteULike because many people have a limited set of research directions
  – May not work well enough for Delicious because people tend to bookmark different types of pages
• Tags in CiteULike may be comparatively well organized because technical users add tags to technical papers
  – Maximum tags per paper on CiteULike: 10
  – May not work well enough for Delicious, where some bookmarks have 46 tags