Information Retrieval Effectiveness of Folksonomies on the World Wide Web
P. Jason Morrison
Information retrieval (IR) on the Web
Traditionally, there have been two options:
1. Search engines – documents are added to the collection automatically; full-text searching is performed by a ranking algorithm.
2. Subject directories – documents are collected and organized into a hierarchy or taxonomy by experts.
Many sites now use a third system:
3. Folksonomies – documents are collected and tagged with keywords by all users, and brought together into a loose organizational system.
Folksonomies
- Very little empirical study has been done on folksonomies.
- They are used by social bookmarking sites like Del.icio.us, photography sites like Flickr, and video sites like YouTube.
- Even large, established retailers like Amazon are starting to experiment with tagging.
Research questions
1. Do web sites that employ folksonomies return relevant results to users performing information retrieval tasks, specifically searching?
2. Do folksonomies perform as well as subject directories and search engines?
Hypotheses
1. Despite different index sizes and categorization strategies, the top results from search engines, directories, and folksonomies will show some overlap. Items that show up in the results of more than one system will be more likely to be judged relevant.
2. There will be a significant difference between the IR effectiveness of search engines, expert-maintained directories, and folksonomies.
3. Folksonomies will perform as well as or better than search engines and directories for information needs that fall into entertainment or current-event categories. They will perform less well for factual or specific-document searches.
Gordon and Pathak’s (1999) seven features
1. Searches should use real information needs.
2. Studies should try to capture the information need, not just the query used, if possible.
3. A large enough number of searches must be done to allow a meaningful evaluation.
4. Most major search engines should be included.
5. The special features of each engine should be utilized.
6. Relevance should be judged by the person with the information need.
Gordon and Pathak’s seven features, cont.
7. Experiments need to be conducted so they provide meaningful measures:
- good experimental design, such as returning results in a random order;
- use of accepted IR measurements like recall and precision;
- use of appropriate statistical tests.
Hawking et al.’s (2001) additional feature
8. Search topics should include different types of information needs.
Four different types, based on the desired results:
1. A short factual statement that directly answers a question;
2. A specific document or web site that the user knows or suspects exists;
3. A selection of documents that pertain to an area of interest; or
4. An exhaustive list of every document that meets the need.
Comparison with previous studies (1 of 2)

                              | Leighton & Srivastava (1997)          | Gordon & Pathak (1999) | Hawking et al. (2001) | Can et al. (2003)                        | The present study
Information needs provided by | Library reference desk, other studies | Faculty members        | Queries from web logs | Computer science students and professors | Graduate students
Queries created by            | The researchers                       | Skilled searchers      | Queries from web logs | Same                                     |
Relevance judged by           | The researchers (by consensus)        | Same faculty members   | Research assistants   | Same                                     | Participants
Participants                  |                                       | 233 faculty members    |                       |                                          |
Total queries                 |                                       |                        |                       |                                          |
Comparison with previous studies (2 of 2)

                                    | Leighton & Srivastava (1997)   | Gordon & Pathak (1999)                                    | Hawking et al. (2001)     | Can et al. (2003)             | The present study
Engines tested                      |                                |                                                           |                           |                               |
Results evaluated per engine        | 20                             |                                                           |                           |                               |
Total results evaluated / evaluator |                                | 320                                                       |                           |                               | About 160
Relevancy scale                     | 4 categories                   | 4-point scale                                             | Binary                    |                               |
Precision measures                  | P(20), weighted groups by rank | P(1-5), P(1-10), P(5-10), P(15-20)                        | P(1), P(1-5), P(5), P(20) | P(10), P(20)                  | P(20), P(1-5)
Recall measures                     | none                           | Relative recall: R(15-20), R(15-25), R(40-60), R( ), R( ) | none                      | Relative recall: R(10), R(20) | Relative recall: R(20), R(1-5)
IR systems studied
- Two directories: Open Directory and Yahoo.
- Three search engines: Alta Vista, Live (Microsoft), and Google.
- Three social bookmarking systems representing folksonomies: Del.icio.us, Furl, and Reddit.
General results
- 34 users, 103 queries, and 9,266 total results returned.
- The queries generated by participants were generally similar to those in previous studies in terms of word count and use of operators.
- Previous studies of search engine logs have shown that users rarely try multiple searches and rarely look past the first set of results. This fits the current study.
- For many queries, some IR systems did not return the full 20 results; in fact, there were many queries for which some IR systems returned 0 results.
Hypothesis 1: Overlap in results

Number of engines returning the URL | Number of unique results | Relevancy rate | SD
Total                               |                          |                |
IR system type combination

Engine types returning the same URL (Directory / Folksonomy / Search engine) | N | Mean
yes / no  / no   |   |
no  / yes / no   |   |
no  / no  / yes  |   |
yes / yes / no   |   |
yes / no  / yes  |   |
no  / yes / yes  |   |
yes / yes / yes  |   |
Total            |   |
Overlap of results: findings
- Almost 90% of results were returned by just one engine – this fits well with previous studies.
- Results found by both search engines and folksonomies were significantly more likely to be relevant.
- The directory/search engine group had a higher relevancy rate than the folksonomy/search engine group, but the difference was not significant.
- Allowing tagging or meta-searching a folksonomy could improve search engine performance.
- Hypothesis 1 is supported.
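The overlap analysis above comes down to counting, for each unique URL, how many IR systems returned it. A minimal sketch in Python; the system names and result lists below are illustrative placeholders, not data from the study:

```python
from collections import defaultdict

def overlap_counts(results_by_system):
    """Map each unique URL to the number of IR systems that returned it."""
    counts = defaultdict(int)
    for urls in results_by_system.values():
        # set() so a URL appearing twice in one system's list counts once
        for url in set(urls):
            counts[url] += 1
    return dict(counts)

# Illustrative example with one system of each type
results_by_system = {
    "search_engine": ["a.com", "b.com", "c.com"],
    "directory":     ["b.com", "d.com"],
    "folksonomy":    ["b.com", "c.com"],
}
counts = overlap_counts(results_by_system)
# b.com is returned by all three system types; a.com and d.com by one each
```

Grouping relevancy judgments by these counts is what supports the finding that URLs returned by more than one system are more likely to be relevant.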
Hypothesis 2: Performance differences
Performance measures:
- Precision
- Relative recall
- Retrieval rate (also calculated)
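The three measures can be sketched as follows. This assumes the conventional definitions (relative recall pools the relevant documents found by all compared systems, and retrieval rate is the share of the requested document cut-off value actually returned); treat it as an illustration rather than the study's exact formulas:

```python
def precision_at_k(results, relevant, k=20):
    """Fraction of the top-k returned results judged relevant."""
    top = results[:k]
    if not top:
        return 0.0
    return sum(1 for url in top if url in relevant) / len(top)

def relative_recall(results, relevant, pooled_relevant, k=20):
    """Relevant results this system returned, over all relevant results
    found by any system in the comparison pool (not true recall)."""
    if not pooled_relevant:
        return 0.0
    hits = {url for url in results[:k] if url in relevant}
    return len(hits) / len(pooled_relevant)

def retrieval_rate(results, dcv=20):
    """Share of the requested document cut-off value (dcv) actually returned."""
    return min(len(results), dcv) / dcv
```

Retrieval rate matters here because, as noted above, many systems returned fewer than 20 results (or none) for some queries.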
Performance (dcv 20)

IR system       | Precision (mean) | Recall (mean) | Retrieval rate (mean) | N
Open Directory  |                  |               |                       |
Yahoo Directory |                  |               |                       |
Del.icio.us     |                  |               |                       |
Furl            |                  |               |                       |
Reddit          |                  |               |                       |
Google          |                  |               |                       |
Live            |                  |               |                       |
Alta Vista      |                  |               |                       |
Total           |                  |               |                       |
Precision at positions 1-20
Recall at positions 1-20
Average performance at dcv 1-5

IR system type | Avg precision | Avg recall | Avg retrieval rate | N
Directory      |               |            |                    |
Folksonomy     |               |            |                    |
Search engine  |               |            |                    |
Performance differences: findings
- There are statistically significant differences among individual IR systems and among IR system types.
- Search engines had the best performance by all measures.
- In general, directories had better precision than folksonomies, but the difference was not usually statistically significant.
- Del.icio.us performed as well as or better than the directories.
- Hypothesis 2 is supported.
Hypothesis 3: Performance for different needs
Do folksonomies perform better than the other IR systems for some information needs, and worse for others?
Comparing information need categories

Info need category          | IR system type | Avg precision | Avg recall | Avg retrieval rate | N
Short factual answer        | Directory      |               |            |                    | 1228
Short factual answer        | Folksonomy     |               |            |                    | 2842
Short factual answer        | Search engine  |               |            |                    | 4042
Specific item               | Directory      |               |            |                    |
Specific item               | Folksonomy     |               |            |                    |
Specific item               | Search engine  |               |            |                    |
Selection of relevant items | Directory      |               |            |                    |
Selection of relevant items | Folksonomy     |               |            |                    |
Selection of relevant items | Search engine  |               |            |                    |
News and entertainment searches

Information need | IR system type | Avg precision | Avg recall | Retrieval rate | N
News             | Directory      |               |            |                | 44042
News             | Folksonomy     |               |            |                |
News             | Search engine  |               |            |                |
Entertainment    | Directory      |               |            |                | 61618
Entertainment    | Folksonomy     |               |            |                |
Entertainment    | Search engine  |               |            |                | 252427
Factual and exact site searches

Information need | IR system type | Avg precision | Avg recall | Retrieval rate | N
Factual          | Directory      |               |            |                | 1228
Factual          | Folksonomy     |               |            |                | 2842
Factual          | Search engine  |               |            |                | 4042
Exact site       | Directory      |               |            |                |
Exact site       | Folksonomy     |               |            |                |
Exact site       | Search engine  |               |            |                | 615763
Performance for different info needs: findings
- Significant differences were found among folksonomies, search engines, and directories for the three info need categories.
- When comparing within info need categories, the search engines had significantly better precision. Recall scores were similar, but the differences were not significant.
- Folksonomies did not perform significantly better for news and entertainment searches, but they did perform significantly worse than search engines for factual and exact site searches.
- Hypothesis 3 is only partly supported.
What other factors impacted performance?
- For the study as a whole, the use of query operators correlated negatively with recall and retrieval rate. Non-boolean operators correlated negatively with precision scores.
- When looking at just folksonomy searches, query operator use led to even lower recall and retrieval scores.
- Some specific cases were not handled by the folksonomies. A search for movie showtimes in a certain zip code (“showtimes borat”) returned zero results on all folksonomies.
- Queries limited by geography and queries on obscure topics can perform poorly in folksonomies because users might not have added or tagged relevant items yet.
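Since operators hurt folksonomy searches, one small fix of the kind the study hints at would be stripping web-search syntax before querying a tag index. The pre-processor below is a hypothetical sketch, not something any of the studied sites implement:

```python
import re

# Boolean operators that a plain tag index would otherwise treat as literal tags
BOOLEAN_OPERATORS = {"and", "or", "not"}

def normalize_for_folksonomy(query: str) -> str:
    """Strip quoting, grouping, +/- prefixes, and boolean operators,
    leaving plain keywords suitable for tag matching."""
    cleaned = re.sub(r'["()+\-]', " ", query.lower())
    terms = [t for t in cleaned.split() if t not in BOOLEAN_OPERATORS]
    return " ".join(terms)
```

For example, a habitual web-search query like `"showtimes" AND borat` would be reduced to the bare tags `showtimes borat` before matching.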
User factors
- For the most part, user experience did not correlate significantly with performance measures.
- Expert users were more likely to have lower precision scores. The same correlation was found when correcting for query factors.
- Experienced users are probably less likely to deem something relevant.
Recommendations
- Further research is needed.
- Additional folksonomies should be studied as well.
- It might be useful to collect additional types of data, such as whether or not participants clicked through to look at sites before judging.
- Additional analysis of ranking would be interesting.
- Any similar study must also deal with difficult technical issues like server and browser timeouts.
Conclusions
- The overlap between folksonomy results and search engine results could be used to improve Web IR performance.
- The search engines, with their much larger collections, performed better than directories and folksonomies in almost every case.
- Folksonomies may be better than directories for some needs, but more data is required.
- Folksonomies are particularly bad at finding a factual answer or one specific site.
Conclusions (cont.)
Although search engines had better performance across the board, folksonomies are promising because:
1. They are relatively new and may improve with time and additional users;
2. Search results could be improved with relatively small changes to the way query operators and search terms are used; and
3. There are many variations in organization still to be tried.
Future research
- Look at the difference between systems that primarily use tagging (Del.icio.us, Furl) and those that use ranking (Reddit, Digg). Which variations are more successful?
- Tags, titles, categories, descriptions, comments, and even full text are collected by various folksonomies. Where should weight be placed?
- Should a document that matches the query closely rank higher than one with many votes, or vice versa?
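The match-versus-votes question can be made concrete with a blended ranking score. Everything here is an assumption for illustration: the `alpha` weight, the log damping, and the 1000-vote normalizer are not parameters from the study:

```python
import math

def blended_score(match_score: float, votes: int, alpha: float = 0.7) -> float:
    """Blend a text-match score in [0, 1] with a log-damped vote count.

    alpha=1.0 ranks purely by query match; alpha=0.0 purely by popularity.
    """
    # Log damping keeps runaway vote counts from drowning out text relevance;
    # normalizing by log1p(1000) caps popularity at 1.0 around 1000 votes.
    popularity = min(math.log1p(votes) / math.log1p(1000), 1.0)
    return alpha * match_score + (1 - alpha) * popularity
```

Tuning `alpha` is exactly the design choice posed above: a tag-centric system like Del.icio.us implicitly sits near `alpha = 1.0`, while a vote-driven system like Reddit sits closer to `alpha = 0.0`.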
Future research (cont.)
- Artificial situations could be set up to study absolute recall and searches for an exhaustive list of items.
- Similar studies on IR systems covering smaller domains, like video, should be done. Blog search systems in particular would be interesting.
- What about other IR behaviors, such as browsing?
- There are many other fascinating topics, such as the social networks in some folksonomies and what motivates users to tag items.