
1 Center for E-Business Technology, Seoul National University, Seoul, Korea
Socially Filtered Web Search: An approach using social bookmarking tags to personalize web search
Kay-Uwe Schmidt*, Tobias Sarnow*, Ljiljana Stojanovic**
*SAP Research, Vincenz-Prießnitz-Straße 1, 76131 Karlsruhe, Germany
**Forschungszentrum Informatik, Haid-und-Neu-Straße 10-14, 76131 Karlsruhe, Germany
Symposium on Applied Computing (2009)
2009. 08. 13.
Summarized & presented by Babar Tareen, IDS Lab., Seoul National University

2 Copyright  2008 by CEBT Introduction  Search engines do not consider current work context  Static results for all users  Server side personalization has limited use  Client side search engines rely on additional terms extracted from documents, thus not scalable  Social Bookmarking based search result personalization addresses these issues 2

3 Copyright  2008 by CEBT Related Work  Google History  goZone.com  Mahalo.com  UCAIR 3

4 Copyright  2008 by CEBT Motivation 4  A developer is looking for guide lines for testing DB code  Visits www.ibm.com/db2 www.hsqldb.org  Googles “Test”  Original Results Web based certification Personality test Bandwidth test  Personalized Results DB2 training DB2 programming test

5 Copyright  2008 by CEBT Personalizing Search Results  Tracking browsing behavior  Create user model Url’s Tags fetched from Delicious  Issue original query  Enhance search query by adding tags  Issue new query  Display both results Tags given by a community of users provide a good summary of web page content 5 UrlTags (Metadata) www.youtube.comvideo, youtube, entertainment, web2.0 www.amazon.comshopping, books, amazon, music www.snu.ac.kruniversity, snu, korea, 서울대 www.hsqldb.orgdatabase, java, sql, opensource www.ibm.com/db2ibm, db2, database, unix

6 Copyright  2008 by CEBT Architecture [1] 6  Search Module Carries out original query Inserts space ( ) for personalized results  Metric Module Includes a metric that delivers a tag for personalized search  Search Enhancer Module Combines search string with metric module tags  Metadata Module Extracts metadata for a visited website from delicious

7 Copyright  2008 by CEBT Architecture [2]  Built as add-on on top of Firefox Internet Explorer 7

8 Copyright  2008 by CEBT Metric [1]  Two datasets Collection of visited websites Tags for each website  Query last 20 disjunct websites from user model Format (url, count) Sorted by weight ‘γ’ 8

9 Copyright  2008 by CEBT Metric [2]  Tags assigned to website Format (tag, no of users) t → tags assigned to a website T → tags for all websites 9

10 Copyright  2008 by CEBT Algorithm 10

11 Copyright  2008 by CEBT Result 11

12 Copyright  2008 by CEBT Evaluation How effective can this be ? 12

13 Center for E-Business Technology, Seoul National University, Seoul, Korea
Can Social Bookmarking Improve Web Search?
Paul Heymann, Georgia Koutrika, Hector Garcia-Molina
Dept. of Computer Science, Stanford University, USA
Web Search and Data Mining 2008

14 Copyright  2008 by CEBT Positive Factors [1]  URLs Pages posted on delicious are often recently modified – Delicious users post interesting pages that are actively updated or have been recently created Approximately 25% of URLs posted by users are new, unindexed pages – Delicious can server as a small data source for new web pages and to help crawl ordering Roughly 9% of results for search queries are URLs present in delicious – Delicious URLs are disproportionately common in search results compared to their coverage While some users are more prolific than others, the top 10% of users only account for 56% of the posts – Delicious is not highly reliant on a relatively small group of users 14

15 Copyright  2008 by CEBT Positive Factors [2]  URLs 30-40% of URLs and approximately one in eight domains posted were not previously in delicious. – Delicious has relatively little redundancy in page information  Tags Popular query terms and tags overlap significantly – Delicious may be able to help with queries where tags overlap with query terms In this study, most tags were deemed relevant and objective by users – Tags are on the whole accurate 15

16 Copyright  2008 by CEBT Negative Factors  URLs Approximately 120,000 URLs are posted to delicious each day – The number of posts per day is relatively small; for instance, it represents 1/10 of the number of blog posts per day There are roughly 115 million public posts, coinciding with about 30-50 million unique URLs – The number of total posts is relatively small for instance, this is a small portion of the web as whole (perhaps 1/1000)  Tags Tags are present in the pagetext of 50% of the pages they annotate – A substantial proportion of tags are obvious in context, and many tagged pages would be discovered by a search engine Domains are often highly correlated with particular tags and vice versa – It may be more efficient to train librarians to label domains than to ask users to tag pages 16

17 Copyright  2008 by CEBT Discussion  Query expansion model based on Social tagging  What is the probability of finding tags for random URL in delicious.com?  Generalization vs. Specialization 17

