Personalised Search on the World Wide Web Presented by: Team Grape.

Presentation transcript:

Personalised Search on the World Wide Web Presented by: Team Grape

About Team Grape
Members: Jin Wu, Kewei Duan, Linh Duy To, Miaolai Han, Takazumi Matsumoto
The paper: Personalized Search on the World Wide Web, by Alessandro Micarelli, Fabio Gasparetti, Filippo Sciarrone and Susan Gauch
The interactive stuff: MOT lesson; Grapple lessons: Text only, Depth first

Overview
1. Introduction
2. A Short Overview on Personalised Search
3. Contextualised Search
4. Personalisation Based on Search Histories
5. Personalisation Based on Rich Representations of User Needs
6. Collaborative Search Engines
7. Adaptive Result Clustering
8. Hyperlink-Based Personalisation
9. Combined Approaches to Personalisation
10. Conclusions

Introduction
Personalisation: adapting the results according to each user's information needs (Micarelli et al., 2007, p. 195)
Searching the WWW
Dealing with the information overload
Limitations of traditional search engines
Information access paradigms:
  – Searching by surfing (hyperlink directories)
  – Searching by query (Information Retrieval)
  – Recommendation (suggested items)

Content and Collaborative-based Personalisation
Originally: information retrieval
Content-based:
  – Considers individuals; mostly used
  – Polysemy and synonymy lead to the vocabulary problem, and so to irrelevant information
Collaborative-based:
  – Considers models of different users
  – User similarity implies similar information needs
  – Social navigation
  – Not employed in search engines

User Modelling in Personalised Systems
User modelling/profiling techniques:
  – Track visited pages and search history: important features are learned, giving more relevant information
  – Simplest cases: registration form or questionnaire
  – More complex cases: the user model consists of a dynamic information structure
Examples:
  – Google Alert: explicit approach based on routing queries; limited
  – Google Personalized Search: delivers customised search based on the user profile
User modelling components affect search in 3 distinct phases:
  – Part of the retrieval process
  – Re-ranking
  – Query modification
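As a rough illustration of the re-ranking phase listed on this slide, the sketch below re-orders results by blending the engine's own score with the cosine similarity between a term-weight user profile and each result snippet. The function names, the blending weight `alpha`, and the sample data are all hypothetical; the paper does not prescribe this formula.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rerank(results, profile, alpha=0.5):
    """results: list of (url, engine_score, snippet). Returns URLs, best first.

    alpha is an assumed blending weight: 1.0 keeps the engine order,
    0.0 ranks purely by similarity to the user profile.
    """
    scored = []
    for url, engine_score, snippet in results:
        terms = Counter(snippet.lower().split())
        personal = cosine(profile, terms)
        scored.append((alpha * engine_score + (1 - alpha) * personal, url))
    return [url for _, url in sorted(scored, reverse=True)]

# Hypothetical profile of a user interested in travel.
profile = {"travel": 1.0, "flight": 0.8}
results = [
    ("cards.example", 0.9, "visa credit card offers"),
    ("embassy.example", 0.8, "visa requirements for travel and flight booking"),
]
ranked = rerank(results, profile)
```

With this profile the travel-related page moves above the credit-card page even though the engine scored it lower.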

Source of Personalisation
Data mining & machine learning
Relevance feedback & query expansion
  – Explicit relevance feedback
  – Implicit relevance feedback
Further sources: desktop search systems
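The slide does not name a specific feedback algorithm. One classic way to combine relevance feedback with query expansion is the Rocchio formula, sketched below under that assumption. The relevant and non-relevant documents could come from explicit ratings or from implicit signals such as clicks; the parameter defaults and sample vectors are illustrative only.

```python
from collections import defaultdict

def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return a modified query vector (dict term -> weight)."""
    new_q = defaultdict(float)
    for t, w in query.items():
        new_q[t] += alpha * w
    for doc in relevant:
        for t, w in doc.items():
            new_q[t] += beta * w / len(relevant)
    for doc in nonrelevant:
        for t, w in doc.items():
            new_q[t] -= gamma * w / len(nonrelevant)
    # Negative weights are conventionally clipped, so those terms are dropped.
    return {t: w for t, w in new_q.items() if w > 0}

def expand(query, relevant, nonrelevant, k=2):
    """Add the k strongest new terms to the original query terms."""
    weights = rocchio(query, relevant, nonrelevant)
    new_terms = sorted((t for t in weights if t not in query),
                       key=lambda t: weights[t], reverse=True)[:k]
    return list(query) + new_terms

# Hypothetical feedback for the ambiguous query "jaguar".
q = {"jaguar": 1.0}
rel = [{"jaguar": 1.0, "cat": 1.0, "habitat": 0.5}]
non = [{"jaguar": 1.0, "car": 1.0}]
expanded = expand(q, rel, non, k=1)
```

Here the expanded query gains "cat" from the relevant document, while "car" is suppressed by the non-relevant one.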

An Overview on Personalisation Approaches
Current context: based on implicit feedback, using client-based software
Search history:
  – Limited to web search history
  – Done during the retrieval process, giving a fast response
Rich user models: explicit feedback is used to build a rich representation of user needs
Collaborative approach: relevant resources are chosen based on previous ratings by users with similar tastes & preferences
Result clustering: results are grouped into clusters, each related to the same topic
Hypertextual data: additional factors are included in the ranking algorithm

Contextual Search
A new approach for search
The information system proactively suggests information based on a person's working context
Just-in-Time IR (JITIR) (Rhodes)

JITIR
Monitors the user's actions
Non-intrusive
Automatically identifies relevant information
Retrieves resources automatically

Based on Agents
Remembrance Agent
Margin Notes Agent
Jimminy Agent

Personalisation Based on Search Histories
Example terms shown on the slide: Visa, Citizenship, Travel, Credit Card, Flight

Online Approaches
Capture history information as soon as it is available, updating user models and providing personalised results that take the user's last interactions into consideration
Two different types of information are collected:
  – submitted queries
  – snippets
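A minimal sketch of an online approach, assuming a simple term-count profile: each interaction immediately updates the profile from the submitted query and the clicked snippets, and older evidence is decayed so that the last interactions dominate. The class name, the `decay` factor, and the whitespace tokenisation are hypothetical choices, not taken from the paper.

```python
from collections import Counter

class SessionProfile:
    """Hypothetical online user model built from queries and snippets."""

    def __init__(self, decay=0.8):
        self.decay = decay          # assumed ageing factor per interaction
        self.weights = Counter()    # term -> weight

    def update(self, query, clicked_snippets):
        # Age older evidence so the latest interactions dominate.
        for term in self.weights:
            self.weights[term] *= self.decay
        # Count terms from the submitted query and each clicked snippet.
        for text in [query, *clicked_snippets]:
            self.weights.update(text.lower().split())

    def top_terms(self, n=3):
        """Return the n highest-weighted terms, strongest first."""
        return [term for term, _ in self.weights.most_common(n)]
```

For example, after a "visa" query followed by a "flight" query with travel-related snippets, the profile's top terms shift towards the travel sense rather than the credit-card sense of "visa".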

Offline Approaches
Exploit history information in a distinct pre-processing step, usually analysing relationships between queries and documents visited by users
CubeSVD: algorithm based on click-through data
Time-consuming

Personalisation Based on Rich Representations of User Needs
Three prototypes: ifWeb, Wifs, InfoWeb
Based on complex representations of user needs (user models)
Built using explicit user feedback on results
Based on frames and semantic networks (AI)

ifWeb
User model-based intelligent agent
Weighted semantic network for the user profile
Autonomous focused crawling to find related documents based on previously identified documents
Updates the user profile using user feedback
Reduces the weight of unused concepts (rent)
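The slide does not give ifWeb's update rule, so the following is only a simplified sketch of the idea: feedback on a document changes the weights of its concepts, while concepts that were not used pay a fixed "rent" and eventually disappear. The function name, the flat dict in place of a semantic network, and the `rent` value are assumptions.

```python
def apply_feedback(profile, doc_concepts, rating, rent=0.1):
    """profile: dict concept -> weight. rating: +1 (liked) or -1 (disliked).

    Returns a new profile; the input dict is left unchanged.
    """
    updated = dict(profile)
    # Concepts in the rated document move with the user's feedback.
    for concept in doc_concepts:
        updated[concept] = updated.get(concept, 0.0) + rating
    # Concepts not touched by this feedback pay "rent" and lose weight.
    for concept in profile:
        if concept not in doc_concepts:
            updated[concept] -= rent
    # Drop concepts whose weight has fallen to zero or below.
    return {c: w for c, w in updated.items() if w > 0}
```

A concept the user keeps ignoring therefore fades out of the profile without any explicit negative feedback.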

Wifs
Content-based approach
Filters HTML and text documents from AltaVista, reordering links based on the user model (UM)
Frame-based user model structure
A frame has slots which contain terms (topics), associated with other terms (co-keywords), forming a semantic network
The terms are stored in a Terms DataBase that is created beforehand (by experts)
Instead of traditional IR, the relevance of a document is calculated from the occurrence and relevance of terms in the document
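To make the last point concrete, here is a loose sketch of scoring a document from the occurrence and relevance of user-model terms. It is not the actual Wifs formula, which the slide does not give; the data layout (topic weights with attached co-keyword weights) and the rule that co-keywords only count when their topic occurs are assumptions for illustration.

```python
def wifs_like_score(doc_terms, user_model):
    """Score a document against a frame-like user model.

    doc_terms: dict term -> number of occurrences in the document.
    user_model: dict topic -> (weight, {co_keyword: weight}).
    """
    score = 0.0
    for topic, (weight, co_keywords) in user_model.items():
        occurrences = doc_terms.get(topic, 0)
        if occurrences == 0:
            continue  # assumed: co-keywords only matter alongside their topic
        score += weight * occurrences
        for co, co_weight in co_keywords.items():
            score += co_weight * doc_terms.get(co, 0)
    return score
```

Links returned by the search engine could then be reordered by this score, highest first.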

Wifs (figure)
Representations of the user model (a) and document model (b) (from Micarelli et al., 2007)

InfoWeb
Content-based approach
Adaptive retrieval of documents in digital libraries, based on the Vector Space model (IR)
Stereotype knowledge base:
  – Contains the most significant documents for a specific category of user (domain), created beforehand (by an expert)
k-means clustering on the document collection beforehand
Each cluster is seeded by a representative document for each class of user
User model starts as a stereotype and evolves based on feedback
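The seeded clustering step can be sketched as follows, assuming documents are already represented as equal-length term vectors and that each seed is the representative document an expert picked for one class of user. This is a generic seeded k-means with cosine similarity, not InfoWeb's actual implementation; names and the iteration count are illustrative.

```python
import math

def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def seeded_kmeans(docs, seeds, iterations=10):
    """Cluster docs around expert-chosen seeds; returns a cluster index per doc."""
    centroids = [list(s) for s in seeds]
    assignment = []
    for _ in range(iterations):
        # Assign each document to the most similar centroid.
        assignment = [
            max(range(len(centroids)), key=lambda k: cosine(d, centroids[k]))
            for d in docs
        ]
        # Recompute each centroid as the mean of its members.
        for k in range(len(centroids)):
            members = [d for d, a in zip(docs, assignment) if a == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return assignment
```

Seeding from representative documents ties each resulting cluster to a known user class, which a random initialisation would not guarantee.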

Collaborative Search Engine
SearchParty module:
  – Social filtering
  – Stores user queries and the results users clicked
Knowledge Sea:
  – Social adaptive navigation system
  – Exploits both traditional IR and social navigation approaches
  – Results represented by colour lightness

Collaborative Search Engine
Calculate similarity measures among user needs:
  – Identified by queries and selected resources
  – Two queries might contain no common terms but return similar results
  – E.g. PDA and handheld computer
Statistical model:
  – Based on the probability that a page was selected for a given query
  – Focuses on relative frequency instead of content-analysis techniques
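A small sketch of both ideas on this slide, assuming a click log of (query, page) pairs: the probability that a page is selected for a query is estimated by relative frequency, and two queries are compared by the overlap of the pages selected for them rather than by their terms. The Jaccard measure and the example log are assumptions, not the specific measure used by any system in the paper.

```python
from collections import Counter, defaultdict

def selection_probabilities(click_log):
    """click_log: iterable of (query, page). Returns query -> {page: P(page | query)}."""
    counts = defaultdict(Counter)
    for query, page in click_log:
        counts[query][page] += 1
    return {q: {p: n / sum(pages.values()) for p, n in pages.items()}
            for q, pages in counts.items()}

def query_similarity(probs, q1, q2):
    """Jaccard overlap of the pages selected for two queries (no term matching)."""
    a, b = set(probs.get(q1, {})), set(probs.get(q2, {}))
    return len(a & b) / len(a | b) if a | b else 0.0

# Hypothetical click log: the two queries share no terms but lead to the same pages.
log = [
    ("pda", "palm.example"), ("pda", "palm.example"), ("pda", "shop.example"),
    ("handheld computer", "palm.example"), ("handheld computer", "shop.example"),
]
probs = selection_probabilities(log)
```

In this log "pda" and "handheld computer" come out as fully similar, although a term-based comparison would find nothing in common.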

Collaborative Search Engine
Compass Filter:
  – Based on web communities
  – Pre-processes the web structure
  – If a user frequently visits a community, the results in the same community are boosted
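The boosting rule can be sketched like this, assuming the pre-processing step has already produced a page-to-community mapping and that visit counts per community are tracked for the user. The `threshold` and `boost` values and all names are hypothetical.

```python
def boost_by_community(results, page_community, visits, threshold=3, boost=1.5):
    """Re-rank results, favouring communities the user visits frequently.

    results: list of (url, score) from the search engine.
    page_community: dict url -> community id (assumed to be precomputed).
    visits: dict community id -> number of visits by this user.
    """
    frequent = {c for c, n in visits.items() if n >= threshold}
    rescored = [
        (score * boost if page_community.get(url) in frequent else score, url)
        for url, score in results
    ]
    return [url for _, url in sorted(rescored, reverse=True)]
```

For instance, a user who often visits a "python" community would see pages from that community rise above slightly higher-scored pages from elsewhere.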

Adaptive Result Clustering
Traditional search engines:
  – Rank the list by similarity of query and page
  – Might take a long time
  – Important that users clearly describe what they are looking for
Organise the results:
  – By grouping pages into folders and sub-folders
  – On a graphical interactive map

Adaptive Result Clustering
Clustering:
  – Query process needs to be fast
  – Usually performed after retrieval of query results
  – Does not require pre-defined categories
  – Provides concise and accurate descriptions
Further clustering systems:
  – SnakeT
  – Scatter/Gather

Hyperlink-Based Personalisation
Main algorithms:
  – PageRank: PR value
  – HubFinder: hub value
  – HubRank: PR value & hub value
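One common way to personalise a PageRank-style score is to bias the random jump towards pages the user favours. The sketch below shows that generic technique with plain power iteration; it is not HubFinder or HubRank, whose details the slide does not give, and the function name, parameters, and graph format are assumptions.

```python
def personalised_pagerank(links, preferred, damping=0.85, iterations=50):
    """PageRank with the random jump restricted to the user's preferred pages.

    links: dict page -> list of pages it links to.
    preferred: non-empty set of pages that all appear in the graph.
    """
    pages = sorted(set(links) | {q for out in links.values() for q in out})
    teleport = {p: (1.0 / len(preferred) if p in preferred else 0.0) for p in pages}
    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {p: (1 - damping) * teleport[p] for p in pages}
        for p in pages:
            out = links.get(p, [])
            if out:
                # Split this page's rank evenly across its outlinks.
                share = damping * rank[p] / len(out)
                for q in out:
                    new_rank[q] += share
            else:
                # Dangling page: return its rank mass to the preferred pages.
                for q in pages:
                    new_rank[q] += damping * rank[p] * teleport[q]
        rank = new_rank
    return rank
```

Pages close to the preferred set in the link graph end up with higher scores, which can then be used as an additional factor in the ranking.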

Combined Approaches to Personalisation
Perform personalisation using multiple adaptive approaches
Outride: browsing history & current context
infoFACTORY: integrates web tools & services

Outride
Outride includes:
  – Contextualisation: interrelated conditions that occur within an activity
  – Individualisation: characteristics that distinguish an individual

infoFACTORY
A large set of integrated web tools and services that are able to evaluate and classify documents retrieved following a user profile
Labels shown on the slide: New, Has potential, Interesting

Conclusions
Information is crucial to users
Need to filter and personalise resources to deal with information overload successfully
Increases search engine accuracy and reduces time wasted sorting through irrelevant results
Can be extended, e.g. targeted advertising
Some systems already in use, others under development (e.g. Semantic Web)
Future directions:
  – Predicting future user behaviour (plan-recognition)
  – Language semantic analysis (Natural Language Processing)

Thanks for listening. Any questions?