Web Information retrieval (Web IR)


1 Web Information retrieval (Web IR)
Handout #13: Ranking based on User Behavior
Ali Mohammad Zareh Bidoki, ECE Department, Yazd University
Autumn 2011

2 Finding Ranking Function
R = f(query, user behavior, web graph and content features)
How can we use user behavior? It can be captured explicitly or implicitly, and about 80% of user clicks are related to the query. Click-through data from search-engine logs is the best signal for incorporating user behavior, because in the end user satisfaction is what matters.

3 Click-through data (by Joachims)
A click-through record is a triple (q, r, c): q is the query, r is the ranked list returned for q, and c is the set of documents the user clicked.
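The triple above can be sketched as a small record type; the class and field names are illustrative, not from the slides:

```python
from dataclasses import dataclass, field

@dataclass
class ClickThroughRecord:
    query: str                 # q: the submitted query
    ranking: list              # r: the ranked list of doc IDs, as presented
    clicks: set = field(default_factory=set)  # c: doc IDs the user clicked

rec = ClickThroughRecord("web ir", ["d1", "d2", "d3"], {"d1", "d3"})
```

Keeping the full ranked list r alongside the clicked set c is what later lets click positions be interpreted as relative relevance judgments.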

4 Benefits of Using Click through data
Democracy on the Web: clicks fill the gap between user needs and results. User clicks are more valuable than a page's content, because search-engine precision is evaluated by users, not by page creators. Adding click metadata to a document increases the measured degree of relevancy between query and document.

5 Web Entities
[Figure: web entities as linked layers — docs 1..n connected by the web graph, words 1..w, queries 1..q, and users 1..m, with edges between the layers.]

6 Document Expansion Using Click TD
Google was the first to use anchor text as part of a document's content. Anchor text is a view of a document from another document; clicked queries can expand a document's representation in the same way.

7 Long term incremental learning
Di is the vector of a document in the ith iteration, Q is the vector of a query for which this document was clicked, and alpha is the learning rate.
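The update rule itself is not reproduced in this transcript; a minimal sketch, assuming the common form D_{i+1} = (1 - alpha) * D_i + alpha * Q:

```python
def update_document_vector(d, q, alpha=0.1):
    """Move a document's term vector toward a clicked query's vector.

    d, q: dicts mapping term -> weight; alpha: learning rate.
    Assumed update rule: d_{i+1} = (1 - alpha) * d_i + alpha * q.
    """
    terms = set(d) | set(q)
    return {t: (1 - alpha) * d.get(t, 0.0) + alpha * q.get(t, 0.0)
            for t in terms}

d = update_document_vector({"web": 1.0}, {"web": 1.0, "retrieval": 1.0},
                           alpha=0.5)
# the clicked query's term "retrieval" enters the document vector
```

Repeating this over the query log is what makes the learning "long term incremental": each click nudges the document representation toward the queries users actually associate with it.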

8 Naïve Method (NM) A bipartite graph for docs and queries
Mij is the number of clicks on document j for query i.
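A minimal sketch of building M from a click log; the log format, (query, clicked_doc) pairs, is an assumption:

```python
from collections import defaultdict

def build_click_matrix(log):
    """Build the bipartite click matrix: M[query][doc] = click count."""
    M = defaultdict(lambda: defaultdict(int))
    for query, doc in log:
        M[query][doc] += 1
    return M

log = [("web ir", "d1"), ("web ir", "d1"), ("web ir", "d2"),
       ("pagerank", "d2")]
M = build_click_matrix(log)
```

A nested dict is the natural representation here because, as slide 11 notes, real click matrices are extremely sparse.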

9 Naïve Method (Cont.)
The weight between query qj and document di is computed from the click counts Mij.
The metadata for document i is the weighted set of queries that led to clicks on it.
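The slide's weight formula is not reproduced in this transcript; a minimal sketch, assuming the weight is q's clicks on d normalized by q's total clicks, with M a query -> {doc: clicks} mapping:

```python
def query_doc_weight(M, q, d):
    """w(q, d): fraction of q's clicks that landed on d (assumed form)."""
    total = sum(M[q].values())
    return M[q][d] / total if total else 0.0

def doc_metadata(M, d):
    """Metadata for d: each query that clicked through to d, with weight."""
    return {q: query_doc_weight(M, q, d) for q in M if d in M[q]}

M = {"web ir": {"d1": 2, "d2": 1}, "pagerank": {"d2": 1}}
meta = doc_metadata(M, "d2")
```

The metadata dict is what gets attached to the document as extra indexable content, analogous to anchor text on slide 6.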

10 Co-Visited Method
If two pages are clicked for the same query, they are called co-visited. The similarity between two docs i and j is computed from visited(di), the number of clicks on di, and visited(di, dj), the number of queries for which both are clicked.
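The similarity formula is not reproduced in this transcript; a sketch assuming sim(di, dj) = visited(di, dj) / (visited(di) + visited(dj)), with M a query -> {doc: clicks} mapping:

```python
def covisited_similarity(M, di, dj):
    """Assumed form: co-visit count over the sum of individual visit counts."""
    visited_i = sum(M[q].get(di, 0) for q in M)   # clicks on di
    visited_j = sum(M[q].get(dj, 0) for q in M)   # clicks on dj
    covisited = sum(1 for q in M if di in M[q] and dj in M[q])
    denom = visited_i + visited_j
    return covisited / denom if denom else 0.0

M = {"web ir": {"d1": 2, "d2": 1}, "pagerank": {"d2": 1}}
sim = covisited_similarity(M, "d1", "d2")
```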

11 Co-Visited Disadvantages
It only considers document similarity, not query similarity. Since users mostly click within the top 10 results, click data are sparse (about 1.5 queries per page), so the similarity estimates are not precise.

12 Iterative Method (IM)
O(q): the set of pages clicked for query q
Oi(q): the ith page clicked for q
I(d): the set of queries for which d was clicked
Ii(d): the ith query for which d was clicked
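The iterative update itself is not reproduced in this transcript; a SimRank-style sketch under the assumption that query–query and doc–doc similarities reinforce each other through O(q) and I(d) (the decay factor C and iteration count are illustrative):

```python
def iterative_similarity(M, iterations=5, C=0.7):
    """M: query -> {doc: clicks}. Assumed propagation: similar queries
    click similar docs, and docs clicked for similar queries are similar."""
    queries = list(M)
    docs = sorted({d for q in M for d in M[q]})
    O = {q: sorted(M[q]) for q in queries}                    # O(q)
    I = {d: [q for q in queries if d in M[q]] for d in docs}  # I(d)

    sim_q = {(a, b): float(a == b) for a in queries for b in queries}
    sim_d = {(a, b): float(a == b) for a in docs for b in docs}
    for _ in range(iterations):
        new_q = {}
        for a in queries:
            for b in queries:
                pairs = [(x, y) for x in O[a] for y in O[b]]
                new_q[(a, b)] = 1.0 if a == b else (
                    C * sum(sim_d[p] for p in pairs) / len(pairs)
                    if pairs else 0.0)
        new_d = {}
        for a in docs:
            for b in docs:
                pairs = [(x, y) for x in I[a] for y in I[b]]
                new_d[(a, b)] = 1.0 if a == b else (
                    C * sum(sim_q[p] for p in pairs) / len(pairs)
                    if pairs else 0.0)
        sim_q, sim_d = new_q, new_d
    return sim_q, sim_d

sim_q, sim_d = iterative_similarity({"q1": {"d1": 1, "d2": 1},
                                     "q2": {"d2": 1}})
```

Because similarities propagate across the bipartite graph, this addresses both co-visited disadvantages from slide 11: query similarity is modeled, and sparse direct evidence is supplemented by indirect paths.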

13 Experimental Results
Experimental results on a real, large click-through log (the MSN query log) show that the proposed algorithm outperforms the baseline search system by 157%, naïve query-log mining by 17%, and the co-visited algorithm by 17% in top-20 precision.

