
Assessing the Scenic Route: Measuring the Value of Search Trails in Web Logs. Ryen W. White1, Jeff Huang2. 1Microsoft Research, 2University of Washington.




1 Assessing the Scenic Route: Measuring the Value of Search Trails in Web Logs
Ryen W. White1, Jeff Huang2. 1Microsoft Research, 2University of Washington. SIGIR, Effectiveness Measures session. Summarized and presented by Hwang Inbeom, IDS Lab., Seoul National University.

2 Introduction
Individual documents in the result list of a keyword search may serve only as starting points for exploration.
Search trails: series of web pages that start with a search query and terminate with an event such as session inactivity.
Example trail: search query → result page 1 → page linked by result page 1 → result page 2 → result page 3 → closing of the browser window.

3 Introduction
If a search engine showed search trails rather than individual pages, how much benefit would users get?
The primary aim of this research is to estimate the benefit that trail following brings to users, and to learn if and when it helps.
The authors conducted a study using a log-based analysis methodology.

4 Related Work
Studies on search engine click logs:
Agichtein, Brill and Dumais, Improving web search ranking by incorporating user behavior information, SIGIR 2006.
Joachims, Optimizing search engines using clickthrough data, SIGKDD 2002.
Study on users' teleportation, i.e., jumping directly to destination pages by issuing sophisticated queries in an attempt to navigate to a page the user already knew:
Teevan et al., The perfect search engine is not enough: a study of orienteering behavior in directed search, CHI 2004.

5 Related Work
Trails studied outside of the IR domain:
Footprints (Wexelblat and Maes, 1999): annotations in Web browsers showing trails through a Web site, assembled by the Web site designer.
Information foraging (Pirolli and Card, 1999): derived from how animals forage for food in the wild; cues left by previous visitors are used to find "patches" of information.
ScentTrails (Olston and Chi, 2003): combines browsing and searching into a single interface by highlighting potentially valuable hyperlinks.
Contribution of this study: an estimation of the value that trail following brings to users.

6 Search Trail
A search trail consists of an origin page, intermediate pages, and a destination page.
In this study, the authors consider four information sources that can be extracted from the trails:
Origin: the first page in the trail after the search engine result page, visited by clicking on a search result hyperlink. Example: P2.
Destination: the last page in the trail. Example: P5.
Sub-trail: all pages in the trail except the destination. Example: {P2, P3, P4, P3, P2}.
Full-trail: the complete trail. Example: {P2, P3, P4, P3, P2, P5}.
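Under the definitions above, the four sources can be read off a trail by simple slicing. A minimal Python sketch, using the illustrative pages P2…P5 from the slide (not data from the study):

```python
def trail_sources(trail):
    """Split a search trail (the pages after the result page)
    into the four information sources used in the study."""
    return {
        "origin": [trail[0]],        # first page clicked from the result page
        "destination": [trail[-1]],  # last page in the trail
        "sub_trail": trail[:-1],     # everything except the destination
        "full_trail": trail,         # the complete trail
    }

sources = trail_sources(["P2", "P3", "P4", "P3", "P2", "P5"])
```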

7 Log Data
Anonymized logs of URLs visited by users who opted in to provide data through a widely-distributed browser toolbar, collected during a three-month period from March 2009 through May 2009. Only entries generated in the English-speaking US locale were included.
Each log entry contains: user ID, browser window ID, timestamp, URL.

8 Trail Mining
Collected tens of millions of search trails (Tx) from May 2009.
These are not session trails: user intent tends to shift over the course of a session.
Trails from March and April 2009 are used as historic trails (Th).
A trail is a temporally-ordered sequence of URLs, beginning with a search engine query and terminating with: another query; a period of user inactivity of 30 or more minutes; or termination of the browser instance or tab.
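The trail definition above amounts to a single segmentation pass over a user's time-ordered log. A minimal Python sketch, assuming each entry is a (timestamp, kind, url) tuple with kind in {"query", "page", "close"} — the study's toolbar log schema is not specified at this level, so the event format and sample pages are illustrative:

```python
from datetime import datetime, timedelta

TIMEOUT = timedelta(minutes=30)  # inactivity threshold from the slide

def mine_trails(events):
    """Segment time-ordered browser events into search trails.
    A trail starts at a query and ends at the next query, a 30-minute
    gap of inactivity, or a browser/tab close."""
    trails, current, last_time = [], None, None
    for ts, kind, url in events:
        gap = last_time is not None and ts - last_time >= TIMEOUT
        if kind == "query":
            if current:                 # a new query terminates the open trail
                trails.append(current)
            current = [url]
        elif current is not None:
            if gap or kind == "close":  # inactivity or close terminates the trail
                trails.append(current)
                current = None          # pages after a gap join no trail
            else:
                current.append(url)
        last_time = ts
    if current:
        trails.append(current)
    return trails

t0 = datetime(2009, 5, 1, 12, 0)
trails = mine_trails([
    (t0, "query", "q:flights"),
    (t0 + timedelta(minutes=1), "page", "P2"),
    (t0 + timedelta(minutes=2), "page", "P3"),
    (t0 + timedelta(minutes=40), "page", "P9"),  # 38-minute gap: trail ended, P9 dropped
    (t0 + timedelta(minutes=41), "query", "q:hotels"),
    (t0 + timedelta(minutes=42), "page", "P5"),
    (t0 + timedelta(minutes=43), "close", ""),
])
```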

9 Trail Labeling
Three of the five evaluation metrics in this study use information about page topicality.
Millions of unique URLs were present in the set of all trails.
Web pages are classified into the topical hierarchy from the Open Directory Project (ODP), using a method similar to Shen, Dumais and Horvitz, Analysis of topic dynamics in web search, WWW 2004.

10 Trail Statistics
Some 15,000,000 search trails followed by some 100,000 users.
Number of trails per user: median 91, mean 160.
Number of steps per trail: median 2 (search result page and a single result click), mean 5.3.
One third of trails were abandoned following the query; another third contained 3 or more pages.
Of the latter, 19.3% contained at least one site with a different ODP label from the origin and destination pages. The remaining 80.7% of trails had original queries that were generally navigational (e.g., delta airlines) or directed informational (e.g., what is daylight savings time?).

11 Study
Five research questions: which of the four sources (origin, destination, sub-trail, full-trail)
provides more relevant information? (Relevance)
provides more topic coverage? (Coverage)
provides more diversity? (Diversity)
provides more novel information? (Novelty)
provides more useful information? (Utility)

12 Trail Data Preparation
Queries are normalized (punctuation removed, lowercased, terms ordered alphabetically, extraneous whitespace trimmed).
Trails containing fewer than three pages are removed, so that every trail has at least one intermediate page between the origin and destination pages.
The referrer of the origin page must be a search engine result page.
Trails must be fully labeled: trails containing pages without ODP labels are removed.
To prevent sample bias from highly-active users, the authors selected at most 10 search trails from each user.
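The query normalization steps listed above can be sketched directly. A minimal Python sketch; the study's exact tokenization is not specified, so this is one plausible reading:

```python
import string

def normalize_query(q):
    """Normalize a query as described above: strip punctuation,
    lowercase, order terms alphabetically, collapse extra whitespace."""
    q = q.translate(str.maketrans("", "", string.punctuation)).lower()
    return " ".join(sorted(q.split()))

nq = normalize_query("  What's  Daylight Savings Time? ")
```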

13 Metrics
For all metrics, a higher value is regarded as a more positive outcome. All metrics are computed for each trail, then micro-averaged within each query, and then macro-averaged across all queries; all queries are treated equally in the analysis.
Relevance: the authors obtained human relevance judgments for over 20,000 queries randomly sampled by frequency from the query logs of the Bing search engine. Trained judges assigned relevance labels on a 6-point scale (Bad, Poor, Fair, Good, Excellent, Perfect) to top-ranked Web search results. The authors computed the average relevance score for each source, with each page in the trail used at most once. Trails for 8,712 queries, comprising some 2,000,000 trails, afforded a detailed comparison of source relevance.
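The per-source relevance computation can be sketched as follows. The numeric mapping of the 6-point labels onto 0–5 and the sample judgments are assumptions for illustration, not values from the paper:

```python
# Assumed mapping of the 6-point judgment scale onto 0-5.
LABELS = {"Bad": 0, "Poor": 1, "Fair": 2, "Good": 3, "Excellent": 4, "Perfect": 5}

def source_relevance(pages, judgments):
    """Average judged relevance over the distinct pages of a source.
    Each URL counts at most once (per the slide); unjudged pages are skipped.
    `judgments` maps URL -> label string."""
    scores = [LABELS[judgments[p]] for p in dict.fromkeys(pages) if p in judgments]
    return sum(scores) / len(scores) if scores else None

judgments = {"P2": "Good", "P3": "Fair", "P5": "Excellent"}
avg = source_relevance(["P2", "P3", "P4", "P3", "P2", "P5"], judgments)  # (3+2+4)/3
```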

14 Metrics
Coverage: meant to reflect the value of each trail source in providing access to the central themes of the query topic. The query topic is obtained from constructed query interest models (Qx): ODP labels are assigned to URLs in the union of the top-200 search results for that query from Google, Yahoo! and Bing, then the labels are grouped and their frequency values normalized. The coverage of each source s is obtained by summing the weights of the labels tagged to pages in the source.
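The coverage computation described above reduces to a sum over the distinct labels a source touches. A minimal Python sketch with illustrative label weights (the real query interest models are built from the top-200 results of three engines):

```python
def coverage(source_pages, page_labels, query_model):
    """Coverage of a trail source: the summed normalized weight of the
    ODP labels its pages touch in the query interest model Qx."""
    labels = {page_labels[p] for p in source_pages if p in page_labels}
    return sum(query_model.get(lbl, 0.0) for lbl in labels)

# Illustrative normalized label weights and page labels.
query_model = {"Travel": 0.5, "Airlines": 0.3, "Maps": 0.2}
page_labels = {"P2": "Travel", "P3": "Airlines", "P4": "Travel"}
cov = coverage(["P2", "P3", "P4"], page_labels, query_model)  # 0.5 + 0.3
```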

15 Metrics
Diversity: estimates the fraction of unique query-relevant concepts surfaced by a given trail source.
Novelty: novel information may help users learn about a new subject area or broaden their understanding of an area. Users' historic interest models (Hx) for all user-query pairs are constructed using the users' search trails from March and April 2009.
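Both metrics compare sets of ODP labels. A minimal Python sketch, with illustrative labels; the study's exact set definitions may differ:

```python
def diversity(source_pages, page_labels, query_model):
    """Fraction of the query-relevant concepts (labels in the query
    interest model Qx) that the source surfaces."""
    seen = {page_labels[p] for p in source_pages if p in page_labels}
    relevant = set(query_model)
    return len(seen & relevant) / len(relevant) if relevant else 0.0

def novelty(source_pages, page_labels, historic_model):
    """Fraction of the source's labels absent from the user's historic
    interest model Hx, i.e. likely new to the user."""
    seen = {page_labels[p] for p in source_pages if p in page_labels}
    return len(seen - set(historic_model)) / len(seen) if seen else 0.0

query_model = {"Travel": 0.5, "Airlines": 0.3, "Maps": 0.2}
page_labels = {"P2": "Travel", "P5": "Maps"}
d = diversity(["P2", "P5"], page_labels, query_model)    # 2 of 3 relevant concepts
n = novelty(["P2", "P5"], page_labels, {"Travel": 1.0})  # "Maps" is new to the user
```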

16 Metrics
Utility: a dwell time of 30 seconds or more on a Web page can be indicative of page utility. The authors estimated the fraction of page views from each source that meet this dwell time threshold.
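The utility measure is simply the share of sufficiently long page views. A minimal sketch with illustrative dwell times in seconds:

```python
DWELL_THRESHOLD = 30.0  # seconds, per the slide above

def utility(dwell_times):
    """Fraction of a source's page views whose dwell time meets the
    30-second threshold taken as a proxy for usefulness."""
    if not dwell_times:
        return 0.0
    return sum(d >= DWELL_THRESHOLD for d in dwell_times) / len(dwell_times)

u = utility([5.0, 45.0, 120.0, 12.0])  # 2 of 4 views meet the threshold
```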

17 Methodology
Construct the set of query interest models Qx based on the set of queries for which the authors have human relevance judgments.
Construct historic interest models for each user-query pair in the historic trails set, filtered to include only queries appearing in Qx.
For each search trail t: assign ODP labels to all pages in t; build source interest models for origin, destination, sub-trail and full-trail; compute relevance, coverage, diversity, novelty and utility scores.
Compute the average values for each metric per query, then average across all queries.

18

19 Findings from All Queries
Relevance: scores for all sources were generally positive (around 3, or Good), and origin pages have slightly higher scores than non-origin sources. However, an independent-measures analysis of variance (ANOVA) revealed no significant difference among the sources.
Coverage, Diversity and Novelty: sub-trails and full-trails cover statistically significantly more topics, and reveal more diverse topics, than origin and destination pages; these topics are also more novel to users.
Utility: non-origin pages are more useful to users.

20 Effect of Query Popularity
Queries are divided into three groups:
Low: issued by at most one user from March to April.
Medium: issued by between 1 and 100 users.
High: issued by more than 100 users.
As query popularity increases, all metric scores increase; the relative ordering and percentage gains remain consistent. Differences across the three popularity groupings are not statistically significant.

21 Effect of Query History
Three groups:
None: the user did not issue the query during March and April.
Some: the user issued the query 30 times or fewer.
Lots: the user issued the query more than 30 times.
Relevance and utility rise across all sources given increased re-finding behavior, but coverage, diversity and novelty decrease, perhaps because users are more familiar with the query topic and better able to identify the information they need.

22 Discussion and Implications
All metrics but relevance increase for full-trails and sub-trails. During a session, user intent may shift, so relevance to the initial query is dynamic; pages encountered on the trails may be relevant but not appear so.
Destinations were more useful and led to a slight novelty increase over origins.
In novelty, diversity and utility, the differences between origins and sub-trails are substantial, and full-trails provide even more benefit than sub-trails, as expected.
For some queries with focused tasks, trails may not be useful.

23 Discussion and Implications
Trails can provide value to users, but questions remain about how to select trails and how to integrate them into the search engine result page.
Popular search trails are typically short and obvious, so diverse and unexpected trails need to be considered.
Trail selection methods could discount trails with rapid backtracking, and could maximize relevance, coverage, diversity, novelty and utility with the shortest path.
Alternatively, trail recommendation could be personalized.

24 Conclusions
The authors systematically compared the estimated value of full trails to other trail components using a log-based methodology. Full-trails and sub-trails provide users with significantly more topic coverage, topic diversity and novelty than trail origins, and slightly more useful but slightly less relevant information than the origins. The next steps are to investigate best-trail selection for query-origin pairs and how to add trails effectively to search engine result pages.

