Using Web Snippets and Query-Logs to Measure Implicit Temporal Intents in Queries. Ricardo Campos, Alípio Jorge, Gaël Dias.


1 Using Web Snippets and Query-Logs to Measure Implicit Temporal Intents in Queries. Ricardo Campos (1, 2, 4), Alípio Jorge (3, 4), Gaël Dias (2). Affiliations: (1) Tomar Polytechnic Institute, Tomar, Portugal [www.ipt.pt]; (2) Centre of Human Language Technology and Bioinformatics, University of Beira Interior, Covilhã, Portugal [hultig.di.ubi.pt]; (3) Faculty of Sciences, University of Oporto, Oporto, Portugal; (4) LIAAD-INESC Porto L.A., Oporto, Portugal [www.liaad.up.pt]. QRU 2011: 2nd International Query Representation and Understanding Workshop, in association with SIGIR 2011, Beijing, China, July 28, 2011. Contact: [www.linkedin.com/in/camposricardo] [www.ccc.ipt.pt/~ricardo]

2 Introduction: Motivations. Query: Lady Gaga. Official Website. This is a particularly hard task, and it becomes even more difficult if the user is not clear about his purpose.

3 Introduction: Motivations. Query: Lady Gaga. Informative and Rumor texts. Informative texts: Rihanna passes Gaga as Facebook's most popular lady… Rumor texts: Lady Gaga, queen of extravagant fashion, is planning to intern for... the milliner confirmed the rumors that the 'Born This Way' singer and he were...

4 Introduction: Motivations. Query: Lady Gaga. Biography and Discography. Biography; Discography; Release.

5 Introduction: Motivations. Query: Lady Gaga. Tour Dates. Understanding the temporal nature of a query, particularly an implicit one, is one of the most interesting challenges in Temporal Information Retrieval (T-IR) (Berberich et al., 2010). Doing so would make it possible to apply specific strategies to improve the retrieval of web search results.

6 Introduction: Difficulties. Dealing with Implicit Temporal Queries is Difficult. However, this may prove to be a particularly hard task:
1. Different semantic concepts can be related to a query;
2. It is difficult to define the boundary between what is temporal and what is not, and therefore to define temporal ambiguity;
3. Even if temporal intents can be inferred by human annotators, the question is how to transpose this to an automatic process.

7 Introduction: Objectives. Understand the Temporal Nature of Implicit Temporal Queries. In our work we aim to understand whether temporal information can be used to automatically disambiguate query terms, in particular those of implicit temporal queries.

8 Introduction: Different Approaches in the Extraction of T-I. Metadata-Based Approach. The extraction of temporal information usually follows a metadata-based approach over time-tagged controlled collections, such as news articles, using the timestamp of the document. Example (from Miss Universe.Com): "Jun 16, 2009 - The city of São Paulo shall have to make use of the Credicard Hall as the venue for the 2011 Miss Universe. Today was also announced that Miss Morumbi show is going to be on July 27, 2009." The timestamp is particularly useful for resolving relative temporal expressions found in a document (e.g., today) to a concrete date (e.g., the document creation time). However, using it to date implicit temporal queries can be tricky, because the timestamp of the document can differ significantly from the time its content actually refers to.
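The use of a document timestamp to resolve a relative expression such as "today" can be sketched as follows. This is only an illustration, not the authors' method: the `RELATIVE_OFFSETS` table and the `resolve_relative` function are hypothetical names, and only three expressions are covered.

```python
from datetime import date, timedelta

# Hypothetical lookup table: offset in days for a few relative expressions.
RELATIVE_OFFSETS = {"yesterday": -1, "today": 0, "tomorrow": 1}


def resolve_relative(expression: str, document_date: date) -> date:
    """Map a relative expression to a concrete date using the document timestamp."""
    offset = RELATIVE_OFFSETS[expression.lower()]
    return document_date + timedelta(days=offset)


# The news item quoted on the slide is timestamped Jun 16, 2009,
# so "Today" in its text resolves to that same day.
resolved = resolve_relative("Today", date(2009, 6, 16))
print(resolved.isoformat())
```

Note that the same item also mentions 2011 and July 27, 2009, which is exactly why the timestamp alone is a weak signal for dating the content.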

9 Introduction: Different Approaches in the Extraction of T-I. Content Approach. Query-Logs. One possible solution is to look for related temporal references in complementary web resources:
Content-Related Resources, based on a web content approach. This simply requires the set of web search results.
Query-Log Resources, based on similar year-qualified queries. This implies that some versions of the query have already been issued (query dependency).

10 Content-Related Resources. Query-Log Resources. Conclusions.

11 Web Snippets: Temporal Information. Temporal Evidence within Web Pages. One of the most interesting approaches to date implicit temporal queries is to rely on the exploration of temporal evidence within web pages.

12 Web Snippets: Difficulties. Correlation between the Dates and Query Concepts. Using web documents to date queries that carry no temporal information can, however, be a tricky process. The main problem is the difficulty of associating the year found in the document with the query.

13 Web Snippets: Temporal Value. Measures. 450 queries, e.g. Oil Spill; BP Oil Spill; Waka Waka. In this work we aim to determine the temporal value of web snippets: $\text{TSnippets}(.) = \dfrac{\#\ \text{Snippets Retrieved with Dates}}{\#\ \text{Snippets Retrieved}}$. Measures: TSnippets(.), TTitle(.), TUrl(.).
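A minimal sketch of the TSnippets(.) ratio, under the assumption that a "date" is any four-digit year between 1000 and 2099 (the slide does not say how dates are detected). The `temporal_value` function and the sample snippets are made up for illustration.

```python
import re

# Assumption: a "date" is a standalone four-digit year between 1000 and 2099.
YEAR_PATTERN = re.compile(r"\b(?:1\d{3}|20\d{2})\b")


def temporal_value(texts):
    """Fraction of texts that contain at least one year-like token."""
    if not texts:
        return 0.0
    with_dates = sum(1 for text in texts if YEAR_PATTERN.search(text))
    return with_dates / len(texts)


# Made-up snippets for the query "oil spill"; two of the four contain a year.
snippets = [
    "BP oil spill 2010: timeline of the Gulf disaster",
    "Latest news and live feed on the oil spill",
    "Oil spill claims process explained",
    "The 2010 spill followed the 1989 Exxon Valdez accident",
]
print(temporal_value(snippets))
```

The same function could be applied to result titles or URLs to obtain values analogous to TTitle(.) and TUrl(.).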

14 Web Snippets: Temporal Classification. Temporal Ambiguity Value. Each query was classified on the basis of a temporal ambiguity value TA(q): if TA(q) < 10%, the query is ATemporal; otherwise the query is Temporal.

Conceptual Classification    Number of Queries
Ambiguous                    220
Clear                        176
Broad                        54

Temporal Classification      Number of Queries    %
ATemporal                    132                  75%
Temporal                     44                   25%
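The threshold rule can be written as a one-line classifier. The sketch below takes TA(q) as a given number between 0 and 1, since the slide does not state how TA(q) is computed; the example queries and their values are hypothetical.

```python
ATEMPORAL_THRESHOLD = 0.10  # the 10% cut-off stated on the slide


def classify_query(ta_value: float) -> str:
    """Return 'ATemporal' when TA(q) < 10%, otherwise 'Temporal'."""
    return "ATemporal" if ta_value < ATEMPORAL_THRESHOLD else "Temporal"


# Hypothetical TA(q) values, for illustration only.
example_values = {"query a": 0.04, "query b": 0.10, "query c": 0.37}
labels = {query: classify_query(value) for query, value in example_values.items()}
print(labels)
```

A value of exactly 10% falls on the Temporal side, because the rule uses a strict "less than".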

15 Web Snippets: Temporal Classification. Evaluation. To evaluate our simple classification model, we conducted a user study. Human annotators were asked to consider each of the 176 queries, look at the web search results, and classify each query as ATemporal or Temporal. Overall, the human annotators judged 35% of the queries to be implicit temporal queries, while our methodology identified only 25%.

16 Web Query Logs: Temporal Information. Search-Engine Completion Features. Another approach to date implicit temporal queries is to use web query logs, based on similar year-qualified queries. Example completions for "Bp oil spill": Bp oil spill live feed; Bp oil spill 2010; Bp oil spill map; Bp oil spill claims.

17 Web Query Logs: Difficulties. Web Query Logs Drawbacks.
Extremely hard to access outside the big industrial labs;
Highly dependent on the users' own intents: queries that have never been typed do not exist in the web search log, e.g. Blaise Pascal 1623 (his year of birth);
Not adapted to concept disambiguation, e.g. Query: Euro gives Euro 2008; Euro 2012.

18 Web Query Logs: Temporal Value. Measures. Explicit temporal queries represent only 1.21% of the overall set [5]. Furthermore, the simple fact that a query is year-qualified does not necessarily mean that it has a temporal intent. Similarly to TTitle(.), TSnippets(.) and TUrl(.), we define TLogYahoo(.) and TLogGoogle(.): $\text{TLogGoogle}(.) = \dfrac{\#\ \text{Suggested Queries Retrieved with Dates}}{\#\ \text{Suggested Queries Retrieved}}$.
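As an illustration, the ratio can be computed over the "Bp oil spill" completions listed earlier in the deck. This is a sketch under the assumption that a "date" is a four-digit year between 1000 and 2099; the `t_log` function is a hypothetical name and does not call any real suggestion service.

```python
import re

# Assumption: a suggestion is "with dates" if it contains a four-digit year.
YEAR = re.compile(r"\b(?:1\d{3}|20\d{2})\b")


def t_log(suggestions):
    """Return the share of year-qualified suggestions and the suggestions themselves."""
    dated = [s for s in suggestions if YEAR.search(s)]
    value = len(dated) / len(suggestions) if suggestions else 0.0
    return value, dated


# Completions shown for "Bp oil spill" on the query-log slide.
suggestions = [
    "bp oil spill live feed",
    "bp oil spill 2010",
    "bp oil spill map",
    "bp oil spill claims",
]
value, dated = t_log(suggestions)
print(value, dated)
```

Returning the matching suggestions alongside the ratio makes it easy to inspect them, which matters because a year-qualified query does not necessarily carry a temporal intent.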

19 Web Query Logs: Temporal Value. Results. Pearson correlation coefficient between each of the dimensions TSnippets(.), TTitle(.), TUrl(.), TLogGoogle(.) and TLogYahoo(.):

              TLogGoogle    TTitle    TSnippet    TUrl
TLogYahoo     0.63          0.61      0.52        0.48
TLogGoogle                  0.69      0.63        0.44

Results show that as dates appear in the titles and snippets, they also tend to appear, albeit to a lesser extent, in the auto-complete query suggestions of Google.
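For reference, a Pearson coefficient between two of these dimensions can be computed as sketched below. The per-query values are invented for illustration and are not taken from the study's datasets.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / sqrt(var_x * var_y)


# Invented per-query values (one entry per query), for illustration only.
t_title = [0.00, 0.20, 0.50, 0.10]
t_log_google = [0.00, 0.10, 0.30, 0.00]
print(round(pearson(t_title, t_log_google), 2))
```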

20 Web Query Logs: Temporal Value. Results. An additional analysis led us to conclude that temporal information is more frequent in web snippets than in either of the query logs of Google and Yahoo!. Overall, while most of the queries have a TSnippet(.) value of around 20%, TLogYahoo(.) and TLogGoogle(.) are mostly close to 0%.

21 Web Query Logs: Temporal Value. Results. Finally, we studied how strongly a given query is associated with a set of different dates, both in web snippets and in web query logs. For this, we built a confidence interval for the difference of means, for paired samples, between the number of times that dates appear in the web snippets and in the web query logs: TLogGoogle(.): [5.10; 6.38]; TLogYahoo(.): [5.12; 6.43]. Results show that the number of different dates that appear in web snippets is significantly higher than in either of the two web query logs.
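A paired-sample confidence interval for a difference of means can be sketched as follows. The slide states neither the confidence level nor the exact procedure, so the 95% level and the normal approximation used here are assumptions (reasonable for large samples; a Student t quantile would be the usual choice for small ones). The counts are invented.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def paired_mean_diff_ci(a, b, confidence=0.95):
    """Normal-approximation CI for the mean of the paired differences a[i] - b[i]."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two paired samples of equal length (at least 2 pairs)")
    diffs = [x - y for x, y in zip(a, b)]
    centre = mean(diffs)
    std_err = stdev(diffs) / sqrt(len(diffs))
    # Two-sided critical value, e.g. about 1.96 for a 95% level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return centre - z * std_err, centre + z * std_err


# Invented counts of different dates per query (same queries in both lists).
dates_in_snippets = [8, 7, 9, 6, 10]
dates_in_query_log = [1, 0, 2, 1, 3]
low, high = paired_mean_diff_ci(dates_in_snippets, dates_in_query_log)
print(round(low, 2), round(high, 2))
```

An interval that lies entirely above zero, as the ones reported on the slide do, indicates that the first sample has the higher mean.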

22 Conclusions. Temporal Value of Web Snippets and Web Query Logs. In this paper, we showed that web snippets are a very rich source of temporal information, especially years. Dates often appear correlated in snippets and titles. Results show that future dates are very common in web snippets, but seldom used in queries. Dates mostly appear together with the categories of automotive, sports and politics, both in web snippets and in web query logs. Some of the items even have more than one date. Contrary to web snippets, web query logs have a very small temporal value (about 1.2%), which is statistically smaller than that of web snippets.

23 Conclusions. Query Understanding based on Web Snippets. Our experiments also showed that web snippets can be used for query understanding. We introduced a simple model for the temporal classification of queries, based on the temporal value of web snippets, which showed that 25% of the queries have a temporal nature. This value contrasts with the 35% that resulted from our user study. The use of complementary information, such as the number of instances or the number of different dates, should therefore be considered in future approaches.

24 Thanks for your attention!
Both experimental datasets are available for download at www.ccc.ipt.pt/~ricardo/software
VipAccess is online at http://hultig.di.ubi.pt/vipaccess
HULTIG is online at http://hultig.di.ubi.pt
LIAAD is online at http://liaad.up.pt
Polytechnic Institute of Tomar is online at http://www.ipt.pt
Gaël Dias is online at http://www.di.ubi.pt/~ddg
Alípio Jorge is online at http://liaad.up.pt/~amjorge

