Published by Emmalee Pierpoint. Modified over 2 years ago.

1
Exploiting Temporal References in Text Retrieval
Irem Arikan
Advised by: Srikanta Bedathur, Klaus Berberich

2
Motivation
Users' information needs often have a temporal dimension, but traditional information retrieval systems do not exploit the temporal content of documents.
Example query: PM United Kingdom 2000
The search engine is not aware that 2000 is actually mentioned implicitly by the document.
Goal: an approach that recognizes and exploits temporal references in documents to yield better search results.

4
Example Temporal Queries
Broad queries:
  British colony 17th century
  Economic situation Germany 1920s
  President assassination 1950 – 2000
Specific queries:
  US president October 1962
  Pope 1940s
  Academy awards best actor 1975
Ambiguous queries:
  George Bush 1990 vs. George Bush 2007
  Gulf war 1991 vs. Gulf war 2005

5
Outline
  Language Modeling for Information Retrieval
  Time Modeling for Temporal Information Retrieval
  Combining Text Relevance with Temporal Relevance
  Experimental Results

6
Language Modeling for Information Retrieval
Language model: a statistical model that generates text.
Language modeling: the task of estimating the statistical parameters of a language model.
Language modeling for IR: the problem of estimating the likelihood that a query and a document could have been generated by the same language model.
In practical IR approaches: a unigram language model, in which words occur independently.

7
Language Modeling for IR
1) Treat each document as a sample from a language model: assume an underlying multinomial probability distribution over words for each document and estimate the statistics of this distribution, P[word]. From document d, infer the model M_d: P[word | M_d].
2) Estimate the likelihood that the query is generated by this distribution.
3) Rank the documents by P(q | d).
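The three steps can be sketched in a few lines of Python. This is a minimal illustration, not the system used in the talk: the function names are hypothetical, and the linear interpolation with a collection-wide model (weight `lam`) is an assumption added here so that query words missing from a document do not zero out the score. The slide itself does not mention smoothing.

```python
import math
from collections import Counter


def unigram_model(tokens):
    """Step 1: maximum-likelihood estimate of P[word | M_d] from a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def query_log_likelihood(query, doc_model, collection_model, lam=0.5):
    """Step 2: log P(q | d) under the unigram (independence) assumption.

    Assumption: document and collection models are mixed with weight lam.
    """
    score = 0.0
    for word in query:
        p = lam * doc_model.get(word, 0.0) + (1 - lam) * collection_model.get(word, 0.0)
        if p == 0.0:
            return float("-inf")  # word never seen anywhere in the collection
        score += math.log(p)
    return score


def rank(query, docs, lam=0.5):
    """Step 3: order document names by descending query likelihood."""
    collection_model = unigram_model([w for tokens in docs.values() for w in tokens])
    scores = {
        name: query_log_likelihood(query, unigram_model(tokens), collection_model, lam)
        for name, tokens in docs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


docs = {
    "d1": "the prime minister of the united kingdom".split(),
    "d2": "the united states president".split(),
}
ranking = rank("prime minister united kingdom".split(), docs)
```

Here `ranking` lists `d1` before `d2`, because `d1` contains all four query words while `d2` only matches "united".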

8
Temporal Modeling for Temporal Retrieval
The general approach is similar to the LM approach: it is based on a generative model that generates temporal references (a temporal model).
The query is split into 2 parts: a text query and a temporal query.
A probabilistic mechanism produces the temporal content of the document; each time reference is generated by a different generative temporal model.
Generating a time reference:
1) First choose a temporal model.
2) Then generate a time reference using this temporal model.

9
Temporal Modeling
Estimating temporal query likelihood:
  Infer a temporal model from each temporal reference in the document.
  Estimate the likelihood that the temporal query is generated by one of the models that generated the temporal content of the document.
[Formula not recovered: temporal query generation probability]
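One way to read this estimate is as a mixture: choose one of the document's temporal models, then let it generate the query interval. The sketch below assumes each model is chosen with equal probability, which the slide does not state. `toy_generation_prob` is a hypothetical stand-in (intersection over union of inclusive year ranges), used only so the example runs; it is not one of the temporal models proposed in the talk.

```python
from typing import Callable, Sequence, Tuple

Interval = Tuple[int, int]  # (start, end), inclusive, e.g. in years


def temporal_query_likelihood(
    query: Interval,
    references: Sequence[Interval],
    generation_prob: Callable[[Interval, Interval], float],
) -> float:
    """Mixture over the temporal models inferred from a document's references."""
    if not references:
        return 0.0  # document has no temporal content
    choice_prob = 1.0 / len(references)  # assumption: uniform choice of model
    return sum(choice_prob * generation_prob(query, ref) for ref in references)


def toy_generation_prob(query: Interval, reference: Interval) -> float:
    """Hypothetical placeholder score: intersection length over union length."""
    q_start, q_end = query
    r_start, r_end = reference
    intersection = max(0, min(q_end, r_end) - max(q_start, r_start) + 1)
    union = (q_end - q_start + 1) + (r_end - r_start + 1) - intersection
    return intersection / union


doc_references = [(1980, 1989), (1984, 1984)]
likelihood = temporal_query_likelihood((1980, 1990), doc_references, toy_generation_prob)
```

Any function with the same signature can be passed as `generation_prob`, so the mixture step stays independent of how an individual temporal model is defined.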

10
What is a temporal model?
A probabilistic model to generate temporal references.
What kind of distribution?
How can we estimate its parameters?

11
Temporal Modeling: What is a temporal model?
A probabilistic model to generate temporal references.
What kind of distribution?
How can we estimate its parameters?
Formalize the problem in a goal-oriented way:
  We should infer a temporal model from each time interval in the document (the sample time interval).
  This temporal model should be able to generate all time intervals that are relevant to the sample interval.

12
1. Approach
Assumptions:
  Two time intervals are only relevant to each other if they intersect.
  The generative model inferred should be able to produce subintervals, superintervals, and overlapping intervals of the interval in the document.
  The probability of generating an intersecting time interval should be proportional to the length of the intersection.
Example: for the query 1980 – 1990, the interval 1980 – 1989 is more relevant than 23 March 1984.
Appropriate probabilistic model: 2 underlying triangular distributions, one for the start and one for the end.
[Figure: an interval [s, e] on the time axis t, with intersecting intervals sub1, sub2 (subintervals), sup1, sup2 (superintervals), lOverlap (left overlap), and rOverlap (right overlap)]

13
Triangular Distribution
Parameters and support: [definitions not recovered from the slide]
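For reference, the standard triangular distribution can be written as follows. The symbols are chosen here for illustration and are not taken from the slide: lower limit \alpha, upper limit \beta, and mode \gamma, with \alpha < \beta and \alpha \le \gamma \le \beta. The support is the interval [\alpha, \beta].

```latex
f(x \mid \alpha, \gamma, \beta) =
\begin{cases}
\dfrac{2\,(x - \alpha)}{(\beta - \alpha)(\gamma - \alpha)} & \alpha \le x < \gamma, \\[1ex]
\dfrac{2}{\beta - \alpha} & x = \gamma, \\[1ex]
\dfrac{2\,(\beta - x)}{(\beta - \alpha)(\beta - \gamma)} & \gamma < x \le \beta, \\[1ex]
0 & \text{otherwise.}
\end{cases}
```

The density rises linearly from \alpha to its peak at \gamma and falls linearly to \beta, which is the shape needed for a probability that decreases on both sides of a preferred point.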

14
1. Approach
[Figure: two triangular distributions over the time axis, with axis labels l, s - 1, s, e, e + 1, u, a query interval q, and regions r1, r2, r3, r4]
Nonzero probability for intersecting intervals:
  r1 – r3: left overlaps
  r1 – r4: superintervals
  r2 – r3: subintervals
  r2 – r4: right overlaps
The interval [s, e] has the highest probability.
Probability decreases to the left and right, resulting in lower probability for intervals with smaller intersection lengths.
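A rough Python sketch of this first approach follows. Several details are assumptions read off the figure's axis labels rather than stated in the text: the start distribution is taken to have lower limit l, mode s, and upper limit e + 1; the end distribution is taken to have lower limit s - 1, mode e, and upper limit u; l and u are treated as hypothetical outer bounds of the timeline; and start and end are scored independently. Function names are illustrative.

```python
def triangular_pdf(x, lo, mode, hi):
    """Density of a triangular distribution on (lo, hi) peaking at mode."""
    if x <= lo or x >= hi:
        return 0.0
    if x <= mode:
        return 2.0 * (x - lo) / ((hi - lo) * (mode - lo))
    return 2.0 * (hi - x) / ((hi - lo) * (hi - mode))


def interval_density(q_start, q_end, s, e, l, u):
    """Score a query interval against the model inferred from [s, e].

    Assumption: the start is drawn from a triangle (l, s, e + 1) and the end
    from a triangle (s - 1, e, u), independently of each other.
    """
    start_density = triangular_pdf(q_start, l, s, e + 1)
    end_density = triangular_pdf(q_end, s - 1, e, u)
    return start_density * end_density


# Model inferred from the document interval 1980 - 1990 (years),
# with hypothetical timeline bounds l = 1970 and u = 2000.
model = dict(s=1980, e=1990, l=1970, u=2000)
exact = interval_density(1980, 1990, **model)
near_exact = interval_density(1980, 1989, **model)
point = interval_density(1984, 1984, **model)
disjoint = interval_density(1995, 1999, **model)
```

With these numbers the interval [s, e] itself scores highest, 1980 – 1989 scores a little lower, the single year 1984 scores lower still, and the disjoint interval 1995 – 1999 scores zero, which matches the qualitative behaviour described on the slide.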

15
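1. Approach
[Figure: the two triangular distributions, with axis labels l, s - 1, s, e, e + 1, u, a query interval q, and regions r1, r2, r3, r4]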

16
2. Approach
Assumptions:
  Two time intervals are only relevant to each other if they are positioned close to each other on the time axis and have similar lengths: |start1 – start2| < a and |length1 – length2| < b.
  The generative model inferred should be able to produce temporal intervals in some neighbourhood on the time axis.
[Figure: an interval with start s and length l on the time axis t, with deviations ∆s and ∆l]

17
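2. Approach
[Figure: distribution over the start point (axis labels s - a, s, s + a) and the length (axis labels l - b, l, l + b)]
The temporal interval with x = s, y = l has the highest probability.
Probability decreases as the start point moves away from s and as the length moves away from l.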
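The second approach can be sketched as a density over (start, length) pairs. The shape used below is an assumption: a symmetric triangle of half-width a around s for the start and a symmetric triangle of half-width b around l for the length, multiplied together. The slide only says that the peak is at (s, l) and that probability decreases away from it, so other shapes would fit the description equally well. Function names are illustrative.

```python
def symmetric_triangular_pdf(x, center, half_width):
    """Triangle on (center - half_width, center + half_width), peak at center."""
    distance = abs(x - center)
    if distance >= half_width:
        return 0.0
    return (half_width - distance) / (half_width * half_width)


def neighbourhood_density(q_start, q_length, s, l, a, b):
    """Score a query interval by closeness of its start and length to (s, l).

    Assumption: start and length deviations are scored independently.
    """
    return (
        symmetric_triangular_pdf(q_start, s, a)
        * symmetric_triangular_pdf(q_length, l, b)
    )


# Document interval starting in 1980 with length 10 (years); a = 5, b = 4.
peak = neighbourhood_density(1980, 10, s=1980, l=10, a=5, b=4)
nearby = neighbourhood_density(1982, 11, s=1980, l=10, a=5, b=4)
too_far = neighbourhood_density(1985, 10, s=1980, l=10, a=5, b=4)
```

Here `peak` is the largest value, `nearby` is smaller but positive, and `too_far` is zero because the start deviates by a full `a` from `s`, mirroring the condition |start1 – start2| < a.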

18
ss+as -a ll+b l-b 2. Approach

19
Combining Text Relevance with Temporal Relevance
Text relevance

20
Combining Text Relevance with Temporal Relevance
Text relevance
Temporal relevance
Filter and re-rank search results by weighting the text relevance score by the temporal relevance.
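A minimal sketch of the filter-and-re-rank step is given below. It assumes that the weighting is a plain product, that text scores are non-negative (for example probabilities rather than log scores), and that "filter" means dropping results whose temporal relevance is zero. None of these details is spelled out on the slide, and the names are hypothetical.

```python
def filter_and_rerank(text_results, temporal_relevance):
    """Weight each text score by temporal relevance, drop zeros, and re-sort.

    text_results: list of (doc_id, text_score) pairs from the text retrieval step.
    temporal_relevance: function mapping a doc_id to a score >= 0.
    """
    reranked = []
    for doc_id, text_score in text_results:
        temporal_score = temporal_relevance(doc_id)
        if temporal_score <= 0.0:
            continue  # filter: no temporal match for this document
        reranked.append((doc_id, text_score * temporal_score))
    reranked.sort(key=lambda pair: pair[1], reverse=True)
    return reranked


# Toy scores, invented for illustration only.
text_results = [("a", 0.9), ("b", 0.6), ("c", 0.5)]
temporal_scores = {"a": 0.0, "b": 0.5, "c": 0.8}
final = filter_and_rerank(text_results, lambda doc_id: temporal_scores[doc_id])
final_order = [doc_id for doc_id, _ in final]
```

In this toy run document `a` is removed despite its high text score, and `c` overtakes `b` because its temporal relevance is higher.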

21
System Architecture
Information Retrieval (IR) with Temporal Extension
[Diagram: an IR System with Query, Index, and Result Set, extended by a Temporal Query, an Index for temporal references, and a Temporal Retrieval step that produces the final Result Set]

22
Experimental Results-1
Query: Spanish painter 18th century
Terrier | Boolean | Our Method
Art_in_Puerto_Rico | Agustín_Esteve | José_del_Castillo
Spanish_art | Acislo_Antonio_Palomino_de_Castro_y_Velasco | Agustín_Esteve
Palazzo_Bianco_(Genoa) | Alvarez | Roybal
Caprichos | Agostino_Scilla_00e6 | Maldonado
List_of_people_from_Antwerp | Bassano | Luis_Egidio_Meléndez

23
Experimental Results-2
Query: Chancellor Germany 1955
Terrier | Boolean | Our Method
Federal_Minister_for_Special_Affairs_of_Germany | Basic_Law_for_the_Federal_Republic_of_Germany | Occupation_statute
Otto_Gessler | Bonn-Paris_conventions | Second_German_Bundestag
Bonn-Paris_conventions | Bavaria_Party | West_Germany
Occupation_statute | All-German_Bloc_League_of_Expellees_and_Deprived_of_Rights | Bonn-Paris_conventions
Petersberg_Agreement | Anschluss | Konrad_Adenauer

24
Experimental Results-3
Query: George Bush 1990
Terrier | Boolean | Our Method
George_W._Bush_insider_trading_allegations | Bush_family | President_Bush
Bush_family | Bush_administration | Early_life_of_George_W._Bush
Andrew_Card | President's Council of Advisors on Science and Technology (only two entries were recovered for this row; their column placement is unclear)
George_H._W._Bush | Approval_rating | George_H._W._Bush
C_Boyden_Gray | Brent_Scowcroft | Arbusto_Energy

25
Thanks!
