
1 Beyond Bags of Words: A Markov Random Field Model for Information Retrieval Don Metzler

2 Bag of Words Representation
71 the, 31 garden, 22 and, 19 of, 19 to, 19 house, 18 in, 18 white, 12 trees, 12 first, 11 a, 11 president, 8 for, 7 as, 7 gardens, 7 rose, 7 tour, 6 on, 6 was, 6 east, 6 tours, 5 planting, 5 he, 5 is, 5 grounds, 5 that, 5 gardener, 4 history, 4 text-decoration, 4 john, 4 kennedy, 4 april, 4 been, 4 today, 4 with, 4 none, 4 adams, 4 spring, 4 at, 4 had, 3 mrs, 3 lawn…
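Such a representation is straightforward to construct. A minimal Python sketch (the tokenizer, function name, and sample sentence are illustrative assumptions, not the slide's actual source page):

```python
from collections import Counter
import re

def bag_of_words(text: str) -> Counter:
    """Lowercase the text, tokenize on word characters, count term frequencies."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(tokens)

# Toy input standing in for the White House garden-tour page above.
counts = bag_of_words("The Rose Garden tour: the first garden at the White House.")
for term, freq in counts.most_common(3):
    print(freq, term)   # -> 3 the, 2 garden, then a 1-count term (ties may vary)
```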

3 Bag of Words Representation

4 Bag of Words Models

5 Binary Independence Model

6 Unigram Language Models
Language modeling was first used in speech recognition to model speech generation; in IR, language models are models of text generation.
Typical scenario:
- Estimate a language model for every document
- Rank documents by the likelihood that the document's model generated the query
Documents are modeled as multinomial distributions over a fixed vocabulary.

7 Unigram Language Models
Query: q1 q2 q3   Document: D
Ranking via query likelihood:
P(Q | θ_D) = P(q1 | θ_D) ∙ P(q2 | θ_D) ∙ P(q3 | θ_D)
Estimation: each P(q | θ_D) comes from the document's (smoothed) multinomial distribution.
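As a concrete illustration of this ranking rule, here is a small Python sketch of query likelihood with Dirichlet smoothing; the smoothing form and the default mu are standard choices in the LM literature, assumed here rather than taken from the slide:

```python
import math
from collections import Counter

def query_likelihood(query, doc, collection, mu=2500.0):
    """Score log P(Q | theta_D) with a Dirichlet-smoothed multinomial doc model."""
    doc_tf, col_tf = Counter(doc), Counter(collection)
    score = 0.0
    for q in query:
        p_qc = col_tf[q] / len(collection)            # collection background model
        # Dirichlet smoothing: (tf(q,D) + mu * P(q|C)) / (|D| + mu)
        p_qd = (doc_tf[q] + mu * p_qc) / (len(doc) + mu)
        score += math.log(p_qd) if p_qd > 0 else float("-inf")
    return score

# Rank documents by descending log-likelihood of generating the query.
docs = [["white", "house", "rose", "garden"], ["rose", "bowl", "game"]]
collection = [t for d in docs for t in d]
ranked = sorted(docs, key=lambda d: query_likelihood(["rose", "garden"], d, collection),
                reverse=True)
print(ranked[0])   # the garden document wins
```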

8 Bag of Words Models
Pros:
- Simple model
- Easy to implement
- Decent results
Cons:
- Too simple
- Unrealistic assumptions
- Inability to explicitly model term dependencies

9 Beyond Bags of Words

10 Tree Dependence Model
A method of approximating a complex joint distribution:
- Compute the EMIM (expected mutual information measure) between every pair of terms
- Build a maximum spanning tree over these scores
- The tree encodes first-order dependencies between terms
[Figure: example maximum spanning tree over terms A, B, C, D, E]
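A sketch of this construction in Python. The dependence score below is a simplified binary co-occurrence mutual information rather than the full EMIM computation, and the Prim-style tree builder assumes a complete weighted graph keyed by term pairs:

```python
import math
from itertools import combinations

def mi(df_a, df_b, df_ab, n_docs):
    """Simplified mutual information from document-frequency counts
    (a stand-in for full EMIM, which sums over presence/absence events)."""
    p_ab = df_ab / n_docs
    if p_ab == 0:
        return 0.0
    return p_ab * math.log(p_ab / ((df_a / n_docs) * (df_b / n_docs)))

def maximum_spanning_tree(terms, weight):
    """Prim's algorithm; weight maps frozenset({a, b}) -> dependence score."""
    in_tree, edges = {terms[0]}, []
    while len(in_tree) < len(terms):
        a, b = max(((u, v) for u in in_tree for v in terms if v not in in_tree),
                   key=lambda e: weight[frozenset(e)])
        edges.append((a, b))
        in_tree.add(b)
    return edges   # the tree edges are the retained first-order dependencies

# Usage: weight = {frozenset(p): mi(...) for p in combinations(terms, 2)}
```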

11 n-Gram Language Models

12 Query: q1 q2 q3   Document: D
Ranking via query likelihood:
P(Q | θ_D) = P(q1 | θ_D) ∙ P(q2 | q1, θ_D) ∙ P(q3 | q2, θ_D)
Estimation: conditional probabilities are estimated from (smoothed) n-gram counts in the document.
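A hypothetical sketch of the bigram version of this score; interpolating the bigram estimate with the unigram model is one common smoothing choice, assumed here rather than taken from the slide:

```python
import math
from collections import Counter

def bigram_likelihood(query, doc, lam=0.5, floor=1e-9):
    """log P(Q | D) with P(q_i | q_{i-1}, D) interpolated against P(q_i | D)."""
    uni = Counter(doc)
    bi = Counter(zip(doc, doc[1:]))
    score = math.log(max(uni[query[0]] / len(doc), floor))    # P(q_1 | D)
    for prev, cur in zip(query, query[1:]):
        p_bi = bi[(prev, cur)] / uni[prev] if uni[prev] else 0.0
        p_uni = uni[cur] / len(doc)
        score += math.log(max(lam * p_bi + (1 - lam) * p_uni, floor))
    return score
```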

13 Dependency Models
Pros:
- More realistic assumptions
- Improved effectiveness
Cons:
- Less efficient
- Limited notion of dependence
- Not well understood

14 State of the Art IR

15 Desiderata
Our desired model should be able to:
- Support standard IR tasks (ranking, query expansion, etc.)
- Easily model dependencies between terms
- Handle textual and non-textual features
- Consistently and significantly improve effectiveness over bag-of-words models and existing dependence models
Proposed solution: Markov random fields

16 Markov Random Fields
MRFs provide a general, robust way of modeling a joint distribution.
The anatomy of an MRF:
- A graph G: vertices represent random variables; edges encode dependence semantics
- Potentials over the cliques of G: non-negative functions over clique configurations that measure 'compatibility'
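In the Metzler and Croft formulation, the joint distribution factors over the cliques of G; with exponential potentials ψ(c; Λ) = exp[λ_c f(c)], ranking reduces to a weighted sum of clique feature functions (the normalizer Z_Λ cancels for ranking):

```latex
P_{G,\Lambda}(Q, D) = \frac{1}{Z_\Lambda} \prod_{c \in C(G)} \psi(c; \Lambda),
\qquad
P_\Lambda(D \mid Q) \stackrel{rank}{=} \sum_{c \in C(G)} \lambda_c f(c)
```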

17 Markov Random Fields for IR

18 Modeling Dependencies

19 Parameter Tying
In theory, a potential function can be associated with every clique in the graph; the typical solution is to define potentials only over the maximal cliques of G.
We need more fine-grained control over our potentials, so we use clique sets: sets of cliques that share a parameter and a potential function.
We identified 7 clique sets that are relevant to IR tasks (the examples on the next two slides use the query 'domestic adoption laws').

20 Clique Sets
- Single term document/query cliques: T_D = cliques with one query term + D
  ψ(domestic, D), ψ(adoption, D), ψ(laws, D)
- Ordered terms document/query cliques: O_D = cliques with two or more contiguous query terms + D
  ψ(domestic, adoption, D), ψ(adoption, laws, D), ψ(domestic, adoption, laws, D)
- Unordered terms document/query cliques: U_D = cliques with two or more query terms (in any order) + D
  ψ(domestic, adoption, D), ψ(adoption, laws, D), ψ(domestic, laws, D), ψ(domestic, adoption, laws, D)

21 Clique Sets
- Single term query cliques: T_Q = cliques with one query term
  ψ(domestic), ψ(adoption), ψ(laws)
- Ordered terms query cliques: O_Q = cliques with two or more contiguous query terms
  ψ(domestic, adoption), ψ(adoption, laws), ψ(domestic, adoption, laws)
- Unordered terms query cliques: U_Q = cliques with two or more query terms (in any order)
  ψ(domestic, adoption), ψ(adoption, laws), ψ(domestic, laws), ψ(domestic, adoption, laws)
- Document clique: D = singleton clique with D
  ψ(D)
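To make the definitions concrete, a small Python sketch that enumerates these term groupings for the query 'domestic adoption laws' (the document node D is left implicit; function and variable names are illustrative):

```python
from itertools import combinations

def clique_sets(query_terms):
    """Enumerate the query-term groupings behind the T, O, and U clique sets."""
    n = len(query_terms)
    T = [(t,) for t in query_terms]                      # single terms
    O = [tuple(query_terms[i:j]) for i in range(n)       # contiguous runs >= 2
         for j in range(i + 2, n + 1)]
    U = [c for k in range(2, n + 1)                      # any subset >= 2
         for c in combinations(query_terms, k)]
    return T, O, U

T, O, U = clique_sets(["domestic", "adoption", "laws"])
# T: (domestic,) (adoption,) (laws,)
# O: (domestic, adoption) (adoption, laws) (domestic, adoption, laws)
# U: adds the non-contiguous pair (domestic, laws)
```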

22 Features

23

24

25 Parameter Estimation
Given a set of relevance judgments R, we want the maximum a posteriori estimate of the parameters Λ.
What are P(Λ | R), P(R | Λ), and P(Λ)? That depends on how the model is being evaluated!
We want P(Λ | R) to be peaked around the parameter setting that maximizes the metric we are interested in.
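One simple, hypothetical realization of this idea is a direct search over the simplex of clique-set weights, scoring each setting by the target retrieval metric on training queries. The three-weight parameterization mirrors the T/O/U clique sets above, and `evaluate` is an assumed callback that runs retrieval and returns the metric (e.g. mean average precision):

```python
def grid_search(evaluate, step=0.05):
    """Search the simplex (lambda_T, lambda_O, lambda_U), sum = 1, for the
    weight setting that maximizes the retrieval metric on training data."""
    best, best_score = None, float("-inf")
    steps = int(round(1 / step))
    for i in range(steps + 1):
        for j in range(steps - i + 1):
            lt, lo = i * step, j * step
            lu = max(0.0, 1.0 - lt - lo)     # remainder goes to the third weight
            score = evaluate((lt, lo, lu))
            if score > best_score:
                best, best_score = (lt, lo, lu), score
    return best, best_score
```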

26 Parameter Estimation

27 Evaluation Metric Surfaces

28 Feature Induction

29 Query Expansion

30

31 Query Expansion Example Original query: hubble telescope achievements

32 Query Expansion Example Original query: hubble telescope achievements

33 Applications

34 Application: Ad Hoc Retrieval

35 Ad Hoc Results

36 Application: Web Search

37 Application: XML Retrieval

38 Example XML Document
The Tragedy of Romeo and Juliet
ACT I, PROLOGUE
NARRATOR: Two households, both alike in dignity, / In fair Verona, where we lay our scene, / From ancient grudge break to new mutiny, / Where civil blood makes civil hands unclean. / From forth the fatal loins of these two foes / A pair of star-cross’d lovers take their life; …
SCENE I. Verona. A public place.
SAMPSON: Gregory, o’ my word, we’ll not carry coals. …

39 Content and Structure Queries
NEXI is a query language derived from XPath that allows a mixture of content and structure.
- //scene[about(., poison juliet)]
  Return scene tags that are about poison and juliet.
- //*[about(., plague both houses)]
  Return elements (of any type) about plague both houses.
- //scene[about(., juliet chamber)]//speech[about(.//line, jewel)]
  Return speech tags about jewel where the enclosing scene is about juliet chamber.

40

41 Application: Text Classification

42 Conclusions

