
1 Integrating term dependencies according to their utility. Jian-Yun Nie, University of Montreal

2 Need for term dependency
– The meaning of a term often depends on other terms used in the same context (term dependency), e.g. computer architecture, hot dog, …
– A unigram model is unable to capture term dependency: hot + dog ≠ "hot dog"
– Dependency: a group of terms (here, a pair of terms)

3 Previous approaches: phrase + unigram
– Two representations: a phrase model and a unigram model
– Interpolation (each model with a fixed weight)
– Assumption: phrases represent useful dependencies between terms for IR
– E.g. Q = the price of hot dog
  P_unigram: price, hot, dog
  P_phrase: price, hot_dog
  P(price hot dog|D) = λ P_phrase(price hot dog|D) + (1 − λ) P_unigram(price hot dog|D)
  or score = λ score_phrase + (1 − λ) score_unigram (as sketched below)
– Effect: documents containing the phrase "hot dog" receive a higher score
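To make the fixed-weight interpolation concrete, here is a minimal Python sketch with a hypothetical λ = 0.3; the smoothed estimators are placeholders for illustration, not the exact models used in the literature:

```python
import math

def p_unigram(term, doc_tokens, mu=0.5):
    # Additively smoothed unigram probability (placeholder estimator)
    return (doc_tokens.count(term) + mu) / (len(doc_tokens) + mu * 1000)

def p_phrase(phrase, doc_tokens, mu=0.5):
    # Probability of the exact two-word phrase occurring adjacently
    a, b = phrase
    n = sum(1 for i in range(len(doc_tokens) - 1)
            if doc_tokens[i] == a and doc_tokens[i + 1] == b)
    return (n + mu) / (max(len(doc_tokens) - 1, 1) + mu * 1000)

def interpolated_score(terms, phrases, doc_tokens, lam=0.3):
    # score = lam * score_phrase + (1 - lam) * score_unigram,
    # with lam fixed for every query -- the limitation discussed below
    uni = sum(math.log(p_unigram(t, doc_tokens)) for t in terms)
    phr = sum(math.log(p_phrase(p, doc_tokens)) for p in phrases)
    return lam * phr + (1 - lam) * uni

doc = "the price of a hot dog rose as hot dog stands multiplied".split()
print(interpolated_score(["price", "hot", "dog"], [("hot", "dog")], doc))
```

A document that contains the adjacent phrase "hot dog" gets a higher phrase-model score, hence a higher combined score.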

4 Dependency model
– Dependency language model (Gao et al. 2005)
– Determine the strongest dependencies among query terms (a parsing process), e.g. price hot dog, with the strongest link between "hot" and "dog"
– The determined dependencies define an additional requirement on documents:
  Documents have to contain the unigrams
  Documents have to contain the required dependencies
– The two criteria are linearly interpolated

5 Markov Random Field (MRF) (Metzler & Croft)
– Two variants, sequential and full, each defined by potential functions over cliques of query terms
– Sequential model: interpolation of a unigram model, an ordered bigram model and an unordered bigram (window) model
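For reference, the sequential dependence variant is commonly written as the ranking function below (standard form; f_T, f_O and f_U are smoothed log estimates for single terms, exact ordered bigrams, and unordered co-occurrence within a window, and the λ weights are fixed across all queries):

```latex
\mathrm{score}(Q,D) = \lambda_T \sum_{q \in Q} f_T(q,D)
  + \lambda_O \sum_{i=1}^{|Q|-1} f_O(q_i, q_{i+1}, D)
  + \lambda_U \sum_{i=1}^{|Q|-1} f_U(q_i, q_{i+1}, D),
\qquad \lambda_T + \lambda_O + \lambda_U = 1
```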

6 Limitations
– The importance of a (type of) dependency is fixed in the combined model, in the same way for all queries: a fixed weight is assigned to each component model
  price-dog is treated as equally important as hot-dog (dependency model)
  price-hot is treated as equally important as hot-dog (MRF, ordered model)
– Are they equally strong dependencies? No: hot-dog > price-dog, price-hot
– Intuition: a stronger dependency forms a stronger constraint

7 Limitations
– Can a phrase model solve this problem?
  Some phrases form a semantically stronger dependency than others: hot-dog > cute-dog; Sony digital-camera > Sony-digital camera, Sony-camera digital
– Is a semantically stronger dependency more useful for IR? Not necessarily: digital-camera could be less useful than Sony-camera
– The importance of a dependency in IR depends on its usefulness for retrieving better documents.

8 Limitations
– MRF sequential model
  Only considers consecutive pairs of terms; no dependency between distant terms
  Sony digital camera: Sony-digital and digital-camera are captured, but not Sony-camera
– Full model
  Can cover long-distance dependencies, but with a large increase in complexity

9 Proximity: more flexible dependency
– Tao & Zhai, 2007; Zhao & Yun, 2009
– Prox_B(w_i): proximity centrality, the min/average/sum distance from w_i to the other query terms
– However, the interpolation weight λ is still fixed.
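A minimal sketch of proximity centrality, assuming term positions are token offsets; the helper names are illustrative, and the aggregator can be min, average, or sum as on the slide:

```python
def positions(term, doc_tokens):
    # All token offsets at which `term` occurs
    return [i for i, t in enumerate(doc_tokens) if t == term]

def min_dist(a, b, doc_tokens):
    # Smallest token distance between any occurrence of a and any of b
    pa, pb = positions(a, doc_tokens), positions(b, doc_tokens)
    return min(abs(i - j) for i in pa for j in pb) if pa and pb else float("inf")

def proximity_centrality(term, query_terms, doc_tokens, agg=min):
    # Prox(w_i): aggregate distance from w_i to the other query terms
    dists = [min_dist(term, q, doc_tokens) for q in query_terms if q != term]
    return agg(dists) if dists else 0.0

doc = "the price of a hot dog".split()
print(proximity_centrality("hot", ["price", "hot", "dog"], doc))  # 1: "dog" is adjacent
```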

10 A recent extension to the MRF model
– Bendersky, Metzler, Croft, 2010: weighted dependencies
– w_j^uni and w_j^bi: the importance of different features
– g_j^uni and g_j^bi: the weight of each unigram and bigram according to its utility
– However, f_O and f_U are mixed up, and only dependencies between pairs of adjacent terms are considered
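As commonly presented, this weighted variant replaces the three fixed λ's with parameterized concept weights, roughly of the form:

```latex
\mathrm{score}(Q,D) =
  \sum_{q \in Q} \Big(\sum_j w_j^{uni}\, g_j^{uni}(q)\Big) f_T(q,D)
  + \sum_{i=1}^{|Q|-1} \Big(\sum_j w_j^{bi}\, g_j^{bi}(q_i,q_{i+1})\Big)
    \big[\, f_O(q_i,q_{i+1},D) + f_U(q_i,q_{i+1},D) \,\big]
```

The single bracketed sum over f_O + f_U is what the slide means by "mixed up": one learned weight serves both the ordered and the unordered feature.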

11 Go further
– Use a discriminative model instead of MRF: it can consider dependencies between more distant terms without the exponential growth in complexity
– We only consider pair-wise dependencies
  Assumption: pair-wise dependencies capture the most important part of dependencies
– Consider several types of dependencies between query terms (see the sketch below):
  Ordered bigram
  Unordered pair of terms within some distance (2, 4, 8, 16)
– Dependencies at different distances have different strengths
– Co-occurrence dependency ~ variable proximity
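A minimal sketch of the two dependency feature types, counting exact ordered bigrams and unordered co-occurrences within windows of 2, 4, 8 and 16 tokens (function names are illustrative):

```python
def ordered_bigram_count(a, b, doc_tokens):
    # Exact ordered bigram: `a` immediately followed by `b`
    return sum(1 for i in range(len(doc_tokens) - 1)
               if doc_tokens[i] == a and doc_tokens[i + 1] == b)

def cooccurrence_count(a, b, doc_tokens, window):
    # Unordered pair: occurrences of a and b within `window` tokens of each other
    pa = [i for i, t in enumerate(doc_tokens) if t == a]
    pb = [i for i, t in enumerate(doc_tokens) if t == b]
    return sum(1 for i in pa for j in pb if 0 < abs(i - j) <= window)

doc = "sony launched a digital camera while the camera market for sony grew".split()
print(ordered_bigram_count("digital", "camera", doc))       # 1
for w in (2, 4, 8, 16):                                     # the same pair matches more
    print(w, cooccurrence_count("sony", "camera", doc, w))  # often as the window grows
```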

12 General discriminative model
– Break down each component model to consider the strength/usefulness of each term dependency
– λ_U, λ_B, λ_Cw: importance of a unigram, a bigram, and a co-occurrence pair within distance w in documents
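The transcript does not preserve the model's exact formula, so the sketch below only shows the general shape described on the slide: every unigram, bigram, and co-occurrence pair contributes a smoothed log count scaled by its own λ weight (reusing the counting helpers from the previous sketch):

```python
import math

def score(query_terms, doc_tokens, lam_u, lam_b, lam_c):
    # lam_u[t]: weight of unigram t; lam_b[(a, b)]: weight of bigram (a, b);
    # lam_c[w][(a, b)]: weight of pair (a, b) co-occurring within distance w
    def f(count):
        return math.log(1 + count)  # illustrative feature transform

    s = sum(lam_u.get(t, 1.0) * f(doc_tokens.count(t)) for t in query_terms)
    pairs = [(a, b) for i, a in enumerate(query_terms)
             for b in query_terms[i + 1:]]
    for a, b in pairs:
        s += lam_b.get((a, b), 0.0) * f(ordered_bigram_count(a, b, doc_tokens))
        for w in (2, 4, 8, 16):
            s += lam_c.get(w, {}).get((a, b), 0.0) * f(cooccurrence_count(a, b, doc_tokens, w))
    return s
```

Setting every λ_B and λ_Cw to zero recovers a pure unigram model; raising the weight of a single pair tightens the constraint only where it is useful.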

13 An example
[figure: the query "corporate pension plans funds" with bigram (bi) and co-occurrence (co2, co4, co8; co16 omitted) dependency weights drawn between term pairs]

14 Further development
– Set λ_U to 1 and vary the other weights (λ_B, λ_Cw)
– Features: [feature definitions shown on the original slide]

15 How to determine the usefulness of a bigram and a co-occurrence pair (λ_B and λ_Cw)?
– Using a learning method based on some features
– Cross-validation

16 Learning method
– Parameters and goal: find the parameter values that maximize E
  T_i: training data
  R_i: document ranking produced with the parameters
  E: measure of effectiveness (MAP)
– Training data: {x_i, z_i}, where x_i is a bigram or a pair of terms within distance w, and z_i its best weight value for the query
  The best value is found by coordinate ascent search
– Regressor: epsilon-SVM with a radial basis kernel function (see the sketch below)
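A minimal sketch of the regression step using scikit-learn's epsilon-SVR with an RBF kernel; the feature vectors and target weights below are placeholders, not the paper's actual training data:

```python
import numpy as np
from sklearn.svm import SVR

# One row per bigram / co-occurrence pair; columns are placeholder features
# (e.g. co-occurrence statistics, idf). Targets z are the best per-pair
# weights found by coordinate ascent on MAP over the training queries.
X_train = np.array([[0.80, 0.30, 12.0],
                    [0.10, 0.90,  3.0],
                    [0.50, 0.50,  7.0]])
z_train = np.array([0.59, 0.01, 0.20])

model = SVR(kernel="rbf", epsilon=0.01, C=1.0)
model.fit(X_train, z_train)

# Predict the weight of an unseen dependency pair from its features
print(model.predict(np.array([[0.70, 0.40, 10.0]])))
```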

17 Features

18 Test collections

19 Results with other models

20 With our model

21 Analysis
Some intuitively strong dependencies should not be treated as important in the retrieval process:
– Disk1, query 088: "crude oil price trends"
  Ideal weights (bi, co2,4,8,16) = 0, AP = 0.103
  Learnt bi = 0.2, co2..16 = 0, AP = 0.060
– Disk1, query 003: "joint ventures"
  Ideal weights (bi, co2,4,8,16) = 0, AP = 0.086
  Learnt bi = 0.07, co2..16 = 0, AP = 0.084
– Disk1, query 094: "computer aided crime"
  Ideal weights (bi, co2,4,8,16) = 0, AP = 0.223
  Learnt bi = 0.3, co2..16 = 0, AP = 0.158

22 Analysis
Some intuitively weakly connected words should be treated as strong dependencies:
– Disk1, query 184: "corporate pension plans funds"
  Ideal wt. bi = 0.5, co2 = 0.7, co4 = 0.2, AP = 0.253
  Learnt wt. bi = 0.2, co8 = 0.01, co16 = 0.001, AP = 0.201 (uni = 0.131)
– Disk1, query 115: "impact 1986 immigration law"
  Ideal wt. co2 = 0.1, co4 = 0.35, co8 = 0.05, AP = 0.511
  Learnt wt. bi = 0, co16 = 0.01, AP = 0.492 (uni = 0.437)

23 Disk1, query 115: "impact 1986 immigration law"
Ideal AP = 0.511, uni = 0.437, learnt = 0.492
[figure: ideal bigram (bi) and co-occurrence (co2, co4, co8; co16 omitted) weights drawn between the query terms impact, 1986, immigr., law]

(Learnt)  imp-1986  imp-imm  imp-law  1986-imm  1986-law  imm-law
wt.bi        -         -       .14       -         -         -
wt.co2       -         -        -        -         -        .05
wt.co8       -        .01       -     (remaining cells lost in extraction)
wt.co16      -        .01      .02    (remaining cells lost in extraction)

24 Disk1, query 184: "corporate pension plans funds"
AP: ideal = 0.253, uni = 0.132, learnt = 0.201
[figure: ideal bigram (bi) and co-occurrence (co2, co4, co8; co16 omitted) weights drawn between the query terms corporate, pension, plans, funds]

(Learnt)  corp-pen  corp-plan  corp-fund  pen-plan  pen-fund  plan-fund
wt.bi         -         -          -         .20       .18        -
wt.co2        -        .05         -         .59       .23        -
wt.co8        -        .01         -         .02       .01    (last cell lost in extraction)
wt.co16       -        .02        .04         -        .001   (last cell lost in extraction)

25 Typical case 1: weak bigram dependency, weak co-occurrence dependency

26 Typical case 2: strong dependencies

27 Typical case 3: weak bigram dependency, strong co-occurrence dependency

28 Conclusions
– Different types of dependency between query terms should be considered
– They have variable importance/usefulness for IR, and should be integrated into the IR model with different weights
  These weights do not necessarily correlate with semantic dependency strength
– The new model is better than the existing models in most cases (statistically significant in some cases)

