Cumulated Gain-Based Evaluation of IR Techniques


1 Cumulated Gain-Based Evaluation of IR Techniques
Liu bingbing

2 Motivation: There are many different kinds of IR techniques, but which one is better? And how should these techniques be evaluated?

3 Outline: Introduction; Cumulated gain-based measurements; Case study: comparison of some TREC-7 results at different relevance levels; Discussion

4 Outline: Introduction; Cumulated gain-based measurements; Case study: comparison of some TREC-7 results at different relevance levels; Discussion

5 Background: Highly relevant documents should be identified and ranked first. It is necessary to develop measures for evaluating different IR techniques.

6 Old measures: Highly and marginally relevant documents are given equal credit. Documents are judged either relevant or irrelevant. Graded relevance judgments.

7 New measures: CG, DCG, nCG, nDCG

8 Outline: Introduction; Cumulated gain-based measurements; Case study: comparison of some TREC-7 results at different relevance levels; Discussion

9 Principles: Highly relevant documents are more important than marginally relevant ones. Documents found late in the ranking are less important.

10 Relationship [Diagram relating G, CG, DCG, BV, and n(D)CG]

11 Direct Cumulated Gain (CG)
For example: G' = <3, 2, 3, 0, 0, 1, 2, 2, 3, 0, ...> gives CG' = <3, 5, 8, 8, 8, 9, 11, 13, 16, 16, ...>
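The cumulated gain example above can be reproduced with a running sum. This is a minimal sketch; the function name `cumulated_gain` is chosen here for illustration and does not come from the slides.

```python
from itertools import accumulate

def cumulated_gain(gains):
    """Return the CG vector: entry i is the sum of the gains at ranks 1..i."""
    return list(accumulate(gains))

# First ten positions of the example gain vector from the slide.
G = [3, 2, 3, 0, 0, 1, 2, 2, 3, 0]
print(cumulated_gain(G))  # [3, 5, 8, 8, 8, 9, 11, 13, 16, 16]
```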

12 Discounted Cumulated Gain (DCG)
For example: G' = <3, 2, 3, 0, 0, 1, 2, 2, 3, 0, ...> gives DCG' = <3, 5, 6.89, 6.89, 6.89, 7.28, 7.99, 8.66, 9.61, 9.61, ...>
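The slide does not show the discounting formula itself. The sketch below assumes the usual recursive definition (no discount for ranks below the log base b, and each later gain divided by log_b of its rank), with b = 2. Under that assumption it reproduces the DCG values listed above. The function name is hypothetical.

```python
import math

def discounted_cumulated_gain(gains, base=2):
    """Return the DCG vector.

    Assumed definition: ranks below `base` are added undiscounted;
    from rank `base` onward each gain is divided by log_base(rank).
    """
    dcg, total = [], 0.0
    for rank, gain in enumerate(gains, start=1):
        if rank < base:
            total += gain
        else:
            total += gain / math.log(rank, base)
        dcg.append(total)
    return dcg

G = [3, 2, 3, 0, 0, 1, 2, 2, 3, 0]
print([round(x, 2) for x in discounted_cumulated_gain(G)])
# [3.0, 5.0, 6.89, 6.89, 6.89, 7.28, 7.99, 8.66, 9.61, 9.61]
```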

13 Best possible Vectors Theoretically

14 A sample ideal gain vector (BV)
CG' = <3, 6, 9, 11, 13, 15, 16, 17, 18, 19, 19, 19, 19, ...>; DCG' = <3, 6, 7.89, 8.89, 9.75, 10.52, 10.88, 11.21, 11.53, 11.83, 11.83, 11.83, ...> (log base 2)
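An ideal gain vector lists the most relevant documents first. The sketch below assumes a recall base with three documents at level 3, three at level 2, and four at level 1. Those counts are not stated on the slide; they are inferred from the step sizes of the CG vector above. The helper name `ideal_gain_vector` is hypothetical.

```python
from itertools import accumulate

def ideal_gain_vector(level_counts, length):
    """List gains from the highest relevance level down, padded with zeros."""
    gains = []
    for level in sorted(level_counts, reverse=True):
        gains.extend([level] * level_counts[level])
    gains.extend([0] * max(0, length - len(gains)))
    return gains[:length]

# Assumed recall base: counts inferred from the CG steps on the slide.
counts = {3: 3, 2: 3, 1: 4}
ideal = ideal_gain_vector(counts, 13)
print(ideal)                    # [3, 3, 3, 2, 2, 2, 1, 1, 1, 1, 0, 0, 0]
print(list(accumulate(ideal)))  # [3, 6, 9, 11, 13, 15, 16, 17, 18, 19, 19, 19, 19]
```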

15 Relative to the Ideal Measure—the Normalized (D)CG Measure
norm-vect(V, I) = <v1/i1, v2/i2, ..., vk/ik>. For example: nCG = norm-vect(CG, CG_I) and nDCG = norm-vect(DCG, DCG_I), where CG_I and DCG_I are the ideal vectors.
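The normalization is a component-wise division by the ideal vector. As a minimal sketch, the code below pairs the CG example from slide 11 with the first ten positions of the sample ideal CG vector from slide 14. That pairing is an assumption made for illustration, and `norm_vect` is a hypothetical name mirroring the slide's notation.

```python
def norm_vect(v, ideal):
    """Divide each component of v by the matching component of the ideal vector.

    Assumes every ideal component is non-zero, which holds for a cumulated
    vector whenever the top-ranked ideal document has positive gain.
    """
    return [x / i for x, i in zip(v, ideal)]

CG = [3, 5, 8, 8, 8, 9, 11, 13, 16, 16]        # slide 11 example
CG_I = [3, 6, 9, 11, 13, 15, 16, 17, 18, 19]   # slide 14 ideal, first ten ranks
nCG = norm_vect(CG, CG_I)
print([f"{x:.3f}" for x in nCG])
```

A value of 1.0 at a rank means the run has collected as much gain by that rank as the ideal ranking would have.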

16 Comparison to Earlier Measures
Average search length (ASL): estimates the average position of a relevant document. Expected search length (ESL): the average number of documents that must be examined to retrieve a given number of relevant documents. ………………. Each of them either does not take the degree of document relevance into account, or depends on the retrieved list size, or …

17 The strengths of the new measures: CG, DCG, nCG, nDCG
Take the degree of document relevance into account. Do not depend on the size of the recall base. Do not depend on outliers. Are straightforward to interpret.

18 In addition DCG has further advantages
Down-weights gain found later in the ranking. Models user persistence.

19 Outline: Introduction; Cumulated gain-based measurements; Case study: comparison of some TREC-7 results at different relevance levels; Discussion

20 Data source: TREC-7, with 50 queries from topic statements
51800 documents, or 1.9 GB of data. We used result lists for 20 topics by five participants from the TREC-7 ad hoc manual track.

21 Relevance judgments: The new judgments are reliable.
The new judgments are stricter.

22 Cumulated gain (a) Binary weighting (b) Nonbinary weighting

23 Discounting gain

24 Normalized (D)CG Vectors and Statistical Testing

25 Normalized (D)CG Vectors and Statistical Testing

26 About the case study. For example: Ideal = <3, 3, 3, 2, 2, 1, 1, 1, 0, 0>, A = <2, 3, 2, 1, 3, …>
[Two tables listing the gain G for documents D 1 to 10; the gain values were not preserved in the transcript]
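To see how run A compares with the ideal ranking, the sketch below computes nDCG over the first five ranks only, since the rest of A is elided on the slide. It assumes log base 2 and the usual recursive DCG definition (no discount below rank b). The function names are hypothetical.

```python
import math

def dcg(gains, base=2):
    """DCG vector; ranks below `base` are undiscounted (assumed definition)."""
    out, total = [], 0.0
    for rank, gain in enumerate(gains, start=1):
        total += gain if rank < base else gain / math.log(rank, base)
        out.append(total)
    return out

def ndcg(run_gains, ideal_gains, base=2):
    """Component-wise DCG(run) / DCG(ideal) over the shared prefix."""
    return [r / i for r, i in zip(dcg(run_gains, base), dcg(ideal_gains, base))]

ideal = [3, 3, 3, 2, 2, 1, 1, 1, 0, 0]
a_prefix = [2, 3, 2, 1, 3]  # only the five values shown on the slide

for rank, value in enumerate(ndcg(a_prefix, ideal), start=1):
    print(rank, f"{value:.3f}")
```

Because `zip` stops at the shorter input, the result covers ranks 1 to 5 and makes no claim about the elided part of A.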

27 Outline: Introduction; Cumulated gain-based measurements; Case study: comparison of some TREC-7 results at different relevance levels; Discussion

28 Several parameters: last rank considered, gain values, discounting factor
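The three parameters can be made explicit as function arguments. This is a sketch under stated assumptions: the relevance levels are the ten-document example from slide 12, the `steep` gain mapping is purely illustrative and not taken from the slides, and `dcg_at` is a hypothetical name.

```python
import math

def dcg_at(levels, gain_map, base, last_rank):
    """DCG at `last_rank`, with relevance levels mapped to gains by `gain_map`.

    Assumed definition: ranks below `base` are undiscounted; later gains
    are divided by log_base(rank).
    """
    total = 0.0
    for rank, level in enumerate(levels[:last_rank], start=1):
        gain = gain_map[level]
        total += gain if rank < base else gain / math.log(rank, base)
    return total

levels = [3, 2, 3, 0, 0, 1, 2, 2, 3, 0]
linear = {0: 0, 1: 1, 2: 2, 3: 3}
steep = {0: 0, 1: 1, 2: 10, 3: 100}  # illustrative weighting only

print(round(dcg_at(levels, linear, base=2, last_rank=10), 2))  # 9.61
print(dcg_at(levels, linear, base=10, last_rank=10))           # 16.0
print(round(dcg_at(levels, linear, base=2, last_rank=5), 2))   # 6.89
print(dcg_at(levels, steep, base=2, last_rank=2))              # 110.0
```

A larger log base discounts late documents less (with base 10, nothing in the first nine ranks is discounted), which corresponds to a more persistent user. A steeper gain mapping gives highly relevant documents more credit.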

29 Limitations: Do not take order effects on relevance judgments or document overlap into account. Deal with a single dimension of relevance only. Are unable to handle dynamic changes.

30 Benefits: Take the degree of document relevance into account
Model user persistence

