
1 INF 141: Information Retrieval
Discussion Session Week 8 – Winter 2011 TA: Sara Javanmardi

2 Evaluation
Evaluation is key to building effective and efficient search engines.
- Measurement is usually carried out in controlled laboratory experiments; online testing can also be done.
Effectiveness, efficiency, and cost are related.
- e.g., if we want a particular level of effectiveness and efficiency, this will determine the cost of the system configuration.
- Efficiency and cost targets may impact effectiveness.

3 Evaluation: Precision, Recall, NDCG

4 Effectiveness Measures
A is the set of relevant documents, B is the set of retrieved documents.
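The precision and recall equations on this slide were images and did not survive the transcript; the standard set-based definitions in terms of A and B are:

```latex
\text{Precision} = \frac{|A \cap B|}{|B|} \qquad\qquad \text{Recall} = \frac{|A \cap B|}{|A|}
```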

5 Problems
Users look at only the top part of the ranked results, so precision is often measured at a fixed rank p (e.g., p = 10).
Problem: the order of the ranked results within the top p is not taken into account.

6 Discounted Cumulative Gain
Popular measure for evaluating web search and related tasks. Two assumptions:
- Highly relevant documents are more useful than marginally relevant documents.
- The lower the ranked position of a relevant document, the less useful it is for the user, since it is less likely to be examined.

7 Discounted Cumulative Gain
Uses graded relevance as a measure of the usefulness, or gain, from examining a document.
Gain is accumulated starting at the top of the ranking and may be reduced, or discounted, at lower ranks.
Typical discount is 1/log2(rank): with base 2, the discount at rank 4 is 1/2 and at rank 8 it is 1/3.

8 Discounted Cumulative Gain
DCG is the total gain accumulated at a particular rank p.
An alternative formulation, used by some web search companies, puts more emphasis on retrieving highly relevant documents.
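The slide's equations were images; the usual textbook formulations, which reproduce the numbers in the example on the next slide, are:

```latex
DCG_p = rel_1 + \sum_{i=2}^{p} \frac{rel_i}{\log_2 i}
\qquad\text{(alternative)}\qquad
DCG_p = \sum_{i=1}^{p} \frac{2^{rel_i} - 1}{\log_2 (i + 1)}
```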

9 DCG Example
10 ranked documents judged on a 0-3 relevance scale:
3, 2, 3, 0, 0, 1, 2, 2, 3, 0
Discounted gain: 3, 2/1, 3/1.59, 0, 0, 1/2.59, 2/2.81, 2/3, 3/3.17, 0
= 3, 2, 1.89, 0, 0, 0.39, 0.71, 0.67, 0.95, 0
DCG: 3, 5, 6.89, 6.89, 6.89, 7.28, 7.99, 8.66, 9.61, 9.61
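A minimal Python sketch (not part of the slides) that reproduces these cumulative DCG values:

```python
import math

def dcg_at_each_rank(rels):
    """Cumulative DCG: rel_1 at rank 1, then rel_i / log2(i) added at each rank i >= 2."""
    total, values = 0.0, []
    for i, rel in enumerate(rels, start=1):
        total += rel if i == 1 else rel / math.log2(i)
        values.append(round(total, 2))
    return values

print(dcg_at_each_rank([3, 2, 3, 0, 0, 1, 2, 2, 3, 0]))
# [3.0, 5.0, 6.89, 6.89, 6.89, 7.28, 7.99, 8.66, 9.61, 9.61]
```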

10 Normalized DCG
DCG numbers are averaged across a set of queries at specific rank values.
- e.g., DCG at rank 5 is 6.89 and at rank 10 is 9.61.
DCG values are often normalized by comparing the DCG at each rank with the DCG value for the perfect ranking.
- This makes averaging easier for queries with different numbers of relevant documents.

11 NDCG Example
Perfect ranking: 3, 3, 3, 2, 2, 2, 1, 0, 0, 0
Ideal DCG values: 3, 6, 7.89, 8.89, 9.75, 10.52, 10.88, 10.88, 10.88, 10.88
NDCG values (divide actual by ideal): 1, 0.83, 0.87, 0.76, 0.71, 0.69, 0.73, 0.8, 0.88, 0.88
NDCG ≤ 1 at any rank position
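Extending the sketch above (again not from the slides), NDCG is the element-wise ratio of the actual and ideal DCG lists; small differences from the slide's values come from rounding:

```python
import math

def dcg(rels):
    # Cumulative DCG, keeping full precision at each rank.
    total, out = 0.0, []
    for i, rel in enumerate(rels, start=1):
        total += rel if i == 1 else rel / math.log2(i)
        out.append(total)
    return out

actual = dcg([3, 2, 3, 0, 0, 1, 2, 2, 3, 0])
ideal = dcg([3, 3, 3, 2, 2, 2, 1, 0, 0, 0])  # relevance labels sorted in decreasing order
print([round(a / b, 2) for a, b in zip(actual, ideal)])
# [1.0, 0.83, 0.87, 0.78, 0.71, 0.69, 0.73, 0.8, 0.88, 0.88]  (0.78 vs. the slide's 0.76 at rank 4)
```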

12 [Figure: example ranked result list with graded relevance labels (fair, fair, good)]

13 Retrieval Model Overview
- Older models: Boolean retrieval, Vector Space model
- Probabilistic models: BM25, language models
- Combining evidence: inference networks, learning to rank

14 Assignment 5
Read chapter 7.
Cosine Similarity: Creating Word Cloud or Word Proximity

15 Vector Space Model
3-D pictures are useful, but can be misleading for high-dimensional spaces.

16 Vector Space Model
Documents are ranked by the distance between the points representing the query and the documents.
A similarity measure is more common than a distance or dissimilarity measure, e.g. cosine correlation.

17 Similarity Calculation
Consider two documents D1, D2 and a query Q:
D1 = (0.5, 0.8, 0.3), D2 = (0.9, 0.4, 0.2), Q = (1.5, 1.0, 0)
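A small Python sketch (not from the slides) of the cosine correlation for these vectors; the query turns out to be closer to D2:

```python
import math

def cosine(x, y):
    """Cosine correlation: dot product divided by the product of the vector lengths."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))

d1, d2, q = (0.5, 0.8, 0.3), (0.9, 0.4, 0.2), (1.5, 1.0, 0)
print(round(cosine(d1, q), 2), round(cosine(d2, q), 2))  # 0.87 0.97, so D2 is ranked first
```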

18 Term Weights: tf.idf weight
The term frequency (tf) weight measures the importance of a term in a document; the inverse document frequency (idf) weight measures the importance of a term in the collection. Various heuristic modifications exist.
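The weight formulas were images on the original slide; a common textbook form (an assumption about which variant the slide used) is:

```latex
tf_{ik} = \frac{f_{ik}}{\sum_j f_{jk}} \qquad
idf_i = \log \frac{N}{n_i} \qquad
w_{ik} = tf_{ik} \cdot idf_i
```

where f_{ik} is the frequency of term i in document k, N is the number of documents in the collection, and n_i is the number of documents containing term i.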

19 Language Model
Unigram language model:
- a probability distribution over the words in a language
- generation of text consists of pulling words out of a "bucket" according to the probability distribution and replacing them
N-gram language model:
- some applications use bigram and trigram language models, where probabilities depend on previous words

20 LMs for Retrieval
Three possibilities, all models of topical relevance:
- probability of generating the query text from a document language model
- probability of generating the document text from a query language model
- comparing the language models representing the query and document topics

21 Query-Likelihood Model
Rank documents by the probability that the query could be generated by the document model (i.e., on the same topic).
Given a query, start with P(D|Q), apply Bayes' Rule, and assume a uniform prior and a unigram model.
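The derivation shown on the slide is the standard query-likelihood one:

```latex
P(D \mid Q) \propto P(Q \mid D)\, P(D)
% with a uniform prior P(D) and a unigram document model:
P(Q \mid D) = \prod_{i=1}^{n} P(q_i \mid D)
```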

22 Estimating Probabilities
The obvious estimate for the unigram probabilities is the maximum likelihood estimate, which makes the observed value of f_{q_i,D} (the frequency of query word q_i in document D) most likely.
If query words are missing from the document, the score will be zero: missing 1 out of 4 query words gives the same score as missing 3 out of 4.
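The maximum likelihood estimate referred to here (the slide's equation, reconstructed) is:

```latex
P(q_i \mid D) = \frac{f_{q_i, D}}{|D|}
```

where |D| is the number of words in D.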

23 Smoothing
Document texts are a sample from the language model; missing words should not have zero probability of occurring.
Smoothing is a technique for estimating probabilities for missing (or unseen) words:
- lower (or discount) the probability estimates for words that are seen in the document text
- assign that "left-over" probability to the estimates for the words that are not seen in the text

24 Estimating Probabilities
Estimate for unseen words is αD P(qi|C), where P(qi|C) is the probability for query word i in the collection language model for collection C (the background probability) and αD is a parameter.
Estimate for words that do occur is (1 − αD) P(qi|D) + αD P(qi|C).
Different forms of estimation come from different choices of αD.

25 Jelinek-Mercer Smoothing
Here αD is a constant, λ. This gives an estimate of P(qi|D) and a ranking score; logs are used for convenience and to avoid accuracy problems from multiplying many small numbers.
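The estimate and ranking score on the slide (images in the original) take the standard Jelinek-Mercer form, with c_{q_i} the number of occurrences of q_i in the collection and |C| the total number of word occurrences in the collection:

```latex
p(q_i \mid D) = (1 - \lambda)\,\frac{f_{q_i,D}}{|D|} + \lambda\,\frac{c_{q_i}}{|C|}
\qquad
\log P(Q \mid D) = \sum_{i=1}^{n} \log\!\left((1 - \lambda)\,\frac{f_{q_i,D}}{|D|} + \lambda\,\frac{c_{q_i}}{|C|}\right)
```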

26 Where is the tf.idf Weight?
The score is proportional to the term frequency and inversely proportional to the collection frequency.

27 Dirichlet Smoothing
Here αD depends on the document length.
This gives a probability estimate and a document score (see the standard forms below).
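The Dirichlet-smoothed estimate and score (reconstructed in their standard textbook form) are:

```latex
\alpha_D = \frac{\mu}{|d| + \mu}
\qquad
p(q_i \mid D) = \frac{f_{q_i,D} + \mu\,\frac{c_{q_i}}{|C|}}{|d| + \mu}
\qquad
\log P(Q \mid D) = \sum_{i=1}^{n} \log \frac{f_{q_i,D} + \mu\,\frac{c_{q_i}}{|C|}}{|d| + \mu}
```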

28 Query Likelihood Example
For the term "president": fqi,D = 15, cqi = 160,000
For the term "lincoln": fqi,D = 25, cqi = 2,400
Number of word occurrences in the document, |d|, is assumed to be 1,800.
Number of word occurrences in the collection is 10^9 (500,000 documents times an average of 2,000 words).
μ = 2,000
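A small Python sketch (not from the slides) of the Dirichlet-smoothed score for this example; natural logs are assumed, so the value on the next slide may differ slightly with a different log base or rounding:

```python
import math

def dirichlet_log_score(query_stats, doc_len, collection_len, mu=2000):
    """Sum of log Dirichlet-smoothed probabilities: log((f + mu * c/|C|) / (|d| + mu))."""
    score = 0.0
    for f_qd, c_q in query_stats:
        p = (f_qd + mu * c_q / collection_len) / (doc_len + mu)
        score += math.log(p)
    return score

# ("president": f=15, c=160,000) and ("lincoln": f=25, c=2,400)
stats = [(15, 160_000), (25, 2_400)]
print(round(dirichlet_log_score(stats, doc_len=1_800, collection_len=10**9), 2))  # -10.54
```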

29 Query Likelihood Example
The score is a negative number because we are summing the logs of small probabilities.

30 Query Likelihood Example

