INF 141: Information Retrieval

INF 141: Information Retrieval Discussion Session Week 8 – Winter 2011 TA: Sara Javanmardi

Evaluation
Evaluation is key to building effective and efficient search engines.
- Measurement is usually carried out in controlled laboratory experiments; online testing can also be done.
Effectiveness, efficiency, and cost are related.
- e.g., if we want a particular level of effectiveness and efficiency, this determines the cost of the system configuration.
- Efficiency and cost targets may impact effectiveness.

Evaluation
- Precision
- Recall
- NDCG

Effectiveness Measures
A is the set of relevant documents, B is the set of retrieved documents (see the definitions below).
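The formula images did not survive the transcript; the standard set-based definitions they refer to, with A the relevant set and B the retrieved set, are:

\[
\text{Precision} = \frac{|A \cap B|}{|B|}, \qquad
\text{Recall} = \frac{|A \cap B|}{|A|}
\]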

Problems
- Users look at only the top part of the ranked results, so precision is often measured at rank p (e.g., p = 10).
- Problem: precision at rank p ignores the order of the results within the top p.
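The slide's formula is missing; stated plainly, precision at rank p is simply the fraction of the top p results that are relevant:

\[
P@p = \frac{|\{\text{relevant documents in the top } p \text{ results}\}|}{p}
\]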

Discounted Cumulative Gain
Popular measure for evaluating web search and related tasks. Two assumptions:
- Highly relevant documents are more useful than marginally relevant documents.
- The lower the ranked position of a relevant document, the less useful it is for the user, since it is less likely to be examined.

Discounted Cumulative Gain
- Uses graded relevance as a measure of the usefulness, or gain, from examining a document.
- Gain is accumulated starting at the top of the ranking and may be reduced, or discounted, at lower ranks.
- Typical discount is 1/log(rank); with base 2, the discount at rank 4 is 1/2, and at rank 8 it is 1/3.

Discounted Cumulative Gain
- DCG is the total gain accumulated at a particular rank p (see the formulas below).
- Alternative formulation: used by some web search companies; it puts more emphasis on retrieving highly relevant documents.
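The formula images are missing from the transcript; the two formulations described are commonly written as follows, where rel_i is the graded relevance of the document at rank i:

\[
DCG_p = rel_1 + \sum_{i=2}^{p} \frac{rel_i}{\log_2 i} \qquad \text{(standard form)}
\]

\[
DCG_p = \sum_{i=1}^{p} \frac{2^{rel_i} - 1}{\log_2 (1 + i)} \qquad \text{(alternative form)}
\]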

DCG Example
10 ranked documents judged on a 0-3 relevance scale:
  3, 2, 3, 0, 0, 1, 2, 2, 3, 0
Discounted gain:
  3, 2/1, 3/1.59, 0, 0, 1/2.59, 2/2.81, 2/3, 3/3.17, 0
  = 3, 2, 1.89, 0, 0, 0.39, 0.71, 0.67, 0.95, 0
DCG (cumulative):
  3, 5, 6.89, 6.89, 6.89, 7.28, 7.99, 8.66, 9.61, 9.61
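A minimal Python sketch of this calculation (mine, not from the original slides); it reproduces the standard-form DCG used in the example above and the normalization described on the next slide:

    import math

    def dcg(relevances):
        """Cumulative DCG at every rank, discounting by 1/log2(rank) for rank >= 2."""
        total, out = 0.0, []
        for i, rel in enumerate(relevances, start=1):
            total += rel if i == 1 else rel / math.log2(i)
            out.append(total)
        return out

    def ndcg(relevances):
        """NDCG at every rank: actual DCG divided by the DCG of the ideal (sorted) ranking."""
        actual = dcg(relevances)
        ideal = dcg(sorted(relevances, reverse=True))
        return [a / i for a, i in zip(actual, ideal)]

    rels = [3, 2, 3, 0, 0, 1, 2, 2, 3, 0]
    print(dcg(rels))   # final value ~9.61, matching the example
    print(ndcg(rels))  # final value ~0.88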

Normalized DCG
- DCG numbers are averaged across a set of queries at specific rank values, e.g., DCG at rank 5 is 6.89 and at rank 10 is 9.61.
- DCG values are often normalized by comparing the DCG at each rank with the DCG value for the perfect ranking; this makes averaging easier for queries with different numbers of relevant documents.

NDCG Example
- Perfect ranking: 3, 3, 3, 2, 2, 2, 1, 0, 0, 0
- Ideal DCG values: 3, 6, 7.89, 8.89, 9.75, 10.52, 10.88, 10.88, 10.88, 10.88
- NDCG values (actual DCG divided by ideal DCG): 1, 0.83, 0.87, 0.76, 0.71, 0.69, 0.73, 0.8, 0.88, 0.88
- NDCG ≤ 1 at any rank position

Retrieval Model Overview
- Older models: Boolean retrieval, Vector Space model
- Probabilistic models: BM25, Language models
- Combining evidence: Inference networks, Learning to Rank

Assignment 5
- Read chapter 7 (slides: http://www.search-engines-book.com/slides/)
- Cosine Similarity: http://www.miislita.com/information-retrieval-tutorial/cosine-similarity-tutorial.html
- Creating Word Cloud: http://worditout.com/ or http://www.wordle.net/
- Word Proximity: http://nlp.stanford.edu/IR-book/html/htmledition/query-term-proximity-1.html

Vector Space Model
3-D pictures are useful, but can be misleading for high-dimensional spaces.

Vector Space Model
- Documents are ranked by the distance between the points representing the query and the documents.
- A similarity measure is more common than a distance or dissimilarity measure, e.g., cosine correlation.

Similarity Calculation
Consider two documents D1, D2 and a query Q:
  D1 = (0.5, 0.8, 0.3), D2 = (0.9, 0.4, 0.2), Q = (1.5, 1.0, 0)
(The cosine computation is sketched below.)
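The worked numbers on the original slide were in an image; here is a minimal Python sketch of the cosine similarity calculation for these vectors (the values in the comments are computed, not copied from the slide):

    import math

    def cosine(x, y):
        """Cosine similarity: dot product divided by the product of the vector lengths."""
        dot = sum(a * b for a, b in zip(x, y))
        return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))

    d1, d2, q = (0.5, 0.8, 0.3), (0.9, 0.4, 0.2), (1.5, 1.0, 0)
    print(cosine(d1, q))  # ~0.87
    print(cosine(d2, q))  # ~0.97, so D2 is ranked above D1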

Term Weights
- tf.idf weight
- Term frequency weight measures importance in the document (see below).
- Inverse document frequency measures importance in the collection (see below).
- Some heuristic modifications exist.
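The formula images are missing; one common tf.idf formulation consistent with these slides is shown below, where f_ik is the frequency of term k in document i, N is the number of documents in the collection, and n_k is the number of documents containing term k. Other variants (e.g., log-scaled term frequency) are also used in practice.

\[
tf_{ik} = \frac{f_{ik}}{\sum_{j} f_{ij}}, \qquad
idf_k = \log \frac{N}{n_k}, \qquad
w_{ik} = tf_{ik} \cdot idf_k
\]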

Language Model
- Unigram language model: a probability distribution over the words in a language; generation of text consists of pulling words out of a "bucket" according to the probability distribution and replacing them.
- N-gram language model: some applications use bigram and trigram language models, where probabilities depend on previous words.
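A minimal sketch (mine, not from the slides) of estimating a unigram model from a piece of text and then sampling from the "bucket" with replacement:

    import random
    from collections import Counter

    def unigram_model(text):
        """Maximum likelihood unigram probabilities: P(w) = count(w) / total words."""
        words = text.lower().split()
        counts = Counter(words)
        total = len(words)
        return {w: c / total for w, c in counts.items()}

    model = unigram_model("to be or not to be that is the question")
    words, probs = zip(*model.items())
    # Generate text by repeatedly drawing words according to the distribution (with replacement).
    print(random.choices(words, weights=probs, k=5))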

LMs for Retrieval
3 possibilities (all are models of topical relevance):
- probability of generating the query text from a document language model
- probability of generating the document text from a query language model
- comparing the language models representing the query and document topics

Query-Likelihood Model
- Rank documents by the probability that the query could be generated by the document model (i.e., on the same topic).
- Given a query, start with P(D|Q); apply Bayes' Rule; assume a uniform prior and a unigram model (see the derivation below).
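The derivation was shown as an image on the slide; the standard query-likelihood derivation it describes is:

\[
P(D \mid Q) \propto P(Q \mid D)\, P(D) \propto P(Q \mid D) = \prod_{i=1}^{n} P(q_i \mid D)
\]

The first step follows from Bayes' rule (dropping P(Q), which is the same for every document), the second assumes a uniform prior P(D), and the last applies the unigram independence assumption over the query words q_1, ..., q_n.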

Estimating Probabilities
- The obvious estimate for unigram probabilities is the maximum likelihood estimate, which makes the observed value of f_{q_i,D} (the frequency of query word q_i in document D) most likely (see below).
- If any query word is missing from the document, the score will be zero: missing 1 out of 4 query words scores the same as missing 3 out of 4.
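The estimate itself was an image; the maximum likelihood estimate being referred to is:

\[
P(q_i \mid D) = \frac{f_{q_i,D}}{|D|}
\]

where f_{q_i,D} is the number of times q_i occurs in D and |D| is the total number of word occurrences in D.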

Smoothing
- Document texts are a sample from the language model: missing words should not have zero probability of occurring.
- Smoothing is a technique for estimating probabilities for missing (or unseen) words:
  - lower (or discount) the probability estimates for words that are seen in the document text
  - assign that "left-over" probability to the estimates for the words that are not seen in the text

Estimating Probabilities
- Estimate for unseen words: α_D P(q_i|C), where P(q_i|C) is the probability of query word q_i in the collection language model for collection C (the background probability) and α_D is a parameter.
- Estimate for words that do occur: (1 − α_D) P(q_i|D) + α_D P(q_i|C).
- Different forms of estimation come from different choices of α_D.

Jelinek-Mercer Smoothing
- α_D is a constant, λ.
- Gives the estimate and ranking score shown below.
- Use logs for convenience and to avoid accuracy problems from multiplying many small numbers.
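The formulas were images; the standard Jelinek-Mercer estimate and the resulting log query-likelihood score they describe are shown below, where c_{q_i} is the number of times q_i occurs in the collection C, and |D| and |C| are the total word counts of the document and the collection:

\[
P(q_i \mid D) = (1 - \lambda)\,\frac{f_{q_i,D}}{|D|} + \lambda\,\frac{c_{q_i}}{|C|}
\]

\[
\log P(Q \mid D) = \sum_{i=1}^{n} \log\!\left( (1 - \lambda)\,\frac{f_{q_i,D}}{|D|} + \lambda\,\frac{c_{q_i}}{|C|} \right)
\]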

Where is the tf.idf Weight?
The ranking score can be rewritten (see below) so that the document-dependent part is proportional to the term frequency and inversely proportional to the collection frequency.
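A sketch of the rewriting, standard in language-modeling treatments (the algebra here is mine, not copied from the slide):

\[
\sum_{i} \log\!\left( (1-\lambda)\frac{f_{q_i,D}}{|D|} + \lambda\frac{c_{q_i}}{|C|} \right)
= \sum_{i} \log\!\left( 1 + \frac{(1-\lambda)\, f_{q_i,D}/|D|}{\lambda\, c_{q_i}/|C|} \right)
+ \sum_{i} \log\!\left( \lambda\,\frac{c_{q_i}}{|C|} \right)
\]

The second sum does not depend on the document, so ranking depends only on the first term, which grows with the term frequency f_{q_i,D} and shrinks with the collection frequency c_{q_i}, mirroring tf.idf.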

Dirichlet Smoothing
- α_D depends on document length.
- Gives the probability estimate and document score shown below.
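The formulas were images; the standard Dirichlet-smoothed estimate and score they refer to are shown below, where μ is a parameter (typically a value such as 2,000):

\[
\alpha_D = \frac{\mu}{|D| + \mu}, \qquad
P(q_i \mid D) = \frac{f_{q_i,D} + \mu\, \dfrac{c_{q_i}}{|C|}}{|D| + \mu}
\]

\[
\log P(Q \mid D) = \sum_{i=1}^{n} \log \frac{f_{q_i,D} + \mu\, c_{q_i}/|C|}{|D| + \mu}
\]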

Query Likelihood Example
- For the term "president": f_{q_i,D} = 15, c_{q_i} = 160,000
- For the term "lincoln": f_{q_i,D} = 25, c_{q_i} = 2,400
- Number of word occurrences in the document: |D| is assumed to be 1,800
- Number of word occurrences in the collection: |C| = 10^9 (500,000 documents times an average of 2,000 words)
- μ = 2,000
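A minimal Python sketch (mine, not from the slides) that plugs these numbers into the Dirichlet-smoothed score above:

    import math

    def dirichlet_score(query_terms, doc_len, coll_len, mu=2000):
        """Sum of log Dirichlet-smoothed probabilities: log((f + mu * c/|C|) / (|D| + mu))."""
        score = 0.0
        for f_qd, c_q in query_terms:
            score += math.log((f_qd + mu * c_q / coll_len) / (doc_len + mu))
        return score

    # (f_{q_i,D}, c_{q_i}) for "president" and "lincoln"
    terms = [(15, 160000), (25, 2400)]
    print(dirichlet_score(terms, doc_len=1800, coll_len=10**9))  # ~ -10.53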

Query Likelihood Example
The score is a negative number because it sums the logs of small probabilities; a less negative score indicates a better match.
