Information Retrieval Review

1 Information Retrieval Review
LBSC 796/INFM 718R

2 Structure of IR Systems
IR process model
System architecture
Information needs: visceral, conscious, formalized, compromised
Utility vs. relevance
Known-item vs. ad hoc search

3 Supporting the Search Process
[Diagram of the search process. User side: Source Selection → Query Formulation (Query) → Search against the IR System → Selection from the Ranked List → Examination → Delivery of a Document. System side: Document Acquisition → Collection → Indexing → Index.]

4 Relevance
Relevance relates a topic and a document
  Duplicates are equally relevant, by definition
  Constant over time and across users
Pertinence relates a task and a document
  Accounts for quality, complexity, language, …
Utility relates a user and a document
  Accounts for prior knowledge

5 Taylor’s Model of Question Formation
[Diagram: Taylor's four successive levels of need (Visceral Need → Conscious Need → Formalized Need → Compromised Need, i.e. the Query), with end-user search and intermediated search entering at different levels.]

6 Evidence from Content and Ranked Retrieval
Inverted indexing: postings, postings file
Bag of terms: segmentation, phrases, stemming, stopwords
Boolean retrieval
Vector space ranked retrieval: TF, IDF, length normalization, BM25 (sketch below)
Blind relevance feedback
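A minimal sketch of how BM25 combines TF, IDF, and length normalization, in Python; the toy corpus, the k1 = 1.2 and b = 0.75 parameter values, and the function names are illustrative assumptions rather than anything specified on the slide:

    import math
    from collections import Counter

    # Toy corpus; each document is already tokenized into a bag of terms.
    docs = [
        "the quick brown fox jumped over the lazy dog".split(),
        "now is the time for all good men".split(),
    ]
    N = len(docs)
    avg_len = sum(len(d) for d in docs) / N
    df = Counter(t for d in docs for t in set(d))  # document frequency per term

    def bm25(query, doc, k1=1.2, b=0.75):
        """Okapi BM25 score of one document for a bag-of-terms query."""
        tf = Counter(doc)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log((N - df[term] + 0.5) / (df[term] + 0.5) + 1)
            norm = tf[term] * (k1 + 1) / (tf[term] + k1 * (1 - b + b * len(doc) / avg_len))
            score += idf * norm
        return score

    ranked = sorted(range(N), key=lambda i: bm25(["quick", "fox"], docs[i]), reverse=True)
    print(ranked)  # document 0 should come first for this query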

7 An “Inverted Index”
The term index maps each term to a postings list of the documents (numbered 1 to 8) that contain it:
  aid: 4, 8
  all: 2, 4, 6
  back: 1, 3, 7
  brown: 1, 3, 5, 7
  come: 2, 4, 6, 8
  dog: 3, 5
  fox: 3, 5, 7
  good: 2, 4, 6, 8
  jump: 3
  lazy: 1, 3, 5, 7
  men: 2, 4, 8
  now: 2, 6, 8
  over: 1, 3, 5, 7, 8
  party: 6, 8
  quick: 1, 3
  their: 1, 5, 7
  time: 2, 4, 6
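A minimal sketch, in Python, of how such an index is built and then used for Boolean retrieval; the toy documents and function names are illustrative assumptions, not the slide's actual collection:

    from collections import defaultdict

    docs = {
        1: "the quick brown fox",
        2: "now is the time for all good men",
        3: "the quick brown fox jumped over the lazy dog",
    }

    # Build the inverted index: term -> sorted postings list of document ids.
    index = defaultdict(list)
    for doc_id, text in sorted(docs.items()):
        for term in sorted(set(text.split())):   # bag of terms, duplicates collapsed
            index[term].append(doc_id)

    def boolean_and(*terms):
        """Documents containing every query term (Boolean AND = postings intersection)."""
        postings = [set(index.get(t, [])) for t in terms]
        return sorted(set.intersection(*postings)) if postings else []

    print(index["quick"])               # [1, 3]
    print(boolean_and("quick", "dog"))  # [3]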

8 A Partial Solution: TF*IDF
High TF is evidence of meaning
Low DF is evidence of term importance (equivalently, high “IDF”)
Multiply them to get a “term weight”
Add up the weights for each query term (sketch below)
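A minimal sketch of that scoring scheme in Python, assuming the common log10(N/DF) form of IDF and raw term counts for TF; the slide does not fix particular formulas, so those choices and the toy documents are assumptions:

    import math
    from collections import Counter

    docs = [
        "nuclear fallout contaminated siberia".split(),
        "information retrieval is interesting".split(),
        "nuclear information retrieval".split(),
    ]
    N = len(docs)
    df = Counter(t for d in docs for t in set(d))  # document frequency per term

    def tfidf_score(query, doc):
        """Sum of TF * IDF term weights over the query terms present in the document."""
        tf = Counter(doc)
        return sum(tf[t] * math.log10(N / df[t]) for t in query if t in tf)

    for i, d in enumerate(docs):
        print(i, round(tfidf_score(["contaminated", "retrieval"], d), 3))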

9 Cosine Normalization Example
[Table: a worked example. TF*IDF weights for the terms complicated, contaminated, fallout, information, interesting, nuclear, retrieval, and siberia in four documents, shown before and after dividing by each document's vector length (1.70, 0.97, 2.67, 0.87). For the query "contaminated retrieval", cosine normalization ranks the documents 2, 4, 1, 3 (compare to 2, 3, 1, 4 without normalization).]
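A minimal sketch of cosine-normalized scoring in Python: divide each document's weight vector by its Euclidean length before taking the dot product with the query. The toy weight vectors below are illustrative assumptions, not the numbers from the table:

    import math

    def cosine_score(query_weights, doc_weights):
        """Dot product of query and document weights, with the document
        vector divided by its Euclidean length."""
        length = math.sqrt(sum(w * w for w in doc_weights.values()))
        if length == 0:
            return 0.0
        return sum(q * doc_weights.get(t, 0.0) for t, q in query_weights.items()) / length

    # Toy TF*IDF weight vectors (term -> weight).
    doc1 = {"contaminated": 0.50, "fallout": 0.63, "nuclear": 0.90}
    doc2 = {"contaminated": 0.13, "retrieval": 0.75}
    query = {"contaminated": 1.0, "retrieval": 1.0}

    for name, d in [("doc1", doc1), ("doc2", doc2)]:
        print(name, round(cosine_score(query, d), 3))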

10 Interaction
Query formulation vs. query by example
Summarization: indicative vs. informative
Clustering
Visualization: projection, starfield, contour maps

11 Evaluation
Criteria: effectiveness, efficiency, usability
Measures of effectiveness: recall, precision, F-measure, mean average precision
User studies

12 Set-Based Effectiveness Measures
Precision: how much of what was found is relevant?
  Often of interest, particularly for interactive searching
Recall: how much of what is relevant was found?
  Particularly important for law, patents, and medicine
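A minimal sketch of these set-based measures in Python, plus the F-measure mentioned on the evaluation slide; the retrieved and relevant sets below are made-up examples:

    def precision(retrieved, relevant):
        """Fraction of the retrieved documents that are relevant."""
        return len(retrieved & relevant) / len(retrieved) if retrieved else 0.0

    def recall(retrieved, relevant):
        """Fraction of the relevant documents that were retrieved."""
        return len(retrieved & relevant) / len(relevant) if relevant else 0.0

    def f_measure(retrieved, relevant):
        """Balanced F: harmonic mean of precision and recall."""
        p, r = precision(retrieved, relevant), recall(retrieved, relevant)
        return 2 * p * r / (p + r) if p + r else 0.0

    retrieved = {1, 2, 3, 4, 5}
    relevant = {2, 4, 6, 8}
    print(precision(retrieved, relevant))  # 0.4
    print(recall(retrieved, relevant))     # 0.5
    print(f_measure(retrieved, relevant))  # 0.444...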

13 Accuracy and exhaustiveness
[Diagram: the space of all documents divided into four regions: relevant and retrieved, relevant but not retrieved, retrieved but not relevant, and neither relevant nor retrieved.]

14 Mean Average Precision
Average of precision at each retrieved relevant document
Relevant documents not retrieved contribute zero to the score
Worked example, with relevant documents retrieved at ranks 1, 5, 6, 8, 11, and 16:
  Precision at hits 1-10: 1/1, 1/2, 1/3, 1/4, 2/5, 3/6, 3/7, 4/8, 4/9, 4/10
  Precision at hits 11-20: 5/11, 5/12, 5/13, 5/14, 5/15, 6/16, 6/17, 6/18, 6/19, 6/20
  Assume a total of 14 relevant documents: the 8 relevant documents not retrieved contribute eight zeros
  MAP = (1/1 + 2/5 + 3/6 + 4/8 + 5/11 + 6/16) / 14 = .2307
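A minimal sketch of that computation in Python for a single ranked list (MAP proper averages this value over a set of queries); the rank positions come from the worked example above:

    def average_precision(relevant_at_rank, total_relevant):
        """Average of precision at each retrieved relevant document;
        relevant documents never retrieved contribute zero."""
        hits, precisions = 0, []
        for rank, is_relevant in enumerate(relevant_at_rank, start=1):
            if is_relevant:
                hits += 1
                precisions.append(hits / rank)
        return sum(precisions) / total_relevant

    # Relevant documents at ranks 1, 5, 6, 8, 11, and 16; 14 relevant documents in total.
    ranked = [r in {1, 5, 6, 8, 11, 16} for r in range(1, 21)]
    print(round(average_precision(ranked, 14), 4))  # 0.2307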

15 Blair and Maron (1985)
A classic study of retrieval effectiveness
  Earlier studies used unrealistically small collections
Studied an archive of documents assembled for a lawsuit
  40,000 documents, ~350,000 pages of text
  40 different queries
  Used IBM's STAIRS full-text retrieval system
Approach
  The lawyers wanted at least 75% of all relevant documents
  Precision and recall were evaluated only after the lawyers were satisfied with the results
David C. Blair and M. E. Maron (1985). An Evaluation of Retrieval Effectiveness for a Full-Text Document-Retrieval System. Communications of the ACM, 28(3).

16 Blair and Maron’s Results
Mean precision: 79%
Mean recall: 20% (!!)
Why was recall so low?
  Users can't anticipate the terms used in relevant documents
    Differing technical terminology
    Slang, misspellings
    e.g., an "accident" might be referred to as an "event", "incident", "situation", "problem", …
Other findings
  Searches by both lawyers had similar performance
  The lawyers' recall was not much different from the paralegals'

17 Web Search
Crawling
PageRank (sketch below)
Anchor text
Deep Web, i.e., database-generated content
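Since PageRank is the one item here with a concrete algorithm behind it, a minimal power-iteration sketch in Python follows; the tiny link graph, the 0.85 damping factor, and the iteration count are illustrative assumptions:

    # Link graph: page -> pages it links to (a tiny made-up web with no dangling pages).
    links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    pages = list(links)
    damping, n = 0.85, len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(50):  # power iteration; 50 rounds is plenty for a graph this small
        rank = {
            p: (1 - damping) / n
               + damping * sum(rank[q] / len(links[q]) for q in pages if p in links[q])
            for p in pages
        }

    for p in sorted(rank, key=rank.get, reverse=True):
        print(p, round(rank[p], 3))  # "c" and "a" end up with the highest ranks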

18 Evidence from Behavior
Implicit feedback
Privacy risks
Recommender systems (sketch below)
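To make the recommender-systems item concrete, here is a minimal user-based collaborative-filtering sketch in Python, using cosine similarity between users' rating vectors; the ratings, user names, and the prediction rule are illustrative assumptions, since the slide names the topic but not an algorithm:

    import math

    # Behavioral evidence: user -> {item: rating} (clicks, dwell time, or explicit ratings).
    ratings = {
        "alice": {"doc1": 5, "doc2": 3, "doc3": 4},
        "bob":   {"doc1": 4, "doc2": 3, "doc4": 5},
        "carol": {"doc2": 1, "doc3": 2, "doc4": 4},
    }

    def cosine(u, v):
        """Cosine similarity between two users' rating vectors."""
        num = sum(u[i] * v[i] for i in set(u) & set(v))
        den = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
        return num / den if den else 0.0

    def predict(user, item):
        """Similarity-weighted average of other users' ratings for the item."""
        others = [(cosine(ratings[user], r), r[item])
                  for u, r in ratings.items() if u != user and item in r]
        total = sum(s for s, _ in others)
        return sum(s * v for s, v in others) / total if total else 0.0

    print(round(predict("alice", "doc4"), 2))  # alice has not rated doc4 yet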

19 Evidence from Metadata
Standards, e.g., Dublin Core
Controlled vocabulary
Text classification (sketch below)
Information extraction
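As a concrete instance of text classification, here is a minimal multinomial naive Bayes sketch in Python with add-one smoothing; the tiny labeled collection and the choice of naive Bayes are illustrative assumptions, since the slide does not name a particular classifier:

    import math
    from collections import Counter, defaultdict

    # Tiny labeled collection: (category, text); labels and texts are made up.
    training = [
        ("science", "nuclear fallout radiation physics"),
        ("science", "physics experiment radiation"),
        ("law",     "court ruling lawsuit evidence"),
        ("law",     "lawsuit evidence ruling appeal"),
    ]

    class_docs = Counter(label for label, _ in training)
    term_counts = defaultdict(Counter)
    for label, text in training:
        term_counts[label].update(text.split())
    vocab = {t for counts in term_counts.values() for t in counts}

    def classify(text):
        """Pick the class with the highest log prior + log likelihood."""
        def log_score(label):
            counts, total = term_counts[label], sum(term_counts[label].values())
            prior = math.log(class_docs[label] / len(training))
            return prior + sum(math.log((counts[t] + 1) / (total + len(vocab)))
                               for t in text.split())
        return max(class_docs, key=log_score)

    print(classify("radiation physics evidence"))  # expected: science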

20 Filtering
Retrieval: information needs differ for a stable collection
Filtering: the collection differs for stable information needs

21 Multimedia IR
Image retrieval: color histograms (sketch below)
Video: motion detection (camera, object)
Video: shot structure (boundary detection, classification)
Video: OCR (closed caption, on-screen caption, scene text)
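To make the color-histogram idea concrete, here is a minimal Python sketch that quantizes RGB pixels into coarse bins and compares two images by histogram intersection; the pixel data, the 4-levels-per-channel quantization, and histogram intersection as the similarity measure are all illustrative assumptions:

    from collections import Counter

    def color_histogram(pixels, levels=4):
        """Quantize each (r, g, b) pixel to `levels` bins per channel and
        return a histogram over the quantized colors, normalized to sum to 1."""
        step = 256 // levels
        counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
        total = sum(counts.values())
        return {color: c / total for color, c in counts.items()}

    def histogram_intersection(h1, h2):
        """Similarity in [0, 1]: sum over bins of the smaller of the two values."""
        return sum(min(h1.get(b, 0.0), h2.get(b, 0.0)) for b in set(h1) | set(h2))

    # Two tiny "images" given as lists of RGB pixel tuples.
    reddish = [(250, 10, 10), (240, 30, 20), (255, 0, 0), (230, 40, 10)]
    bluish = [(10, 10, 250), (20, 30, 240), (0, 0, 255), (30, 20, 230)]
    print(histogram_intersection(color_histogram(reddish), color_histogram(bluish)))   # 0.0
    print(histogram_intersection(color_histogram(reddish), color_histogram(reddish)))  # 1.0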

