(C) 2003, The University of Michigan. Information Retrieval, Handout #10, April 7, 2003.

1 Information Retrieval
Handout #10
April 7, 2003

2 Course Information
Instructor: Dragomir R. Radev (radev@si.umich.edu)
Office: 3080, West Hall Connector
Phone: (734) 615-5225
Office hours: M&F 11-12
Course page: http://tangra.si.umich.edu/~radev/650/
Class meets on Mondays, 1-4 PM in 409 West Hall

3 Schedule
Final projects due 04/11
Final project presentations 04/14
Final exam 04/21: 2-3 essay questions, 2-3 problems

4 Language modeling

5 The problem
In what order should documents be shown to the user?

6 Probabilistic retrieval
[Robertson and Sparck Jones 1976]
Given: query q and document d, estimate the probability that the user will find the document relevant.
Assumption: the relevance depends only on the query and document representations.

7 Probabilistic retrieval (cont’d)
$R$ = preferred set
$w_i$ = index variables
If $P(R \mid d_j) > P(\bar{R} \mid d_j)$, where $\bar{R}$ is the complement of $R$, then the document is relevant.
$P(d_j \mid R)$: prob. of randomly selecting $d_j$ from $R$
Terms that are the same for every document can be ignored when ranking.
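The two annotations on this slide refer to terms of a Bayes-rule expansion of the ranking ratio. A sketch of that step, assuming the usual binary independence formulation (the symbol $\bar{R}$ for the non-relevant set and the name $\mathrm{sim}$ are assumptions, not taken from the slide):

```latex
\[
\mathrm{sim}(d_j, q)
  = \frac{P(R \mid d_j)}{P(\bar{R} \mid d_j)}
  = \frac{P(d_j \mid R)\, P(R)}{P(d_j \mid \bar{R})\, P(\bar{R})}
  \sim \frac{P(d_j \mid R)}{P(d_j \mid \bar{R})}
\]
```

Under this reading, $P(d_j \mid R)$ is the probability of randomly selecting $d_j$ from $R$, and the ratio $P(R)/P(\bar{R})$ is identical for every document, so dropping it does not change the ranking.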

8 Probabilistic retrieval (cont’d)
Initial guess:
Similarity:
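A minimal sketch of how an initial guess and the resulting similarity could be computed, assuming the standard binary independence estimates: P(term | R) = 0.5 for every term and P(term | not R) = n_i / N, where n_i is the number of documents containing the term and N is the collection size. These estimates, the function names, and the toy collection are assumptions for illustration and are not taken from the slide.

```python
import math

def initial_term_weight(n_i, N, p_rel=0.5):
    """Log-odds weight of one term under the assumed initial guess.

    p_rel   = assumed P(term | R), 0.5 for every term
    n_i / N = assumed P(term | not R)
    """
    p_nonrel = n_i / N
    return math.log(p_rel / (1 - p_rel)) + math.log((1 - p_nonrel) / p_nonrel)

def bir_similarity(query_terms, doc_terms, doc_freq, N):
    """Sum the weights of terms present in both the query and the document."""
    score = 0.0
    for term in set(query_terms) & set(doc_terms):
        n_i = doc_freq.get(term, 0)
        if 0 < n_i < N:  # skip terms whose weight would be undefined
            score += initial_term_weight(n_i, N)
    return score

def rank(query_terms, docs):
    """Return document ids sorted by decreasing similarity."""
    N = len(docs)
    doc_freq = {}
    for terms in docs.values():
        for term in set(terms):
            doc_freq[term] = doc_freq.get(term, 0) + 1
    scores = {d: bir_similarity(query_terms, t, doc_freq, N)
              for d, t in docs.items()}
    return sorted(scores, key=lambda d: scores[d], reverse=True)

# Hypothetical toy collection
docs = {
    "d1": ["new", "york", "city"],
    "d2": ["new", "zealand", "travel"],
    "d3": ["spain", "madrid"],
    "d4": ["madrid", "travel"],
}
ranking = rank(["new", "york"], docs)
```

With p_rel = 0.5 the first logarithm is zero, so the initial weight reduces to an IDF-like quantity: rare terms such as "york" contribute more than terms that occur in half of the collection.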

9 Language models
Each document generates a probability distribution.
Determine whether the query is from the same distribution as the document.
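One common way to make this idea concrete is query-likelihood scoring: estimate a unigram distribution from each document and rank documents by the probability that distribution assigns to the query. The sketch below assumes a unigram model with linear interpolation against collection statistics; the smoothing method, the weight `lam`, the function name, and the toy documents are all assumptions, not taken from the slide.

```python
import math
from collections import Counter

def query_log_likelihood(query, doc, collection_counts, collection_len, lam=0.5):
    """log P(query | document model) for a smoothed unigram model."""
    doc_counts = Counter(doc)
    doc_len = len(doc)
    total = 0.0
    for term in query:
        p_doc = doc_counts[term] / doc_len if doc_len else 0.0
        p_coll = collection_counts[term] / collection_len
        # Interpolate document and collection estimates (assumed smoothing)
        p = lam * p_doc + (1 - lam) * p_coll
        if p == 0.0:
            return float("-inf")  # term unseen in the whole collection
        total += math.log(p)
    return total

# Hypothetical toy documents
docs = {
    "d1": "new york city new york state".split(),
    "d2": "madrid is the capital of spain".split(),
}
collection_counts = Counter(t for d in docs.values() for t in d)
collection_len = sum(collection_counts.values())

query = ["new", "york"]
lm_scores = {
    name: query_log_likelihood(query, doc, collection_counts, collection_len)
    for name, doc in docs.items()
}
best = max(lm_scores, key=lm_scores.get)
```

Smoothing keeps a document from receiving zero probability just because it lacks one query term, which is why d2 still gets a finite (but lower) score here.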

10 Aspect models
Hofmann
Find diverse answers: e.g., queries about New Zealand from many perspectives

11 The Lemur system
http://www-2.cs.cmu.edu/~lemur/
/clair4/projects/lemur-2.01/lemur/bin/ParseInQueryOp test2.sparam test2.squery
/clair4/projects/lemur-2.01/lemur/bin/RetEval test2.param

12 The Lemur system

more test2.squery
#q1=#od2(New York);
#q2=#od5(Spain Madrid);

more test2.sparam
outputFile = test2.query; /* result file */

more test2.query
#ODN 2 LPAREN new york RPAREN
#ODN 5 LPAREN spain madrid RPAREN

more test2.out
1 AP891002-0295 6.14821
1 AP901009-0232 6.00223
1 LA072089-0199 5.96404
1 AP880705-0008 5.92714
1 AP880630-0272 5.92714
1 AP891010-0224 5.92515
1 AP881011-0236 5.91442
1 AP891010-0215 5.90016
1 WSJ870320-0138 5.84612
1 SJMN91-06304154 5.83062
1 AP901009-0237 5.83029
1 AP881011-0242 5.80985
1 WSJ890929-0083 5.79302
1 AP880922-0222 5.76874
1 WSJ900814-0094 5.75493
1 WSJ900725-0049 5.75351
1 WSJ870129-0039 5.72683
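Each entry in the test2.out listing has three whitespace-separated fields. A minimal sketch of reading such entries, assuming the fields are query id, document id, and retrieval score (an interpretation of the listing, not something the slide states); `parse_results` is a hypothetical helper, not part of Lemur.

```python
def parse_results(text):
    """Parse lines of the form '<query id> <document id> <score>'."""
    results = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        query_id, doc_id, score = parts
        results.append((query_id, doc_id, float(score)))
    return results

# First three entries copied from the listing
sample = """\
1 AP891002-0295 6.14821
1 AP901009-0232 6.00223
1 LA072089-0199 5.96404
"""
top = parse_results(sample)
```

Keeping the query id as a string avoids assuming that ids are always numeric.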

