Tutorial#3.

1 Tutorial#3

2 Retrieval models
Retrieval models match a query with documents in order to:
  separate documents into relevant and non-relevant classes
  rank the documents according to their relevance
Retrieval models include:
  Boolean model
  Vector space model (VSM)
  Probabilistic models

3 Boolean model
The Boolean model is the most common exact-match model.
Queries are logic expressions with document features as operands.
In the pure Boolean model, retrieved documents are not ranked.

4 Example
{D7} OR ({D1,D2,D5} AND {D2,D4,D5,D6,D8})
= {D7} OR {D2,D5}
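The evaluation in this example can be reproduced with ordinary set operations. The sketch below assumes three hypothetical index terms, t1, t2 and t3, whose posting sets are the three document sets shown on the slide (the actual term names are not preserved in the transcript), and it assumes AND is evaluated before OR, as the intermediate result suggests.

```python
# Posting sets taken from the example. The term names t1..t3 are
# hypothetical placeholders; the slide does not preserve the real terms.
postings = {
    "t1": {"D7"},
    "t2": {"D1", "D2", "D5"},
    "t3": {"D2", "D4", "D5", "D6", "D8"},
}

def boolean_and(a, b):
    """Documents that contain both features (set intersection)."""
    return a & b

def boolean_or(a, b):
    """Documents that contain at least one feature (set union)."""
    return a | b

# Query: t1 OR (t2 AND t3)
both = boolean_and(postings["t2"], postings["t3"])
result = boolean_or(postings["t1"], both)

print(sorted(both))    # ['D2', 'D5']
print(sorted(result))  # ['D2', 'D5', 'D7']
```

The result is an unordered set, which matches the point that a pure Boolean model does not rank what it retrieves.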

7 Vector space model (VSM)
Documents and queries are represented as vectors:
  dj = (w1,j, w2,j, ..., wt,j)
  q = (w1,q, w2,q, ..., wt,q)
Each dimension corresponds to a separate term. If a term occurs in the document, its value in the vector is non-zero.

8 [Figure: two matrices over keywords K0-K5, one with rows for queries Q0-Q4 and one with rows for documents D0-D4; cell values not recoverable]

9 Vector space model (VSM)
Several different ways of computing these values, also known as (term) weights, have been developed. One of the best known schemes is (tf-idf) weighting:

10 (tf-idf) weighting
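The formula on this slide did not survive in the transcript. As a rough illustration only, the sketch below uses one common variant, w = tf * log(N / df), with raw term counts and the natural logarithm; the tutorial's exact variant may differ. The function name `tf_idf` and the counts in the usage lines are illustrative assumptions, not values from the slides.

```python
import math

def tf_idf(term_counts, doc_freq, n_docs):
    """Weight each term of one document as tf * log(N / df).

    term_counts: term -> number of occurrences in this document (tf)
    doc_freq:    term -> number of documents containing the term (df)
    n_docs:      total number of documents in the collection (N)
    """
    weights = {}
    for term, tf in term_counts.items():
        weights[term] = tf * math.log(n_docs / doc_freq[term])
    return weights

# Illustrative (assumed) numbers: a 5-document collection in which
# 'recipe' is common (4 documents) and 'bak' and 'bread' are rarer (2 each).
counts = {"bak": 1, "recipe": 1, "bread": 1}
doc_freq = {"bak": 2, "recipe": 4, "bread": 2}
weights = tf_idf(counts, doc_freq, n_docs=5)

for term, w in weights.items():
    print(f"{term:<7} {w:.3f}")
```

With these numbers the common term 'recipe' gets a lower weight than 'bak' or 'bread', which is the intended effect of the idf factor.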

11 Vector space model (VSM)

12 Example documents:
D0: 'How to Bake Bread Without Recipes'
D1: 'The Classic Art of Viennese Pastry'
D2: 'Numerical Recipes: The Art of Scientific Computing'
D3: 'Breads, Pastries, Pies and Cakes : Quantity Baking Recipes'
D4: 'Pastry: A Book of Best French Recipe'
Keywords: ['bak', 'recipe', 'bread', 'cake', 'pastr', 'pie']

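13 will generate a matrix 6 terms x 5 documents
(cell values rebuilt from the example titles and keyword stems; 1 = the term occurs in the title)
          D0  D1  D2  D3  D4
'bak'      1   0   0   1   0
'recipe'   1   0   1   1   1
'bread'    1   0   0   1   0
'cake'     0   0   0   1   0
'pastr'    0   1   0   1   1
'pie'      0   0   0   1   0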
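The 6 x 5 matrix for the example titles can be derived mechanically. The sketch below assumes that a keyword stem matches a title when some lowercased word in the title starts with that stem, and that entries are binary (0/1); the helper name `term_document_matrix` is hypothetical, and this is not the tutorial's own code.

```python
import re

docs = [
    "How to Bake Bread Without Recipes",                           # D0
    "The Classic Art of Viennese Pastry",                          # D1
    "Numerical Recipes: The Art of Scientific Computing",          # D2
    "Breads, Pastries, Pies and Cakes : Quantity Baking Recipes",  # D3
    "Pastry: A Book of Best French Recipe",                        # D4
]
keywords = ["bak", "recipe", "bread", "cake", "pastr", "pie"]

def term_document_matrix(docs, keywords):
    """Return one row per keyword and one 0/1 column per document."""
    # Split each title into lowercase alphabetic words.
    tokenized = [re.findall(r"[a-z]+", d.lower()) for d in docs]
    return [
        [int(any(tok.startswith(k) for tok in toks)) for toks in tokenized]
        for k in keywords
    ]

tdm = term_document_matrix(docs, keywords)
for k, row in zip(keywords, tdm):
    print(f"{k:<7}", row)
```

Prefix matching on whole words is used instead of plain substring search so that a stem cannot match in the middle of an unrelated word.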

14 Query: "baking bread"
will generate a matrix 6 terms x 5 documents
[Table: terms 'bak', 'recipe', 'bread', 'cake', 'pastr', 'pie' against documents D0-D4; cell values not recoverable]
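To illustrate how the query "baking bread" could be matched against the five example documents, the sketch below ranks them by cosine similarity. Two assumptions are made here that the transcript does not confirm: weights are binary (1 if the keyword stem occurs in the title), and cosine similarity is the ranking measure. The document vectors are derived from the example titles in the keyword order 'bak', 'recipe', 'bread', 'cake', 'pastr', 'pie'.

```python
import math

# Binary document vectors derived from the example titles (assumed weighting).
# Component order: bak, recipe, bread, cake, pastr, pie
doc_vectors = {
    "D0": [1, 1, 1, 0, 0, 0],
    "D1": [0, 0, 0, 0, 1, 0],
    "D2": [0, 1, 0, 0, 0, 0],
    "D3": [1, 1, 1, 1, 1, 1],
    "D4": [0, 1, 0, 0, 1, 0],
}

# "baking bread" matches the stems 'bak' and 'bread'.
query = [1, 0, 1, 0, 0, 0]

def cosine(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

scores = {name: cosine(vec, query) for name, vec in doc_vectors.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

for name in ranking:
    print(name, round(scores[name], 3))
```

Under these assumptions D0 ranks first and D3 second: both contain the two query terms, but D3 also contains every other keyword, so its vector is longer and its cosine with the query is smaller. D1, D2 and D4 share no term with the query and score 0.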

15 VSM Implementation
VSMranker.java ranks documents for a query.
It provides functions for developing different user interfaces.
Stand-alone usage needs document and query TDMs (term-document matrices):
  java -cp ../java VSMranker cacm.tdm query.tdm 7
This retrieves the top 7 documents for the CACM queries.

16 Ex#3 (solve in tutorial time)

