
1 Systematic Evaluation of Swedish IR Systems using a Relevance Judged Document Collection
Leif Grönqvist, GSLT, MSI@VxU, Ling@GU
For the GSLT course: Information Access, 2003
Skövde, Jan 19, 2004

2 Overview
Introduction
Design of testbed:
– Documents
– Topics
– Relevance judgments
Experiment setup
Evaluation
Conclusion

3 Introduction
The classical IR task: find an ordered list of documents relevant to a user query
Difficult to evaluate:
– Relevance is subjective
– Different depending on context
– But very important!
Test collections for Swedish are not very common: CLEF, ...?

4 Why Swedish?
Very different from English:
– Compounds without spaces
– "New" letters (åäö)
– Complex morphology
– Different tokenization
– Other stop words
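
To make the tokenization point concrete, here is a minimal Python sketch (the example sentence is invented, not taken from the collection) of how an ASCII-only pattern mangles Swedish compounds and the letters åäö, while a Unicode-aware pattern keeps them whole:

```python
import re

text = "Domstolsförhandlingarna inleds i Göteborg på måndag."

# ASCII-only pattern: Swedish words break apart at å, ä and ö
print(re.findall(r"[A-Za-z]+", text))
# -> ['Domstolsf', 'rhandlingarna', 'inleds', 'i', 'G', 'teborg', 'p', 'm', 'ndag']

# Unicode word characters keep the compound and åäö intact
print(re.findall(r"\w+", text))
# -> ['Domstolsförhandlingarna', 'inleds', 'i', 'Göteborg', 'på', 'måndag']
```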

5 The test collection
– Documents
– Topics
– Relevance judgments

6 Document collection
Newspaper articles from GP and HD: 161 000 articles, 40 million tokens
Good to have more than one newspaper:
– Same content, different author (not always)
10% of my newspaper article collection
Copyright is a problem

7 Topics
Borrowed from CLEF: 52 of the 90 topics, but not the most difficult ones
Examples:
– "Filmer av bröderna Kaurismäki" (Films by the Kaurismäki brothers). Description: Find information about films directed by either of the two brothers Aki and Mika Kaurismäki. Narrative: Relevant documents name one or more titles of films directed by Aki or Mika Kaurismäki.
– "Finlands första EU-kommissionär" (Finland's first EU Commissioner). Description: Who was appointed the first Commissioner for Finland in the European Union? Narrative: Give the name of Finland's first EU Commissioner. Relevant documents may also mention the policy areas of the new Commissioner's portfolio.
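
One simple way to represent such topics in code is a small record with the three CLEF fields. This is only an illustrative sketch; the topic id below is hypothetical and not taken from the slides:

```python
from dataclasses import dataclass

@dataclass
class Topic:
    """A CLEF-style topic: short title, longer description,
    and a narrative stating the relevance criteria."""
    topic_id: str
    title: str
    description: str
    narrative: str

films = Topic(
    topic_id="C084",  # hypothetical id, not given on the slide
    title="Filmer av bröderna Kaurismäki",
    description="Find information about films directed by either of the "
                "brothers Aki and Mika Kaurismäki.",
    narrative="Relevant documents name one or more titles of films "
              "directed by Aki or Mika Kaurismäki.",
)
```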

8 Relevance judgments
Only a subset of the collection is judged for each topic:
– Selected by earlier experiments
– Similar approach to TREC and CLEF
100 documents for each of 5 strategies:
– 100 ≤ N ≤ 500
– Important to include both relevant and irrelevant documents
A relevance scale proposed by Sormunen:
Irrelevant (0) < Marginally relevant (1) < Fairly relevant (2) < Highly relevant (3)
Manually annotated
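
A minimal sketch of how such graded judgments could be stored and looked up (this is not the authors' tooling, and the topic and document ids are hypothetical):

```python
from collections import defaultdict

# Graded judgments: topic id -> {document id -> grade on the 0-3 scale}
RELEVANCE_LABELS = {0: "irrelevant", 1: "marginally relevant",
                    2: "fairly relevant", 3: "highly relevant"}

qrels = defaultdict(dict)

def add_judgment(topic_id, doc_id, grade):
    """Record one manual judgment; documents never judged simply stay absent."""
    if grade not in RELEVANCE_LABELS:
        raise ValueError(f"grade must be 0-3, got {grade}")
    qrels[topic_id][doc_id] = grade

add_judgment("C084", "GP1994-0312-0042", 3)  # hypothetical ids
```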

9 Statistics
Some difficult topics got very few relevant documents

10 Statistics per relevance category

11 The InQuery system
– Handles large document collections and performs all the indexing
– Batch runs for many setups
– Outputs in a standard format that fits evaluation and presentation software
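
The slide does not say which standard format is meant; one common assumption is the TREC run format (one "topic Q0 docno rank score tag" line per retrieved document), which evaluation tools such as trec_eval read. A sketch under that assumption:

```python
def write_run(results, run_tag, path):
    """Write ranked results as TREC-style run lines: topic Q0 docno rank score tag.
    `results` maps topic id -> list of (doc_id, score), best first."""
    with open(path, "w", encoding="utf-8") as out:
        for topic_id, ranking in results.items():
            for rank, (doc_id, score) in enumerate(ranking, start=1):
                out.write(f"{topic_id} Q0 {doc_id} {rank} {score:.4f} {run_tag}\n")

# Hypothetical ids and scores, purely for illustration
write_run({"C084": [("GP1994-0312-0042", 12.31), ("HD1994-0515-0007", 9.87)]},
          run_tag="baseline", path="baseline.run")
```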

12 Our system setups
We want to test an ordinary IR system using some common term weighting as a baseline
Compared to combined systems using one or more of:
– The MALT tagger from Växjö: fast and therefore suitable for large IR systems
– A stemmer, maybe Carlberger et al.'s
– A stop list: our own
Tagging is not trivial to use
Maybe more features will be added later
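
A rough sketch of the kind of preprocessing combinations being compared. The stop word set and suffix-stripping stemmer below are crude stand-ins, not the authors' stop list, the Carlberger et al. stemmer, or the MALT tagger:

```python
SWEDISH_STOP_WORDS = {"och", "att", "det", "som", "en", "på", "är", "av", "för"}

def toy_stem(token):
    """Placeholder stemmer that strips a few common Swedish suffixes."""
    for suffix in ("arna", "erna", "orna", "ande", "aren",
                   "ar", "er", "or", "en", "et", "a"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def preprocess(tokens, use_stoplist=False, use_stemmer=False):
    """The baseline keeps lowercased tokens; other setups add filtering/stemming."""
    out = []
    for tok in (t.lower() for t in tokens):
        if use_stoplist and tok in SWEDISH_STOP_WORDS:
            continue
        out.append(toy_stem(tok) if use_stemmer else tok)
    return out

print(preprocess(["Filmer", "av", "bröderna", "Kaurismäki"],
                 use_stoplist=True, use_stemmer=True))
# -> ['film', 'bröd', 'kaurismäki']  (crude, but shows the idea)
```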

13 Evaluation metrics
Recall & precision are problematic:
– Ranked lists: how much better is position 1 than positions 5 and 10?
– How long should the lists be?
– Relevance scale: how much better is "highly relevant" than "fairly relevant"?
– What about the unknown documents that were not judged? Too many unknowns lead to more manual judgments…
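
The presentation does not commit to a specific metric, but normalised discounted cumulative gain (nDCG) is one standard answer to these questions: it discounts lower ranks, rewards higher grades on the 0-3 scale, and treats unjudged documents as grade 0. A small illustrative sketch:

```python
import math

def ndcg(ranked_doc_ids, judgments, k=10):
    """nDCG@k over graded judgments (0-3): higher grades and earlier ranks
    score more; documents without a judgment count as irrelevant (grade 0)."""
    gains = [judgments.get(d, 0) for d in ranked_doc_ids[:k]]
    dcg = sum(g / math.log2(i + 2) for i, g in enumerate(gains))
    ideal = sorted(judgments.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(i + 2) for i, g in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0

# Toy example: the highly relevant d1 is only ranked third, so the score drops
print(round(ndcg(["d3", "d7", "d1"], {"d1": 3, "d3": 2, "d9": 1}, k=3), 3))  # 0.735
```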

14 Conclusion
A testbed for IR, but still under construction
1890 of 9848 documents are relevant to a topic
We will test whether stop lists, stemming, and/or tagging improve document search
– Not just a binary relevance measure
– Swedish
Precision & recall are problematic

15 Thank you!
Questions, and probably suggestions

