Galago for Information Retrieval


1 Galago for Information Retrieval

2 Galago: a Java-based information retrieval toolkit
Download both the binary and the source at:

3 Tasks [diagram labels: Document, Query, Index, Parse, Parse, Evaluation]

4 Tasks - Index
Example document:
<Html> <Head> Hello </Head> <Body> I love information retrieval so much! </Body> </Html>
Index with words: Doc-001: hello; i; love; information……
Index with stemmed words: Doc-001: hello; i; love; informate……
Index with extents: Doc-001: html; head; body……
Build command:
galago.bat build D:\test_collection\Wiki\index D:\test_collection\Wiki\wiki-small.corpus
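To make the "index with words" and "index with extents" ideas concrete, here is a minimal Python sketch of a toy inverted index for the example document. It is not Galago's implementation or on-disk format; the function name index_document and the two dictionaries are hypothetical, and stemming is left out on purpose.

```python
import re
from collections import defaultdict

# Hypothetical helper patterns for this sketch only.
ANY_TAG_RE = re.compile(r"<[^>]*>")
OPEN_TAG_RE = re.compile(r"<\s*([A-Za-z][A-Za-z0-9]*)[^>]*>")
WORD_RE = re.compile(r"[A-Za-z0-9]+")


def index_document(doc_id, html, postings, extents):
    """Add one document to two toy indexes: words and tag extents."""
    # Record the name of every opening tag (the "extents" view).
    for match in OPEN_TAG_RE.finditer(html):
        extents[match.group(1).lower()].add(doc_id)
    # Strip all tags, then record the position of each lowercased word.
    text = ANY_TAG_RE.sub(" ", html)
    for position, word in enumerate(WORD_RE.findall(text.lower())):
        postings[word].setdefault(doc_id, []).append(position)


postings = defaultdict(dict)  # word -> {doc_id: [positions]}
extents = defaultdict(set)    # tag name -> {doc_id}
index_document(
    "Doc-001",
    "<Html> <Head> Hello </Head> <Body> I love information retrieval so much! </Body> </Html>",
    postings,
    extents,
)
print(sorted(postings))
print(sorted(extents))
```

A stemmed index would apply a stemmer to each word before the postings[word] update; the rest of the structure stays the same.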

5 How the Galago index works
Index Folder: documentLengths, documentNames, Parts (folder)
Parts (folder): postings, stemmedPostings, extents

6 How the Galago index works
Commands for inspecting the index parts (postings, stemmedPostings, extents):
galago.bat dump-keys
galago.bat dump-index

7 Tasks - Search
Text Folder: D:\test_collection\Wiki\wiki-small.corpus
Index Folder: D:\test_collection\Wiki\index
galago.bat search D:\test_collection\Wiki\index D:\test_collection\Wiki\wiki-small.corpus

8 IR experiment
Information retrieval tasks + Evaluation (TREC style)
Query:
<parameters> <query> <number>1</number> <text> test query </text> </query> </parameters>
Retrieved and ranked results:
1 Q0 Zend_Framework_a5ef galago
1 Q0 KULA-LP_fb galago
1 Q0 KKJK_7f galago
1 Q0 WNUZ_d galago
1 Q0 KZEN_8c0c galago
Eval Result: num_ret num_rel num_rel_ret map ndcg ndcg R-prec
Judgements:
1 Q0 KULA-LP_fb81 1
1 Q0 WNUZ_d533 1
1 Q0 KBON_0027 1
1 Q0 Nicky_Wroe_5d39 1
1 Q0 Chemult_(Amtrak_station)_ac76 1

9 IR experiment
Batch search task, send multiple queries to Galago:
galago.bat batch-search index=D:\test_collection\Wiki\index count= D:\test_collection\Wiki\test.query
Query:
<parameters> <query> <number>1</number> <text> test query </text> </query> </parameters>
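When there are many queries, the query file can be generated instead of typed by hand. The following Python sketch builds the same <parameters>/<query>/<number>/<text> layout shown on this slide. The function name build_query_file and the second query text are hypothetical, used only for illustration.

```python
import xml.etree.ElementTree as ET


def build_query_file(queries):
    """Return the <parameters> XML for a list of (number, text) pairs."""
    root = ET.Element("parameters")
    for number, text in queries:
        query = ET.SubElement(root, "query")
        ET.SubElement(query, "number").text = str(number)
        ET.SubElement(query, "text").text = text
    return ET.tostring(root, encoding="unicode")


# The second query is a made-up example, not taken from the slides.
xml_text = build_query_file([(1, "test query"), (2, "information retrieval")])
print(xml_text)
```

The returned string can be written to a file such as test.query and passed to the batch-search command above.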

10 IR experiment
Save your results in a file, e.g. wiki.query.eval
Retrieved and ranked results:
1 Q0 Zend_Framework_a5ef galago
1 Q0 KULA-LP_fb galago
1 Q0 KKJK_7f galago
1 Q0 WNUZ_d galago
1 Q0 KZEN_8c0c galago
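A saved result file can be read back for inspection. The sketch below assumes the usual six-column TREC run layout (query id, Q0, document id, rank, score, run tag); the rank and score columns do not appear in the listing on this slide, so that layout is an assumption. The names RunEntry and parse_run, and the sample lines, are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class RunEntry(NamedTuple):
    query_id: str
    doc_id: str
    rank: int
    score: float
    run_tag: str


def parse_run(lines):
    """Group TREC-style run lines by query id, ordered by rank."""
    ranking = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        query_id, _q0, doc_id, rank, score, run_tag = fields
        ranking[query_id].append(
            RunEntry(query_id, doc_id, int(rank), float(score), run_tag)
        )
    for entries in ranking.values():
        entries.sort(key=lambda entry: entry.rank)
    return dict(ranking)


# Hypothetical run lines; the document ids and scores are made up.
sample = [
    "1 Q0 doc_b 2 -5.75 galago",
    "1 Q0 doc_a 1 -5.25 galago",
    "",
]
run = parse_run(sample)
print([entry.doc_id for entry in run["1"]])
```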

11 IR experiment
You need a judgment file, e.g. wiki.query.judgments
Judgements:
1 Q0 KULA-LP_fb81 1
1 Q0 WNUZ_d533 1
1 Q0 KBON_0027 1
1 Q0 Nicky_Wroe_5d39 1
1 Q0 Chemult_(Amtrak_station)_ac76 1
You can then evaluate your retrieval and ranking performance:
galago.bat eval D:\test_collection\Wiki\wiki.query.eval D:\test_collection\Wiki\wiki.query.judgments
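To see what measures such as map and R-prec mean, here is a small Python sketch that computes average precision and R-precision for one query using the textbook definitions. It is not Galago's eval code, and its numbers may differ in detail from Galago's. The judgment lines are the ones from this slide; the ranked list and the ids doc_x and doc_y are hypothetical.

```python
def parse_judgments(lines):
    """Map query id -> set of relevant doc ids from 'qid Q0 docid rel' lines."""
    relevant = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        query_id, _unused, doc_id, label = fields
        relevant.setdefault(query_id, set())
        if int(label) > 0:
            relevant[query_id].add(doc_id)
    return relevant


def average_precision(ranked_doc_ids, relevant):
    """Mean of precision values at each rank where a relevant doc appears."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def r_precision(ranked_doc_ids, relevant):
    """Precision at cutoff R, where R is the number of relevant docs."""
    if not relevant:
        return 0.0
    cutoff = len(relevant)
    found = sum(1 for doc_id in ranked_doc_ids[:cutoff] if doc_id in relevant)
    return found / cutoff


judgments = parse_judgments([
    "1 Q0 KULA-LP_fb81 1",
    "1 Q0 WNUZ_d533 1",
    "1 Q0 KBON_0027 1",
    "1 Q0 Nicky_Wroe_5d39 1",
    "1 Q0 Chemult_(Amtrak_station)_ac76 1",
])
# Hypothetical ranking: doc_x and doc_y are made-up non-relevant documents.
ranking = ["doc_x", "KULA-LP_fb81", "doc_y", "WNUZ_d533"]
print(average_precision(ranking, judgments["1"]))
print(r_precision(ranking, judgments["1"]))
```

Averaging average_precision over all queries in the query file gives the mean average precision (map).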

12 Advanced: read and update the source code!
You can make it better by updating org.galagosearch.core.tools.App.java And ……

