Galago for Information Retrieval. Galago Java based information retrieval toolkit Download both Binary and Source.


1 Galago for Information Retrieval

2 Galago: a Java-based information retrieval toolkit
http://www.galagosearch.org/index.html
Download both the binary and the source at: http://code.google.com/p/galagosearch/downloads/list

3 Tasks Document Index Query Parse Evaluation

4 Tasks - Index
Example document: Hello I love information retrieval so much!
Index with words: Doc-001: hello; i; love; information……
Index with stemmed words: Doc-001: hello; i; love; informate……
Index with extents: Doc-001: html; head; body……
galago.bat build D:\test_collection\Wiki\index D:\test_collection\Wiki\wiki-small.corpus
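To make the "index with words" idea concrete, here is a minimal sketch of an inverted index in plain Java. It is not Galago's implementation: the class name TinyInvertedIndex, its methods, and the simple lowercase-and-split tokenizer are all assumptions made for illustration, and no stemming or extent handling is attempted.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; Galago's real index format is more involved.
class TinyInvertedIndex {
    // term -> names of the documents that contain the term (a postings list)
    private final Map<String, List<String>> postings = new TreeMap<>();

    // Lowercase the text and split on anything that is not a letter or digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Record every distinct term of the document under its name.
    void add(String docName, String text) {
        for (String term : tokenize(text)) {
            List<String> docs = postings.computeIfAbsent(term, k -> new ArrayList<>());
            if (!docs.contains(docName)) {
                docs.add(docName);
            }
        }
    }

    // Return the documents containing the term, or an empty list.
    List<String> lookup(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptyList());
    }

    public static void main(String[] args) {
        TinyInvertedIndex index = new TinyInvertedIndex();
        index.add("Doc-001", "Hello I love information retrieval so much!");
        System.out.println(index.lookup("information")); // prints [Doc-001]
    }
}
```

A stemmed index would apply a stemmer to each token before storing it, so that different forms of a word share one postings list.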

5 How a Galago index works
Index Folder
  documentLengths
  documentNames
  Parts (folder)
    postings
    stemmedPostings
    extents

6 How a Galago index works
Index parts: postings, stemmedPostings, extents
galago.bat dump-keys
galago.bat dump-index

7 Tasks - Search
Text Folder: D:\test_collection\Wiki\wiki-small.corpus
Index Folder: D:\test_collection\Wiki\index
galago.bat search D:\test_collection\Wiki\index D:\test_collection\Wiki\wiki-small.corpus
http://localhost:XXXX

8 IR experiment
Information retrieval tasks + Evaluation (TREC style)
Query:
1 test query
Retrieved and ranking result:
1 Q0 Zend_Framework_a5ef 1 -14.29629326 galago
1 Q0 KULA-LP_fb81 2 -15.90760040 galago
1 Q0 KKJK_7f20 3 -15.92886543 galago
1 Q0 WNUZ_d533 4 -15.93414688 galago
1 Q0 KZEN_8c0c 5 -15.94256783 galago
Judgements:
1 Q0 KULA-LP_fb81 1
1 Q0 WNUZ_d533 1
1 Q0 KBON_0027 1
1 Q0 Nicky_Wroe_5d39 1
1 Q0 Chemult_(Amtrak_station)_ac76 1
Eval Result:
num_ret 1 100
num_rel 1 5
num_rel_ret 1 3
map 1 0.2667
ndcg 1 0.4622
ndcg15 1 0.4622
R-prec 1 0.4000

9 IR experiment
Batch search task: send multiple queries to Galago.
galago.bat batch-search --index=D:\test_collection\Wiki\index --count=100 D:\test_collection\Wiki\test.query
Query:
1 test query

10 IR experiment
Save your result in a file, e.g. wiki.query.eval
Retrieved and ranking result:
1 Q0 Zend_Framework_a5ef 1 -14.29629326 galago
1 Q0 KULA-LP_fb81 2 -15.90760040 galago
1 Q0 KKJK_7f20 3 -15.92886543 galago
1 Q0 WNUZ_d533 4 -15.93414688 galago
1 Q0 KZEN_8c0c 5 -15.94256783 galago
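Each line of the saved result file follows the TREC run layout: query id, the literal Q0, document name, rank, score, and run tag. As a rough sketch of how such a line could be read back in Java, the class below splits one line into typed fields. The class name TrecRunLine and its field names are hypothetical and are not part of Galago.

```java
// Hypothetical helper for reading one line of a TREC-style result file.
class TrecRunLine {
    final String queryId;
    final String docName;
    final int rank;
    final double score;
    final String runTag;

    TrecRunLine(String queryId, String docName, int rank, double score, String runTag) {
        this.queryId = queryId;
        this.docName = docName;
        this.rank = rank;
        this.score = score;
        this.runTag = runTag;
    }

    // Expected fields: queryId Q0 docName rank score runTag
    static TrecRunLine parse(String line) {
        String[] f = line.trim().split("\\s+");
        if (f.length != 6) {
            throw new IllegalArgumentException("expected 6 fields: " + line);
        }
        // f[1] is the constant "Q0" column and is ignored here.
        return new TrecRunLine(f[0], f[2], Integer.parseInt(f[3]),
                Double.parseDouble(f[4]), f[5]);
    }

    public static void main(String[] args) {
        TrecRunLine r = parse("1 Q0 KULA-LP_fb81 2 -15.90760040 galago");
        System.out.println(r.docName + " at rank " + r.rank);
    }
}
```

Splitting on whitespace is enough here because document names such as Chemult_(Amtrak_station)_ac76 contain no spaces.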

11 IR experiment
You can evaluate your retrieval and ranking performance.
You need a judgment file, e.g. wiki.query.judgments
Judgements:
1 Q0 KULA-LP_fb81 1
1 Q0 WNUZ_d533 1
1 Q0 KBON_0027 1
1 Q0 Nicky_Wroe_5d39 1
1 Q0 Chemult_(Amtrak_station)_ac76 1
galago.bat eval D:\test_collection\Wiki\wiki.query.eval D:\test_collection\Wiki\wiki.query.judgments
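To show what the map and R-prec figures measure, here is a small Java sketch that computes average precision and R-precision for one query from a ranked list and a set of relevant documents. It uses the standard textbook definitions, not Galago's eval code, and the class and method names are assumptions. Run on only the five result rows shown on the slides, it gives R-precision 0.4, which matches the slide, and average precision 0.2 rather than 0.2667, since the slide's figure covers all 100 retrieved documents (num_rel_ret is 3, so one more relevant document appears below rank 5).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch of two TREC-style metrics for a single query.
class RankedListMetrics {
    // Mean of precision@k over the ranks k where a relevant document appears,
    // divided by the total number of relevant documents.
    static double averagePrecision(List<String> ranked, Set<String> relevant) {
        if (relevant.isEmpty()) {
            return 0.0;
        }
        int hits = 0;
        double sum = 0.0;
        for (int i = 0; i < ranked.size(); i++) {
            if (relevant.contains(ranked.get(i))) {
                hits++;
                sum += (double) hits / (i + 1);
            }
        }
        return sum / relevant.size();
    }

    // Precision within the top R results, where R is the number of relevant documents.
    static double rPrecision(List<String> ranked, Set<String> relevant) {
        int r = relevant.size();
        if (r == 0) {
            return 0.0;
        }
        int hits = 0;
        for (int i = 0; i < Math.min(r, ranked.size()); i++) {
            if (relevant.contains(ranked.get(i))) {
                hits++;
            }
        }
        return (double) hits / r;
    }

    public static void main(String[] args) {
        // The five result rows and five judgments shown on the slides.
        List<String> ranked = Arrays.asList("Zend_Framework_a5ef", "KULA-LP_fb81",
                "KKJK_7f20", "WNUZ_d533", "KZEN_8c0c");
        Set<String> relevant = new HashSet<>(Arrays.asList("KULA-LP_fb81", "WNUZ_d533",
                "KBON_0027", "Nicky_Wroe_5d39", "Chemult_(Amtrak_station)_ac76"));
        System.out.println(String.format(Locale.ROOT, "AP     %.4f",
                averagePrecision(ranked, relevant))); // AP     0.2000
        System.out.println(String.format(Locale.ROOT, "R-prec %.4f",
                rPrecision(ranked, relevant)));       // R-prec 0.4000
    }
}
```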

12 Advanced: read and update the source code!
You can make it better by updating org.galagosearch.core.tools.App.java
And ……

