1
Galago for Information Retrieval
2
Galago: a Java-based information retrieval toolkit
Download both the binary and the source at:
3
Tasks [diagram labels: Document, Query, Parse, Index, Evaluation]
4
Tasks - Index
Sample document:
    <Html> <Head> Hello </Head> <Body> I love information retrieval so much! </Body> </Html>
Index with words:
    Doc-001: hello; i; love; information……
Index with stemmed words:
    Doc-001: hello; i; love; informate……
Index with extents:
    Doc-001: html; head; body……
Build command:
    galago.bat build D:\test_collection\Wiki\index D:\test_collection\Wiki\wiki-small.corpus
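To make the "index with words" idea concrete, here is a minimal Python sketch of a toy inverted index built from the sample document above. It is not Galago's implementation: the names tokenize and build_index are hypothetical, no stemming is applied, and tags are stripped with a crude regular expression rather than a real parser.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Drop markup tags, lowercase the text, and split it into word tokens."""
    without_tags = re.sub(r"<[^>]+>", " ", text)
    return re.findall(r"[a-z0-9]+", without_tags.lower())


def build_index(docs):
    """Map each term to {doc_id: [token positions]} (a toy postings list)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, term in enumerate(tokenize(text)):
            index[term].setdefault(doc_id, []).append(position)
    return dict(index)


# The sample document from the slide, keyed by its document name.
docs = {
    "Doc-001": "<Html> <Head> Hello </Head> <Body> I love information "
               "retrieval so much! </Body> </Html>"
}
index = build_index(docs)
print(index["love"])  # {'Doc-001': [2]}
```

A stemmed index would apply a stemmer to each token before the lookup, and an extents index would record the tag spans instead of discarding them.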
5
How the Galago index works
Index Folder
    documentLengths
    documentNames
    Parts (folder)
        postings
        stemmedPostings
        extents
6
How the Galago index works
Each part (postings, stemmedPostings, extents) can be inspected with:
    galago.bat dump-keys
    galago.bat dump-index
7
Tasks - Search
Text Folder: D:\test_collection\Wiki\wiki-small.corpus
Index Folder: D:\test_collection\Wiki\index
    galago.bat search D:\test_collection\Wiki\index D:\test_collection\Wiki\wiki-small.corpus
8
IR experiment
Information retrieval tasks + evaluation (TREC style)
Query:
    <parameters>
      <query>
        <number>1</number>
        <text> test query </text>
      </query>
    </parameters>
Retrieved and ranked results:
    1 Q0 Zend_Framework_a5ef galago
    1 Q0 KULA-LP_fb galago
    1 Q0 KKJK_7f galago
    1 Q0 WNUZ_d galago
    1 Q0 KZEN_8c0c galago
Eval result:
    num_ret num_rel num_rel_ret map ndcg ndcg R-prec
Judgements:
    1 Q0 KULA-LP_fb81 1
    1 Q0 WNUZ_d533 1
    1 Q0 KBON_0027 1
    1 Q0 Nicky_Wroe_5d39 1
    1 Q0 Chemult_(Amtrak_station)_ac76 1
9
IR experiment
Batch search task: send multiple queries to Galago:
    galago.bat batch-search index=D:\test_collection\Wiki\index count= D:\test_collection\Wiki\test.query
Query:
    <parameters>
      <query>
        <number>1</number>
        <text> test query </text>
      </query>
    </parameters>
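With more than a handful of queries, the query file is easier to generate than to type. The following is a minimal Python sketch that emits the parameters/query/number/text layout shown on the slide. The function name build_query_xml and the second query text are hypothetical, and whether Galago is sensitive to whitespace inside the text element is not something the slide states.

```python
import xml.etree.ElementTree as ET


def build_query_xml(queries):
    """Serialize (number, text) pairs into the <parameters> layout from the slide."""
    root = ET.Element("parameters")
    for number, text in queries:
        query = ET.SubElement(root, "query")
        ET.SubElement(query, "number").text = str(number)
        ET.SubElement(query, "text").text = text
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(root, encoding="unicode")


# The first query is the one on the slide; the second is made up for illustration.
xml_text = build_query_xml([(1, "test query"), (2, "information retrieval")])
print(xml_text)
```

The returned string can be written to a file such as test.query and passed to batch-search as on the slide.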
10
IR experiment
Save your results in a file, e.g. wiki.query.eval
Retrieved and ranked results:
    1 Q0 Zend_Framework_a5ef galago
    1 Q0 KULA-LP_fb galago
    1 Q0 KKJK_7f galago
    1 Q0 WNUZ_d galago
    1 Q0 KZEN_8c0c galago
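A saved result file can be read back for further processing. The sketch below assumes the standard six-column TREC run layout (query id, Q0, document id, rank, score, run tag); the lines on the slide show only four columns, so the rank and score values here are invented for illustration, as are the document ids and the function name parse_run.

```python
from collections import defaultdict


def parse_run(lines):
    """Return {query_id: [doc_id, ...]} with documents ordered by rank.

    Assumes six whitespace-separated columns per line:
    query_id Q0 doc_id rank score run_tag
    """
    ranked = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        query_id, _q0, doc_id, rank, _score, _tag = fields
        ranked[query_id].append((int(rank), doc_id))
    return {q: [doc for _, doc in sorted(entries)] for q, entries in ranked.items()}


# Hypothetical run lines; the ranks and scores are not taken from the slide.
sample = [
    "1 Q0 docB 2 -5.10 galago",
    "1 Q0 docA 1 -4.90 galago",
]
print(parse_run(sample))  # {'1': ['docA', 'docB']}
```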
11
IR experiment
You need a judgment file, e.g. wiki.query.judgments
Judgements:
    1 Q0 KULA-LP_fb81 1
    1 Q0 WNUZ_d533 1
    1 Q0 KBON_0027 1
    1 Q0 Nicky_Wroe_5d39 1
    1 Q0 Chemult_(Amtrak_station)_ac76 1
You can evaluate your retrieval and ranking performance:
    galago.bat eval D:\test_collection\Wiki\wiki.query.eval D:\test_collection\Wiki\wiki.query.judgments
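To see what a measure such as map summarizes, here is a minimal Python sketch of average precision for one query, computed from a ranked list and a judgment file in the four-column layout shown above. It illustrates the textbook definition and is not claimed to match Galago's eval output digit for digit; the function names and the example ranking are hypothetical.

```python
def parse_judgments(lines):
    """Return {query_id: set of relevant doc ids} from 'query_id Q0 doc_id rel' lines."""
    relevant = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        query_id, _unused, doc_id, rel = fields
        if int(rel) > 0:
            relevant.setdefault(query_id, set()).add(doc_id)
    return relevant


def average_precision(ranking, relevant):
    """Mean of precision@k over the ranks k at which a relevant document appears."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


# Two judgment lines from the slide, and a made-up ranking for query 1.
judged = parse_judgments(["1 Q0 KULA-LP_fb81 1", "1 Q0 WNUZ_d533 1"])
ranking = ["KULA-LP_fb81", "some_other_doc", "WNUZ_d533"]
ap = average_precision(ranking, judged["1"])  # (1/1 + 2/3) / 2
```

Mean average precision (map) is then the mean of this value over all queries in the query file.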
12
Advanced: read and update the source code!
You can make it better by updating org.galagosearch.core.tools.App.java, and ……