CLEF – Cross Language Evaluation Forum Question Answering at CLEF 2003 (http://clef-qa.itc.it) Bridging Languages for Question Answering: DIOGENE at CLEF-2003.


1 CLEF – Cross Language Evaluation Forum. Question Answering at CLEF 2003 (http://clef-qa.itc.it). Bridging Languages for Question Answering: DIOGENE at CLEF-2003. Matteo Negri, Hristo Tanev and Bernardo Magnini. ITC-irst, Centro per la Ricerca Scientifica e Tecnologica, Trento, Italy. {negri, tanev, magnini}@itc.it

2 Outline: DIOGENE QA English system; Porting DIOGENE to monolingual Italian; Porting DIOGENE to the bi-lingual task; Results at CLEF-2003; Question Answering for Italian

3 System Architecture [diagram]. Input: QUESTION. Output: ANSWER. Stages: Question Processing; Search; Answer Extraction. Components shown: Tokenization and POS tagging; Multiwords Recognition; Answer Type Identification; Keywords Extraction; Keywords Translation; Keyword Expansion; Query Reformulation; Query Composition; Search Engine; Named Entities Recognition; Candidate Answer Selection; Answer Validation and Ranking. Resources shown: Document collection; Web; WordNet. Legend labels: M, B.

4 System Architecture [diagram: same components as on slide 3]

5 Porting from English to Italian. Components shown: Tokenization and POS tagging; Multiwords Recognition; Answer Type Identification; Keywords Extraction; Keyword Expansion; Named Entities Recognition; WordNet. Porting notes, in slide order: POS tagger for Italian: already available. List of Italian multiwords (about 5000) extracted from an electronic dictionary: 1 p/m. Rules for Italian (about 250): 1 p/m. Rules for Italian (e.g. verb expansion): 0.5 p/m. Rules for Italian (about 300): 1 p/m. WordNet for Italian aligned with the English WordNet: already available.

6 Porting to the Italian-English task. Already available from the monolingual Italian task: Tokenization and POS tagging; Multiwords Recognition; Answer Type Identification; Keywords Extraction. Keyword Translation: word-by-word translation based on Italian/English aligned resources; sense ambiguities are addressed with statistical techniques.

7 Keyword Translation. Method proposed by [Federico and Bertoldi, 2002]. Resources: bilingual dictionary, target corpus and search engine. Given a sequence of Italian keywords: 1. Extract all possible translations from the Italian-English dictionary and from the aligned wordnets. 2. Estimate the probability of each translation sequence against the English target corpus.
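
The two steps above can be sketched roughly as follows. This is a minimal illustration, not the estimator of [Federico and Bertoldi, 2002]: the dictionary, the three-document corpus, and the add-one smoothed co-occurrence score are all assumptions made for the example, and the names `DICTIONARY`, `CORPUS`, `cooccurrence_count` and `rank_translations` are hypothetical.

```python
from itertools import product

# Hypothetical toy dictionary: Italian keyword -> candidate English translations.
DICTIONARY = {
    "presidente": ["president", "chairman"],
    "banca": ["bank", "bench"],
}

# Hypothetical stand-in for the English target corpus (one string per document).
CORPUS = [
    "the president of the central bank resigned",
    "the bank president announced new rates",
    "the chairman sat on a bench in the park",
]


def cooccurrence_count(words, corpus):
    """Count the documents that contain every word of a candidate sequence."""
    return sum(all(w in doc.split() for w in words) for doc in corpus)


def rank_translations(keywords, dictionary, corpus):
    """Return (sequence, probability) pairs, most probable first."""
    # Step 1: enumerate every combination of per-keyword translations.
    candidates = list(product(*(dictionary[k] for k in keywords)))
    # Step 2: score each sequence against the corpus. Add-one smoothed
    # co-occurrence counts are an assumed stand-in for the real model.
    scores = [cooccurrence_count(c, corpus) + 1 for c in candidates]
    total = sum(scores)
    ranked = sorted(zip(candidates, scores), key=lambda cs: cs[1], reverse=True)
    return [(c, s / total) for c, s in ranked]


ranking = rank_translations(["presidente", "banca"], DICTIONARY, CORPUS)
best = ranking[0][0]
```

Scoring whole sequences rather than single words is what lets the corpus resolve sense ambiguity: in this toy data "bank" wins over "bench" only because it co-occurs with "president".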

8 Results
GROUP | TASK | RUN NAME | MRR strict | MRR lenient | No. of Q. with at least one right answer, strict | same, lenient | NIL questions returned | NIL questions correctly returned
MONOLINGUAL ITALIAN
ITC-irst | Exact answer | irstex031mi | .422 | .442 | 97 | 101 | 4 | 2
ITC-irst | 50 bytes answer | irstst032mi | .449 | .471 | 99 | 104 | 5 | 2
BILINGUAL ITALIAN
ITC-irst | Exact answer | irstex031bi | .322 | .334 | 77 | 81 | 49 | 6
ITC-irst | Exact answer | irstex032bi | .393 | .400 | 90 | 92 | 28 | 5
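
For readers unfamiliar with the MRR column: Mean Reciprocal Rank averages, over all questions, the reciprocal of the rank at which the first correct answer appears, counting zero when no returned answer is correct. The sketch below uses made-up ranks, not CLEF data, and the function name `mean_reciprocal_rank` is hypothetical.

```python
def mean_reciprocal_rank(first_correct_ranks):
    """Ranks are 1-based; None means no correct answer was returned."""
    reciprocal = [0.0 if r is None else 1.0 / r for r in first_correct_ranks]
    return sum(reciprocal) / len(reciprocal)


# Hypothetical ranks of the first correct answer for five questions.
ranks = [1, 2, None, 1, 3]
mrr = mean_reciprocal_rank(ranks)
```

Whether an answer counts as correct depends on the judgement applied, which is why the table reports strict and lenient figures separately.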

9 Situation for Italian Tasks. Two groups from Italy participated in TREC-2002 QA. Four groups expressed their interest in QA for Italian. Just one registered and took part in CLEF-2003. Resources available for Italian: wordnets (aligned with the English WordNet); Named Entities Recognition.

