CLEF QA, September 21, 2006, Synapse Développement, D. LAURENT Why not 100% ?


1 CLEF QA, September 21, 2006, Synapse Développement, D. LAURENT Why not 100% ?

2 100% of what?
Factoid and definition questions
In monolingual mode (cross-lingual systems lost between 30% and 40% in accuracy)
With answers in one sentence and in one document
With relative proximity between the terms of the question and the terms of the answer
But with NIL questions…

3 A modular design
[Diagram: language modules (French, Italian, Portuguese, Polish, English); indexing engine; text extraction engine; index; documents; visualization of results]

4

5 Our results in CLEF QA

6 Failures in the indexing process

Block rank       Number   Cumulative %
1                112       56.0 %
2-5               57       84.5 %
6-10              18       93.5 %
11-100             6       96.5 %
more than 100      7      100.0 %
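The percentage column appears to be cumulative: each value is the share of questions whose answer-bearing block was ranked within that band or better. A minimal sketch that checks this, assuming the five counts cover the whole test set (they sum to 200); the variable names are illustrative and not from the presentation:

```python
from itertools import accumulate

# Counts per block-rank band, as read from the slide's table.
rank_bands = ["1", "2-5", "6-10", "11-100", "more than 100"]
counts = [112, 57, 18, 6, 7]

# Assumption: the five counts cover every question in the test set.
total = sum(counts)

# Running total of the counts, expressed as a percentage of all questions.
cumulative_pct = [100 * c / total for c in accumulate(counts)]

for band, n, pct in zip(rank_bands, counts, cumulative_pct):
    print(f"{band:>13}  {n:>3}  {pct:5.1f} %")
```

Read this way, the answer-bearing block is ranked first for just over half of the questions and falls outside the top 10 for 13 of them (6 + 7).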

7 Failures in identification of questions
Failures in type of question: 7 (3.5% for French-French, 6.5% for English-French, 9% for Portuguese-French)
Failures in pivots (not useful: 16, forgotten: 23)

8 Failures in extraction of answers
For French-French, 136 Right answers, but 23 Unjustified + Inaccurate (R + U + X = 79.5%)
A lot of inaccurate answers result from failures during extraction of the answer
What is a good answer? (Atlantis – Atlantide?)
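The R + U + X figure can be reproduced from the counts on the slide. A small sketch, assuming a test set of 200 questions (a total inferred from the figures on the other slides, not stated on this one); the "strict" and "lenient" labels are illustrative, not the presenter's terms:

```python
# Assumption: 200 questions in the French-French run (inferred, not stated here).
TOTAL_QUESTIONS = 200

right = 136                    # answers judged Right (R)
unjustified_or_inexact = 23    # Unjustified (U) + Inaccurate (X), given combined on the slide

# Share of questions with a Right answer only.
strict_pct = 100 * right / TOTAL_QUESTIONS

# Share of questions where the system found at least an R, U or X answer.
lenient_pct = 100 * (right + unjustified_or_inexact) / TOTAL_QUESTIONS

print(f"strict (R only):     {strict_pct:.1f} %")
print(f"lenient (R + U + X): {lenient_pct:.1f} %")
```

The gap between the two scores is the part attributable to the 23 answers that were located but judged unjustified or inaccurate.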

9 Conclusion about 100%
The maximum reachable is probably between 90% and 95%, with similar questions
Are NIL questions useful and realistic?
Do we need more difficult questions in the future, or do we need to wait until several systems reach 90% for factoid and definition questions?

10 Some words about our future
Present implementation in MCAST for digital libraries in Prague and in Poland
Participation in the French-German project QUAERO in CMSE, with Exalead as leader (www.exalead.com) and with Priberam as partner for Spanish and Portuguese
Response time: about 1 second per question; required: 200 ms maximum.

11 Some words about CLEF QA (1)
One answer or several: the impact of this choice on the final results.
Why keep a one-week delay for delivering results (we need to take into account the first real-time pilot task)?
Improving the quality of evaluation and respecting the deadlines is probably better than adding new types of questions.

12 Some words about CLEF QA (2)
The results for English-French, using a module from an external partner (ES), decreased from 39.5% last year to 32% this year. Why?
Expansions of acronyms: 23 last year (1, 2, 17, 19, 34, 37, 63, 67, 77, 93, 101, 117, 121, 125, 141, 144, 150, 153, 154, 160, 184, 186, 190) and 4 this year (28, 95, 129, 145)
We need similar questions each year and for each language.

13 END Thank you!

