GOOD, MULTILINGUAL interpretation, translation, resources
What can we do for the OG-08?
Christian BOITET
GETA, CLIPS, IMAG-campus, UJF & CNRS, Grenoble, France
28/10/03, G. Fafiotte & Ch. Boitet

2 applications, 1 common resource
- Communication in 2 languages
- Multilingual dissemination of information
- Building multilingual lexical resources

Fully automatic HQ all-purpose Speech Translation…
- …is a dream! (as underlined by Prof. Feng ZhiWei)
- even more so than FAHQMT of any text
- up to 30% "clarification" turns (Oviatt & Cohen 91)
For QUALITY bilingual communication: 2 complementary feature sets, to be introduced in order
- Interpretation Aids
  - Immediately applicable (as Translation Aids!)
  - But they need humans, even if through the net
- Human control/participation
  - Feedback & control (reformulate if SR or reverse MT is bad…)
  - Correction & interactive disambiguation

1. Communication in 2 languages
All-purpose multilingual communication
- human interpreters (volunteer students?)
- all languages!
System integrating machine aids
- dictionaries, SR for better understanding
- management (human resources, rendez-vous…)
- up to automatic speech translation
  - task-related: 100% automatic, WITH feedback, control
  - all-purpose: with some interactive disambiguation

Scenario 1
2 persons meet…
- ex: French visitor and Chinese soccer fan
- manage to say something in English, but…
They connect to the IBI service
- IBI = intermittent Beijing interpretation
- using 1 or 2 PDAs (or cell phone, or HH-PC…)
If Chinese is not one of the 2 languages
- simply use 2 human interpreters (L1-C-C-L2)

Scenario 2
2 persons meet…
- ex: American visitor and Chinese soccer fan
- use ST & other machine aids to converse
- connect to IBI when ST breaks down :-(
  - interpreter looks at the dialog log, then helps
  - using a dialog dictionary built on the fly
  - and may leave them again if s/he is needed by others and they say OK
Then they continue on their own.
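
The escalation logic of Scenario 2 can be sketched as a small session object. This is only an illustration under stated assumptions: the slides do not say how an ST breakdown is detected, so the `st_confidence` score, the `threshold` value, and all class and method names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Session:
    """Toy model of a Scenario 2 dialog (all names are hypothetical)."""
    threshold: float = 0.5  # assumed cut-off below which ST "breaks down"
    log: List[Tuple[str, str]] = field(default_factory=list)
    interpreter_connected: bool = False

    def turn(self, speaker: str, utterance: str, st_confidence: float) -> str:
        """Record one utterance and say who translates it."""
        # Every turn is logged so that an interpreter joining later
        # can read the dialog log before helping.
        self.log.append((speaker, utterance))
        if self.interpreter_connected:
            return "human"
        if st_confidence < self.threshold:
            # ST broke down: connect to the interpretation service.
            self.interpreter_connected = True
            return "human"
        return "machine"

    def release_interpreter(self, both_agree: bool) -> None:
        """The interpreter leaves only if both speakers say OK."""
        if both_agree:
            self.interpreter_connected = False


session = Session()
print(session.turn("visitor", "Where is the stadium?", 0.9))
print(session.turn("fan", "(unclear utterance)", 0.2))
session.release_interpreter(both_agree=True)
print(session.turn("visitor", "Thank you!", 0.8))
```

Keeping the log independent of who translates is what lets the interpreter join and leave intermittently without the speakers having to repeat themselves.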

Scenario 3
Afterwards, from hotel & from home…
- they have a distant conversation
  - using their PC + headset [+ videocam]
  - using a multimodal, multilingual chat
- no human interpreter, but user involvement
  - control, feedback, interactive disambiguation

Scenarios for ERiM-Interp (diagram 1)
[Diagram labels: ERIM-i platform for distant 'intermittent' interpreting; Communication and Collection servers; Speaker 1 and Speaker 2 in a professional or casual situation (recorded if they agree); professional interpreter / volunteer interpreter providing on-line oral translation (recorded if they agree)]

Scenarios for ERiM-Interp (diagram 7)
[Diagram labels: ERIM-i platform for distant 'intermittent' interpreting; Communication and Collection servers; Speaker 1 and Speaker 2 in a professional or casual situation (recorded if they agree); professional interpreter / volunteer interpreter providing on-line oral translation (recorded if they agree)]
NEW SCENARIOS:
- a foreign researcher or University colleague, available for intermittent volunteer interpreting from his/her usual workstation (office)
- a professional interpreter working from home or in an office
- a multilingual multimodal 'hot-webline' for medical, technical, legal… assistance
- a 'junior' interpreter (volunteer)
- … cf. user evaluation / demand

Dissemination of information
Supposing the situation is:
- From Chinese & English to other languages
- Written source, general + many domains
- High quality needed
Then:
- FAHQMT is still a dream!
- Only practical way = disambiguation in the source
  - pre-editing (annotation)
  - interactive disambiguation

How?
2 possible strategies
- Cooperate with MT vendors (not 1!)
- Build a system from research efforts
Technical aspects
- "Split" the system & introduce interactive disambiguation after analysis
- "UNL++" as a practical linguistico-semantic pivot
- More sophisticated interlinguas may be better, but are usable only by very few developers (if any)
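
To make the "split" idea concrete, here is a minimal sketch of interactive disambiguation placed between analysis and generation. It is a toy under stated assumptions, not the architecture of any actual system: the `SENSES` inventory, the function names, and the `ask` callback are hypothetical, and the two sense labels for "carte" are simply borrowed from the UNL examples in the Papillon slides.

```python
from typing import Callable, List, Tuple

# Hypothetical toy sense inventory; a real analyser would produce
# far richer structures than a list of sense labels per word.
SENSES = {
    "carte": ["card(icl>play)", "map(fld>geography)"],
}

Analysis = List[Tuple[str, List[str]]]


def analyze(words: List[str]) -> Analysis:
    """Analysis step: attach every candidate sense to each word."""
    return [(w, SENSES.get(w, [w])) for w in words]


def disambiguate(analysis: Analysis,
                 ask: Callable[[str, List[str]], int]) -> List[str]:
    """Interactive step: question the author only where analysis is ambiguous."""
    pivot = []
    for word, candidates in analysis:
        if len(candidates) == 1:
            pivot.append(candidates[0])  # unambiguous: no question needed
        else:
            pivot.append(candidates[ask(word, candidates)])
    return pivot


def scripted_author(word: str, candidates: List[str]) -> int:
    """Stands in for a real dialog: always picks the 'map' sense."""
    return candidates.index("map(fld>geography)")


pivot = disambiguate(analyze(["la", "carte"]), scripted_author)
print(pivot)
```

Because the questions are asked once, on the source side, the disambiguated pivot can then feed generation into any number of target languages.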

Lexical resources
Put what you have on the Papillon site
- in the "contributed dictionaries" format
- basically, it is ANY XML form
Transform into "Papillon format" (refine the ore)
- DiCo-NL for monolingual entries
- Add info to lexies (word senses) & refine them
- Connect lexies (through "axies")
Generate application-oriented lexicons
- on the fly (real time server mode) or off-line
- no IPR problems here (Linux way, GPL)

Cooperative Building of a Large, Rich Multilingual Lexical Database of Monolingual Dictionaries & Interlingual Links to generate open source dictionaries on the Web
Christian Boitet & Emmanuel Planas
GETA, CLIPS, IMAG, CNRS, INPG & UJF, Grenoble, France
{Christian.Boitet, PAPILLON:

Internal Architecture of the Database
Architecture derived from Dr. Gilles Sérasset's Ph.D. thesis
[Diagram labels]
French Dictionary
- Vocable: carte (n.f.)
  - Lexie: carte à jouer
  - Lexie: carte géographique
Interlingual Dictionary
- Acception 343, UNL: card(icl>play)
- Acception 345, UNL: map(fld>geography)
Japanese Dictionary
- 地図
- カード

PAPILLON diagram
- Interlingual links motivated by translations = "AXIEs"
- Possibility to link 1 lexie to >1 acception
- Links to other representations: AXIE (1) -> (n) UW
[Diagram labels]
French DiCo
- Vocable carte (n.f.)
  - lexie carte.1: carte à jouer
  - lexie carte.2: carte géographique
English DiCo
- Vocable card (N)
  - lexie card.1: playing card
  - lexie card.2: money card
- Vocable = lexie map
Japanese DiCo
- 地図
- カード
Thai DiCo
Interlingual links
- Acception 343, UNL: card(icl>play), card(icl>thing)…
- Acception 345, UNL: map(fld>geography)
- Acception 1002, UNL: card(fld>money)
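
The pivot structure in the diagram can be sketched in a few lines of Python. This is only an illustrative model, not the actual Papillon schema: the class names, field names, and language codes are hypothetical, and the attachment of each lexie to an axie is an assumption read off the example (the flattened slide text lists the nodes but not the drawn links). The sketch shows how a bilingual lexicon for any language pair can be derived on the fly by going through the axies.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Lexie:
    """A word sense in one monolingual dictionary."""
    lang: str
    lexie_id: str
    gloss: str


@dataclass
class Axie:
    """An interlingual acception linking lexies of several languages."""
    axie_id: int
    uws: List[str]  # links to UNL universal words (1 axie -> n UWs)
    lexies: List[Lexie] = field(default_factory=list)


def bilingual_lexicon(axies: List[Axie], source: str, target: str) -> Dict[str, List[str]]:
    """Derive source -> target equivalents by pairing lexies that share an axie."""
    lexicon: Dict[str, List[str]] = {}
    for axie in axies:
        sources = [lx for lx in axie.lexies if lx.lang == source]
        targets = [lx for lx in axie.lexies if lx.lang == target]
        for s in sources:
            for t in targets:
                lexicon.setdefault(s.lexie_id, []).append(t.lexie_id)
    return lexicon


# Example data taken from the slide; the links are assumed, see above.
carte1 = Lexie("fra", "carte.1", "carte à jouer")
carte2 = Lexie("fra", "carte.2", "carte géographique")
card1 = Lexie("eng", "card.1", "playing card")
card2 = Lexie("eng", "card.2", "money card")
map_en = Lexie("eng", "map", "map")
kaado = Lexie("jpn", "カード", "card")
chizu = Lexie("jpn", "地図", "map")

axies = [
    Axie(343, ["card(icl>play)", "card(icl>thing)"], [carte1, card1, kaado]),
    Axie(345, ["map(fld>geography)"], [carte2, map_en, chizu]),
    Axie(1002, ["card(fld>money)"], [card2]),
]

print(bilingual_lexicon(axies, "fra", "jpn"))
print(bilingual_lexicon(axies, "fra", "eng"))
```

With n languages, each monolingual dictionary only needs links to the axies, rather than to each of the other n-1 dictionaries, which is what makes generating lexicons on demand practical.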