
Slide 1: MEMT: Multi-Engine Machine Translation
Faculty: Alon Lavie, Robert Frederking, Ralf Brown, Jaime Carbonell
Students: Shyamsundar Jayaraman, Satanjeev Banerjee
March 10, 2005

Slide 2: Goals and Approach
– Combine the output of multiple MT engines into a synthetic output that outperforms the originals in translation quality
– Synthetic combination of the originals, NOT selection of the best system
– Experimented with two approaches:
  – Approach-1: merging of lattice outputs + joint decoding
    – Each MT system produces a lattice of translation fragments, indexed by source word positions
    – The lattices are merged into a single common lattice
    – A statistical MT decoder selects a translation "path" through the lattice
  – Approach-2: align the best output from each engine + new decoder
    – Each MT system produces one sentence translation output
    – Establish an explicit word matching between all words of the various MT engine outputs
    – "Decoding": create a collection of synthetic combinations of the original strings based on the matched words, a target LM, and constraints, with re-combination and pruning
    – Score the resulting hypotheses and select a final output

Slide 3: Approach-2: Sentence MEMT
Idea:
– Start with the output sentences of the various MT engines
– Explicitly align the words that are common between any pair of systems, and apply transitivity (a sketch follows below)
– Use the alignments as reinforcement and as indicators of possible locations for the words
– Each engine has a "weight" that is applied to the words it contributes
– The decoder searches for an optimal synthetic combination of words and phrases, optimizing a scoring function that combines the alignment weights and an LM score
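A minimal sketch of the transitivity step, assuming word matches come in pairwise between engine outputs; the function names and the (engine, position) layout are illustrative, not the project's actual code:

    def transitive_alignment_groups(matches):
        """Close pairwise word matches under transitivity with union-find."""
        parent = {}

        def find(x):
            parent.setdefault(x, x)
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path compression
                x = parent[x]
            return x

        for a, b in matches:
            parent[find(a)] = find(b)  # merge the two groups

        groups = {}
        for word in list(parent):
            groups.setdefault(find(word), []).append(word)
        return list(groups.values())

    # "criticizes" matched IBM<->ISI and ISI<->CMU ends up in a single group:
    matches = [(("ibm", 4), ("isi", 6)), (("isi", 6), ("cmu", 3))]
    print(transitive_alignment_groups(matches))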

Slide 4: The Sentence Matcher
– Developed by Satanjeev Banerjee as a component of our METEOR automatic MT evaluation metric
– Finds the maximal alignment match with minimal "crossing branches" (see the sketch below)
– Implementation: a clever search algorithm that finds the best match by pruning sub-optimal sub-solutions
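The "crossing branches" criterion can be illustrated with a small sketch, under the assumption that an alignment is a list of (i, j) index pairs between two sentences; the matcher's pruned search itself is not reproduced here:

    def count_crossings(alignment):
        """Count pairs of alignment links whose relative order flips."""
        crossings = 0
        for idx, (i1, j1) in enumerate(alignment):
            for i2, j2 in alignment[idx + 1:]:
                if (i1 - i2) * (j1 - j2) < 0:  # the two links cross
                    crossings += 1
        return crossings

    print(count_crossings([(0, 0), (1, 1), (2, 2)]))  # 0: monotone match
    print(count_crossings([(0, 1), (1, 0)]))          # 1: one crossing

Among matches of equal size, the matcher prefers the candidate with the fewest such crossings.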

Slide 5: Matcher Example
IBM: the sri lankan prime minister criticizes head of the country's
ISI: The President of the Sri Lankan Prime Minister Criticized the President of the Country
CMU: Lankan Prime Minister criticizes her country

Slide 6: The MEMT Algorithm
– The algorithm builds collections of partial hypotheses of increasing length
– Partial hypotheses are extended by selecting the "next available" word from one of the original systems
– Sentences are assumed synchronous: each word is either aligned with another word or is an alternative to another word
– Extending a partial hypothesis with a word "pulls" and "uses" its aligned words with it, and marks its alternatives as "used"; "used vectors" keep track of this (a sketch follows below)
– Partial hypotheses are scored and ranked
– Pruning and re-combination
– A hypothesis can end if any original system proposes an end of sentence as the next word
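A rough sketch of how extension and the "used" vectors could interact, assuming one set of used positions per engine; the names and data layout are assumptions for illustration:

    from dataclasses import dataclass, field

    @dataclass
    class Hypothesis:
        words: list = field(default_factory=list)
        used: dict = field(default_factory=dict)  # engine -> used positions
        score: float = 0.0

    def extend(hyp, engine, pos, word, aligned_with):
        """Extend a partial hypothesis with (engine, pos, word), pulling its
        aligned counterparts along so no engine re-contributes them later."""
        new = Hypothesis(words=hyp.words + [word],
                         used={e: set(s) for e, s in hyp.used.items()},
                         score=hyp.score)
        for e, p in aligned_with.get((engine, pos), []) + [(engine, pos)]:
            new.used.setdefault(e, set()).add(p)
        return new

Marking a word's alternatives as "used" would follow the same pattern over its alternative set.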

Slide 7: The MEMT Algorithm
Scoring:
– Alignment score based on reinforcement from the alignments of the words
– LM score based on a trigram LM
– Sum the logs of the alignment score and the LM score (equivalent to a product of probabilities); a scoring sketch follows below
– Select the best-scoring hypothesis based on either:
  – Total score (biased towards shorter hypotheses)
  – Average score per word
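A scoring sketch under the stated model, where align_prob and lm_prob stand in for the real alignment and trigram-LM scores; the per-word variant counters the length bias of the total:

    import math

    def hypothesis_score(words, align_prob, lm_prob):
        """Sum of log alignment score and log trigram-LM score."""
        padded = ["<s>", "<s>"] + list(words)
        score = 0.0
        for k, w in enumerate(words):
            score += math.log(align_prob(w))                         # alignment reinforcement
            score += math.log(lm_prob(padded[k], padded[k + 1], w))  # trigram LM
        return score

    def per_word_score(words, align_prob, lm_prob):
        """Average log score per word (counters the short-hypothesis bias)."""
        return hypothesis_score(words, align_prob, lm_prob) / max(len(words), 1)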

Slide 8: The MEMT Algorithm
Parameters:
– "Lingering word" horizon: how long a word is allowed to linger when words following it have already been used
– "Lookahead" horizon: how far ahead to look for an alternative for a word that is not aligned
– "POS matching": limit the search for an alternative to words of the same POS
(A sketch of the two horizons follows below.)
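One possible reading of the two horizons as extension-time checks; the values and helper names are placeholders, not the tuned settings:

    LINGER_HORIZON = 2      # placeholder value
    LOOKAHEAD_HORIZON = 3   # placeholder value

    def may_still_use(pos, used_positions, linger=LINGER_HORIZON):
        """A word may linger only while few words after it have been used."""
        return sum(1 for p in used_positions if p > pos) <= linger

    def alternative_window(pos, sentence_len, lookahead=LOOKAHEAD_HORIZON):
        """Positions in which an alternative for an unaligned word is sought."""
        return range(pos + 1, min(pos + 1 + lookahead, sentence_len))

The POS-matching switch would then simply filter this window to words with the same POS tag.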

Slide 9: Example
IBM: korea stands ready to allow visits to verify that it does not manufacture nuclear weapons (0.7407)
ISI: North Korea Is Prepared to Allow Washington to Verify that It Does Not Make Nuclear Weapons (0.8007)
CMU: North Korea prepared to allow Washington to the verification of that is to manufacture nuclear weapons (0.7668)
Selected MEMT sentence: north korea is prepared to allow washington to verify that it does not manufacture nuclear weapons. (0.8894, -2.75135)

Slide 10: Example
IBM: victims russians are one man and his wife and abusing their eight year old daughter plus a ( 11 and 7 years ) man and his wife and driver, egyptian nationality. (0.6327)
ISI: The victims were Russian man and his wife, daughter of the most from the age of eight years in addition to the young girls ) 11 7 years ( and a man and his wife and the bus driver Egyptian nationality. (0.7054)
CMU: the victims Cruz man who wife and daughter both critical of the eight years old addition to two Orient ( 11 ) 7 years ) woman, wife of bus drivers Egyptian nationality. (0.5293)
Selected MEMT sentence: the victims were russian man and his wife and daughter of the eight years from the age of a 11 and 7 years in addition to man and his wife and bus drivers egyptian nationality. (0.7647, -3.25376)
Oracle: the victims were russian man and wife and his daughter of the eight years old from the age of a 11 and 7 years in addition to the man and his wife and bus drivers egyptian nationality young girls. (0.7964, -3.44128)

Slide 11: Example
IBM: the sri lankan prime minister criticizes head of the country's (0.8862)
ISI: The President of the Sri Lankan Prime Minister Criticized the President of the Country (0.8660)
CMU: Lankan Prime Minister criticizes her country (0.6615)
Selected MEMT sentence: the sri lankan prime minister criticizes president of the country. (0.9353, -3.27483)
Oracle: the sri lankan prime minister criticizes president of the country's. (0.9767, -3.75805)

Slide 12: Current System
– Some features of the decoding algorithm and the final scoring are still under experimentation
– Initial development tests performed on TIDES 2003 Arabic-to-English MT data, using IBM, ISI and CMU SMT system output
– Further development tests performed on Arabic-to-English EBMT, Apptek and SYSTRAN system output, and on three Chinese-to-English COTS systems
– Integrated within the CACI REFLEX Demonstration Platform

Slide 13: Experimental Results: Chinese-to-English

System                              METEOR Score
Online Translator A                 .4917
Online Translator B                 .4859
Online Translator C                 .4910
Choosing best online translation    .5381
MEMT                                .5301
Best hypothesis generated by MEMT   .5840

Slide 14: Experimental Results: Arabic-to-English

System                              METEOR Score
Apptek                              .4241
EBMT                                .4231
Systran                             .4405
Choosing best online translation    .4432
MEMT                                .5185
Best hypothesis generated by MEMT   .5883

Slide 15: Other Examples
http://www-2.cs.cmu.edu/afs/cs/user/alavie/Students/Shyam/Comps100

Slide 16: Conclusions
– New sentence-level MEMT approach with promising performance
– Easy to run on both research and COTS systems
– Tuning of the parameter space for hypothesis generation: too tuned to METEOR?
– Decoding is still suboptimal:
  – Oracle scores show there is much room for improvement
  – Need for additional discriminant features
  – Some ideas currently under investigation

Slide 17: Approach-1: Lattice MEMT
Approach:
– Multiple MT systems produce a lattice of output segments
– Create a "union" lattice of the various systems (a sketch follows below)
– Decode the joint lattice and select the best synthetic output
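A minimal sketch of the union step, assuming each engine's lattice reduces to arcs over common source word positions; the data layout is an assumption, not the actual lattice format:

    def union_lattice(engine_lattices):
        """Pool arcs from all engines into one lattice keyed by source span.
        engine_lattices: list of lists of (src_start, src_end, phrase, prob)."""
        merged = {}
        for lattice in engine_lattices:
            for start, end, phrase, prob in lattice:
                merged.setdefault((start, end), []).append((phrase, prob))
        return merged

    ibm = [(0, 2, "north korea", 0.6)]
    cmu = [(0, 2, "korea", 0.4), (2, 4, "is prepared", 0.5)]
    print(union_lattice([ibm, cmu]))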

Slide 18: Approach-1: Lattice MEMT
Lattice decoder from CMU's SMT system:
– Lattice arcs are scored uniformly using word-to-word translation probabilities, regardless of which engine produced the arc
– The decoder searches for the path that optimizes a combination of the translation model (TM) score and the language model (LM) score (a scoring sketch follows below)
– The decoder can also reorder words or phrases (up to 4 positions ahead)
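A sketch of how one path through the joint lattice could be scored, combining engine-agnostic word-to-word translation probabilities with a trigram LM; the weights and the lm_prob helper are assumptions:

    import math

    def path_score(arcs, lm_prob, tm_weight=1.0, lm_weight=1.0):
        """arcs: (target_word, tm_prob) pairs along one lattice path."""
        words = [w for w, _ in arcs]
        tm = sum(math.log(p) for _, p in arcs)  # uniform TM scoring of arcs
        padded = ["<s>", "<s>"] + words
        lm = sum(math.log(lm_prob(padded[k], padded[k + 1], w))
                 for k, w in enumerate(words))
        return tm_weight * tm + lm_weight * lm

The decoder would then search over paths (including the limited reorderings) for the highest-scoring one.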

Slide 19: Initial Experiment: Hindi-to-English Systems
Put together a scenario with "miserly" data resources:
– Elicited Data corpus: 17,589 phrases
– Cleaned portion (top 12%) of the LDC dictionary: ~2,725 Hindi words (23,612 translation pairs)
– Manually acquired resources during the DARPA SLE:
  – 500 manual bigram translations
  – 72 manually written phrase transfer rules
  – 105 manually written postposition rules
  – 48 manually written time expression rules
– No additional parallel text!!

Slide 20: Initial Experiment: Hindi-to-English Systems
Tested on a section of the JHU-provided data: 258 sentences with four reference translations
– SMT system (stand-alone)
– EBMT system (stand-alone)
– XFER system (naïve decoding)
– XFER system with "strong" decoder:
  – No grammar rules (baseline)
  – Manually developed grammar rules
  – Automatically learned grammar rules
– XFER+SMT with strong decoder (MEMT)

Slide 21: Results on JHU Test Set (very miserly training data)

System                          BLEU     M-BLEU   NIST
EBMT                            0.058    0.165    4.22
SMT                             0.093    0.191    4.64
XFER (naïve), man. grammar      0.055    0.177    4.46
XFER (strong), no grammar       0.109    0.224    5.29
XFER (strong), learned grammar  0.116    0.231    5.37
XFER (strong), man. grammar     0.135    0.243    5.59
XFER+SMT                        0.136    0.243    5.65

Slide 22: Effect of Reordering in the Decoder

Slide 23: Further Experiments: Arabic-to-English Systems
– Combined:
  – CMU's SMT system
  – CMU's EBMT system
  – UMD rule-based system
  – (IBM's system didn't work out)
– TM scores from the CMU SMT system
– Built a large new English LM
– Tested on the TIDES 2003 test set

Slide 24: Arabic-to-English Systems, Lattice MEMT Results

System         BLEU                    M-BLEU                  METEOR
UMD only       .0335 [.0300, .0374]    .1099 [.1074, .1129]    .2356 [.2293, .3419]
EBMT only      .1090 [.1017, .1160]    .1861 [.1799, .1921]    .3666 [.3574, .3752]
SMT only       .2779 [.2782, .2886]    .3499 [.3412, .3582]    .5754 [.5649, .5855]
EBMT+UMD       .1206 [.1133, .1288]    .2069 [.2010, .2135]    .4061 [.3976, .4151]
SMT+EBMT       .2586 [.2477, .2702]    .3309 [.3222, .3403]    .5450 [.5360, .5545]
SMT+UMD        .2622 [.2519, .2724]    .3363 [.3281, .3446]    .5666 [.5575, .5764]
SMT+UMD+EBMT   .2527 [.2426, .2640]    .3262 [.3181, .3349]    .5394 [.5290, .5504]

Slide 25: Lattice MEMT
Main drawbacks:
– Requires the MT engines to provide lattice output → difficult to obtain!
– The lattice output from all engines must be compatible: common indexing based on source word positions → difficult to standardize!
– The common TM used for scoring edges may not work well for all engines
– Decoding does not take into account any reinforcement from multiple engines proposing the same translation for a portion of the input

Slide 26: Demonstration

Slide 27: Experimental Results: Arabic-to-English

System                              P / R / F1 / Fmean
Apptek                              .5137 / .5336 / .5235 / .5316
EBMT                                .5710 / .4781 / .5204 / .4860
Systran                             .4994 / .5474 / .5223 / .5422
Choosing best online translation
MEMT                                .5383 / .6212 / .5768 / .6118
Best hypothesis generated by MEMT

