Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau.


1 Machine Translation A Presentation by: Julie Conlonova, Rob Chase, and Eric Pomerleau

2 Overview
Language Alignment System
Datasets
- Sentence-aligned sets for training (e.g. the Hansards Corpus, the European Parliament Proceedings Parallel Corpus)
- A word-aligned set for testing and evaluation, to measure accuracy and precision
Decoding

3 Language Alignment
Goal: Produce a word-aligned set from a sentence-aligned dataset.
First step on the road toward Statistical Machine Translation.
Example Problem:
- The motion to adjourn the House is now deemed to have been adopted.
- La motion portant que la Chambre s'ajourne maintenant est réputée adoptée.

4 IBM Models 1 and 2
Source: Kevin Knight, A Statistical MT Tutorial Workbook, 1999
Each model can be used on its own to produce a word-aligned dataset.
Both are trained with the EM algorithm.
Model 1 produces T-values (word translation probabilities) based on normalized fractional counting of corresponding words.
Model 2 additionally uses A-values for “reverse distortion probabilities”: probabilities based on the positions of the words.

5 Training Data
European Parliament Proceedings Parallel Corpus 1996-2003
Aligned Languages:
- English - French
- English - Dutch
- English - Italian
- English - Finnish
- English - Portuguese
- English - Spanish
- English - Greek

6 Training Data cont.
Eliminated:
- Misaligned sentences
- Sentences with 50 or more words
- XML tags
- Symbols and numerical characters other than commas and periods
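The filtering steps above could be sketched roughly as follows. This is a minimal illustration, not the presenters' code: the names `clean_sentence` and `clean_pair` are hypothetical, and treating a pair with an empty side as "misaligned" is an assumption, since the slides do not say how misalignment was detected.

```python
import re

MAX_WORDS = 50  # slide 6: sentences with 50 or more words are dropped

TAG_RE = re.compile(r"<[^>]+>")  # matches XML tags such as <P> or <SPEAKER ID=1>


def clean_sentence(text):
    """Strip XML tags, then every character that is not a letter,
    whitespace, a comma, or a period."""
    text = TAG_RE.sub(" ", text)
    text = "".join(ch for ch in text
                   if ch.isalpha() or ch.isspace() or ch in ",.")
    return " ".join(text.split())  # collapse runs of whitespace


def clean_pair(source, target):
    """Return the cleaned (source, target) pair, or None if it is rejected."""
    source, target = clean_sentence(source), clean_sentence(target)
    # Assumption: an empty side is treated as a misaligned pair.
    if not source or not target:
        return None
    if len(source.split()) >= MAX_WORDS or len(target.split()) >= MAX_WORDS:
        return None
    return source, target
```

Note that this literal reading of the slide also removes apostrophes, so a French token such as "s'ajourne" would lose its elision mark; a real pipeline might want to keep them.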

7 Ideally… http://www.cs.berkeley.edu/~klein/cs294-5

8 Bypassing Interlingua: Models I-III
Variables contributing to the probability of a sentence:
- Correlation between words in the source/target languages
- Fertility of a word
- Correlation between order of words in source sentence and order of words in target

9 A Translation Matrix
         Rob   Cat   is    Dog
Rob      1     0     0     0
Gato     0     1     0     0
es       0     0     0.5   0
esta     0     0     0.5   0
Perro    0     0     0     1

10 Building the Translation Matrix: Starting from alignments
- Find the sentence alignment.
- If a word in the source aligns with a word in the target, increment the corresponding cell of the translation matrix.
- Normalize the translation matrix.
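When word alignments are already known, the count-and-normalize procedure can be sketched as below. The function name and the input layout (word lists plus a list of `(i, j)` index links) are assumptions made for illustration.

```python
from collections import defaultdict


def build_translation_matrix(aligned_corpus):
    """aligned_corpus: iterable of (source_words, target_words, links),
    where links is a list of (source_index, target_index) pairs.
    Returns table[source_word][target_word] = relative frequency."""
    counts = defaultdict(lambda: defaultdict(float))
    for source, target, links in aligned_corpus:
        for i, j in links:
            counts[source[i]][target[j]] += 1.0  # increment the matrix cell

    # Normalize each row so that it sums to 1.
    table = {}
    for word, row in counts.items():
        total = sum(row.values())
        table[word] = {t: c / total for t, c in row.items()}
    return table
```

With one sentence linking "is" to "es" and another linking "is" to "esta", the row for "is" comes out as 0.5 and 0.5, matching the example matrix on slide 9.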

11 Can’t find alignments
Most sentences in the Hansards corpus are 60 words long. Many are over 100.
A pair of 100-word sentences has 100^100 possible alignments.
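The size of that number is easy to check. The sketch below assumes, as IBM Models 1 and 2 do, that each target word links to exactly one source word; the `allow_null` flag is a hypothetical extra that adds the empty (NULL) source word as one more choice.

```python
def alignment_count(source_len, target_len, allow_null=False):
    """Number of alignments when every target word picks one source word."""
    choices = source_len + 1 if allow_null else source_len
    return choices ** target_len


n = alignment_count(100, 100)
print(len(str(n)))  # number of decimal digits in 100^100
```

Since 100^100 equals 10^200, enumerating alignments one by one is out of the question, which is why the following slides estimate counts without listing alignments explicitly.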

12 Counting
Rob is a boy.    Rob es nino.
Rob is tall.     Rob es alto.
Eric is tall.    Eric es alto.
…                …
Base counts on co-occurrence, weighted by sentence length.

13 Iterative Convergence
Use the Expectation Maximization (EM) algorithm.
Creates the translation matrix.
Columns: Rob, Is, Tall, boy
Rob    .66  .33  .25
es     .30  .66  .25
alto   .2   .05  .50
nino   .2   .05  0.5
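A minimal sketch of this kind of EM training, in the style of IBM Model 1, is shown below. It is an illustration under stated assumptions rather than the presenters' implementation: the function name is hypothetical, the NULL source word is omitted, and the table starts uniform.

```python
from collections import defaultdict


def train_model1(corpus, iterations=10):
    """corpus: list of (english_words, foreign_words) pairs.
    Returns t where t[(f, e)] approximates t(f | e)."""
    foreign_vocab = {f for _, fs in corpus for f in fs}
    uniform = 1.0 / len(foreign_vocab)
    t = defaultdict(lambda: uniform)  # uniform starting point

    for _ in range(iterations):
        count = defaultdict(float)  # fractional counts for (f, e)
        total = defaultdict(float)  # fractional counts for e
        # E-step: split each foreign word's count across the English
        # words of the same sentence, in proportion to the current t.
        for es, fs in corpus:
            for f in fs:
                norm = sum(t[(f, e)] for e in es)
                for e in es:
                    frac = t[(f, e)] / norm
                    count[(f, e)] += frac
                    total[e] += frac
        # M-step: renormalize the fractional counts per English word.
        t = defaultdict(float,
                        {(f, e): c / total[e] for (f, e), c in count.items()})
    return t
```

In the first iteration every pairing in a sentence gets an equal share, so counts reflect plain co-occurrence weighted by sentence length, as on slide 12. Later iterations shift probability toward pairings that keep appearing together across sentences.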

14 Distorting the Sentence Word order changes between languages How is a sentence with 2 words distorted? How is a sentence with 3 words distorted? How is a sentence with … To keep track of this information we use…

15 A tesseract! (A quadruply nested default dictionary) This could be a problem if there are more than 100 words in a sentence. 100x100x100x100 = too big for RAM and takes too much time
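A quadruply nested default dictionary can be built along these lines. The helper name `nested_dict` and the index order `a[l][m][j][i]` (source length, target length, target position, source position) are assumptions; the slides do not give the actual layout.

```python
from collections import defaultdict


def nested_dict(depth, leaf=float):
    """Return a defaultdict nested `depth` levels deep with `leaf` values."""
    if depth == 1:
        return defaultdict(leaf)
    return defaultdict(lambda: nested_dict(depth - 1, leaf))


a = nested_dict(4)
a[3][3][1][2] += 0.5  # entries are created on first access

# Worst case if every index ranges over 100 values:
print(100 ** 4)  # 100,000,000 entries
```

Because cells are only created when touched, the structure stays small for short sentences, but a fully populated table at 100 words per side would need 100^4 entries, which matches the concern on the slide.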

16 Broad Look at MT
“The translation process can be described simply as:
1. Decoding the meaning of the source text, and
2. Re-encoding this meaning in the target language.”
- “Translation Process”, Wikipedia, May 2006

17 Decoding How to go from the T-matrix and A-matrix to a word alignment? There are several approaches…

18 Viterbi If only doing alignment, much smaller memory and time requirements. Returns optimal path. T-Matrix probabilities function as the “emission” matrix A-Matrix probabilities concerned with the positioning of words
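For Models 1 and 2, each target word chooses its source position independently, so the most probable (Viterbi) alignment can be found one target position at a time. The sketch below assumes hypothetical table layouts: `t[(f, e)]` for T-values and `a[(i, j, l, m)]` for A-values, with `a` optional so the same function covers Model 1.

```python
def viterbi_alignment(source, target, t, a=None):
    """For each target position j, return the source position i that
    maximizes t(f_j | e_i) * a(i | j, l, m)."""
    l, m = len(source), len(target)
    alignment = []
    for j, f in enumerate(target):
        def score(i):
            # Without an A-table, every position is equally likely (Model 1).
            position = a.get((i, j, l, m), 0.0) if a is not None else 1.0
            return t.get((f, source[i]), 0.0) * position
        alignment.append(max(range(l), key=score))
    return alignment
```

The cost is one pass over an l-by-m grid per sentence pair, which is why alignment alone needs far less time and memory than searching for a full translation.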

19 Decoding as a Translator Without supplying a translated sentence to the program, it is capable of being a stand-alone translator instead of a word aligner. However, while the Viterbi algorithm runs quickly with pruning for decoding, for translating the run time skyrockets.

20 Greedy Hill Climbing
Source: Knight & Koehn, What’s New in Statistical Machine Translation, 2003
Best-first search
2-step look-ahead to avoid getting stuck in most probable local maxima

21 Beam Search
Source: Knight & Koehn, What’s New in Statistical Machine Translation, 2003
Optimization of best-first search with heuristics and a “beam” of choices
Exponential tradeoff when increasing the “beam” width
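The idea of keeping only a "beam" of the best partial translations can be sketched as follows. This is a simplified, monotone (no reordering) decoder written for illustration: `options` (candidate translations with probabilities) and `lm_score` (a log-probability for a word given the previous word) are hypothetical inputs, not part of the presented system.

```python
import heapq
import math


def beam_search(source, options, lm_score, beam_width=3):
    """source: list of source words.
    options[word]: list of (translation, probability) pairs.
    lm_score(prev, word): log-probability of `word` following `prev`.
    Returns (log_score, translated_words) for the best hypothesis found."""
    beam = [(0.0, [])]  # each hypothesis is (log score, output so far)
    for word in source:
        candidates = []
        for score, output in beam:
            prev = output[-1] if output else "<s>"
            for translation, prob in options[word]:
                new_score = score + math.log(prob) + lm_score(prev, translation)
                candidates.append((new_score, output + [translation]))
        # Prune: keep only the `beam_width` highest-scoring hypotheses.
        beam = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
    return max(beam, key=lambda c: c[0])
```

With `beam_width=1` this reduces to greedy search and can commit to a word that later scores badly in context; a wider beam keeps alternatives alive at the cost of more work per step.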

22 Other Decoding Methods
Source: Knight & Koehn, What’s New in Statistical Machine Translation, 2003
Finite State Transducer
- Mapping between languages based on a finite automaton
Parsing
- String to Tree Model

23 Problem: One to Many
Necessary to take all alignments over a certain probability in order to capture the “probability that e has fertility at least a given value”
Source: Al-Onaizan, Curin, Jahr, et al., Statistical Machine Translation, 1999
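One way to read "take all alignments over a certain probability" is to keep every link whose T-value clears a cutoff, rather than only the single best link per word. The sketch below follows that reading; the function names, the `t[(f, e)]` layout, and the threshold of 0.2 are all assumptions for illustration.

```python
def links_above_threshold(source, target, t, threshold=0.2):
    """Return every (i, j) link with t(f_j | e_i) >= threshold."""
    return [(i, j)
            for i, e in enumerate(source)
            for j, f in enumerate(target)
            if t.get((f, e), 0.0) >= threshold]


def fertilities(links, source_len):
    """Count how many target words each source position is linked to."""
    counts = [0] * source_len
    for i, _ in links:
        counts[i] += 1
    return counts
```

For example, if English "not" has comparable T-values for French "ne" and "pas", both links survive the cutoff and "not" is recorded with fertility 2.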

24 Results
Study done in 2003 on word alignment error rates in the Hansards corpus:
- Model 2: 29.3% on 8K training sentence pairs, 19.5% on 1.47M training sentence pairs
- Optimized Model 6: 20.3% on 8K training sentence pairs, 8.7% on 1.47M training sentence pairs
Source: Och and Ney, A Systematic Comparison of Various Statistical Alignment Models, 2003
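The alignment error rate (AER) used in that study compares a predicted alignment A against hand-annotated "sure" links S and "possible" links P, as AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|). A small sketch, with a hypothetical function name and links represented as `(i, j)` tuples:

```python
def alignment_error_rate(predicted, sure, possible):
    """AER = 1 - (|A & S| + |A & P|) / (|A| + |S|).
    Assumes predicted and sure are not both empty."""
    a = set(predicted)
    s = set(sure)
    p = set(possible) | s  # every sure link also counts as possible
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))
```

A prediction that reproduces the sure links exactly scores 0, and lower is better, so the figures on the slide show error falling as training data grows.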

25 Expected Accuracy
70% overall
Language performance:
- Dutch French Italian, Spanish, Portuguese
- Greek
- Finnish

26 Possible Future Work
Given more time, we would have implemented IBM Model 3.
Model 3 additionally uses n, p, and d parameters for weighted alignments:
- N, fertility: the number of words produced by one word
- D, distortion
- P, a parameter involving words that aren’t involved directly
Invokes Model 2 for scoring.

27 Another Possible Translation Scheme
Example-Based Machine Translation
- Translation-by-Analogy
- Can sometimes achieve better than the “gist” translations from other models

28 Why Is Improving Machine Translation Necessary?

29 A Chinese to English Translation

30 The End Are there any questions/comments?
