Statistical Phrase-Based Translation. Authors: Koehn, Och, Marcu. Presented by Albert Bertram.


1 Statistical Phrase-Based Translation Authors: Koehn, Och, Marcu Presented by Albert Bertram Titles, charts, graphs, figures and tables were extracted from the paper. Critique and snarky remarks, however, are original.

2 Motivation We have a new way of learning phrase translations, but… “What is the best method to extract phrase translation pairs?”

3 What to do, what to do Compose a framework for consistent comparison Implement each algorithm Compare the results

4 Evaluation Framework Phrases Models involved Language model Statistical model for translation Distortion model Decoder

5 Evaluation Framework: Phrases We all know what phrases are, right? NP, VP, wait, what? Oh. Here, they’re generic spanning and non-overlapping subsequences of words. Are these guys really linguists?

6 Evaluation Framework: Models Language Model: trigram, usually $p(e_n \mid e_{n-1}, e_{n-2})$ Translation Model: $\arg\max_e p(e \mid f) = \arg\max_e p(f \mid e)\,p(e)$; $e_{best} = \arg\max_e p(f \mid e)\,p_{LM}(e)\,\omega^{\mathrm{length}(e)}$ p(f|e) is decomposed into
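
For reference, the decomposition this slide leads into can be restated as follows. This is a restatement from memory of the Koehn, Och, Marcu paper, not text from the slide: the foreign sentence is split into $I$ phrases, each phrase is translated with probability $\phi$, and $\phi$ is estimated by relative frequency over the extracted phrase pairs.

```latex
p(\bar{f}_1^I \mid \bar{e}_1^I) = \prod_{i=1}^{I} \phi(\bar{f}_i \mid \bar{e}_i)\, d(a_i - b_{i-1}),
\qquad
\phi(\bar{f} \mid \bar{e}) = \frac{\mathrm{count}(\bar{f}, \bar{e})}{\sum_{\bar{f}'} \mathrm{count}(\bar{f}', \bar{e})}
```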

7 Evaluation Framework: Models Distortion Model $d(a_i - b_{i-1})$, where $a_i$ = start position of the foreign phrase translated into the $i$th English phrase and $b_{i-1}$ = end position of the foreign phrase translated into the $(i-1)$th English phrase. Learned from the joint probability model Ping told us about.
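
To make the $a_i$ and $b_{i-1}$ bookkeeping concrete, here is a minimal sketch. It does not learn the distortion distribution from the joint model as the slide describes; instead it assumes the simpler parametric form $d = \alpha^{|a_i - b_{i-1} - 1|}$, which the paper also mentions as an alternative. The value `alpha=0.5` and the function names are illustrative assumptions.

```python
def distortion(a_i, b_prev, alpha=0.5):
    """Assumed parametric form: d(a_i - b_{i-1}) = alpha ** |a_i - b_{i-1} - 1|."""
    return alpha ** abs(a_i - b_prev - 1)


def total_distortion(spans, alpha=0.5):
    """Multiply distortion costs over a sequence of foreign phrase spans.

    spans: (start, end) foreign positions, 1-based and inclusive, listed in
    the order in which their English translations are produced.
    """
    score = 1.0
    b_prev = 0  # "end position" before the first foreign word
    for start, end in spans:
        score *= distortion(start, b_prev, alpha)
        b_prev = end
    return score


# Monotone translation: every phrase starts right after the previous one ends.
monotone = [(1, 2), (3, 3), (4, 6)]
# Same phrases, but the second foreign phrase is translated first.
reordered = [(3, 3), (1, 2), (4, 6)]

print(total_distortion(monotone))   # no jumps, so no penalty
print(total_distortion(reordered))  # each jump is penalised
```

A monotone segmentation incurs no penalty because every jump distance is zero; any reordering multiplies in one factor of `alpha` per skipped position.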

8 Evaluation Framework: Decoder Left-to-right incremental Stack-based beam search Estimates future costs Same decoder used in all experiments

9 Baseline Experiments Word-based alignment Syntactic phrases Phrase alignments

10 Baseline: Word-based Alignment Learn the phrases from word alignments
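
A minimal sketch of learning phrases from a word alignment: collect every phrase pair that is consistent with the alignment, meaning no word inside the pair is aligned to a word outside it. The function name, the `max_len` parameter, and the toy Spanish-English sentence are illustrative assumptions. For brevity the sketch only takes the tightest foreign span for each English span and does not extend over unaligned boundary words.

```python
def extract_phrases(e_words, f_words, alignment, max_len=3):
    """Collect phrase pairs consistent with a word alignment.

    alignment: set of (e_index, f_index) pairs, 0-based.
    Returns a set of (english_phrase, foreign_phrase) strings.
    """
    pairs = set()
    for e_start in range(len(e_words)):
        for e_end in range(e_start, min(e_start + max_len, len(e_words))):
            # Foreign positions linked to any word in the English span.
            f_pos = [j for (i, j) in alignment if e_start <= i <= e_end]
            if not f_pos:
                continue
            f_start, f_end = min(f_pos), max(f_pos)
            if f_end - f_start + 1 > max_len:
                continue
            # Consistency check: no foreign word inside the span may be
            # aligned to an English word outside the English span.
            inconsistent = any(
                f_start <= j <= f_end and not (e_start <= i <= e_end)
                for (i, j) in alignment
            )
            if inconsistent:
                continue
            pairs.add((" ".join(e_words[e_start:e_end + 1]),
                       " ".join(f_words[f_start:f_end + 1])))
    return pairs


e = ["the", "green", "house"]
f = ["la", "casa", "verde"]
a = {(0, 0), (1, 2), (2, 1)}  # the-la, green-verde, house-casa
phrases = extract_phrases(e, f, a)
for pair in sorted(phrases):
    print(pair)
```

Here "the green" is not extracted: its foreign span would have to cover "la casa verde", but "casa" is aligned to "house", which lies outside the English span.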

11 Baseline: Syntactic Phrases Learn only syntactically correct phrases Start with the word based alignment Prune out the phrase pairs which aren’t subtrees in the parsed sentences for either language.
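
The pruning step can be sketched as a span filter: a phrase pair survives only if its English span and its foreign span are each covered by a single subtree. The tree encoding (nested tuples of the form `(label, child, ...)` with plain strings as words), the function names, and the toy parses are all assumptions made for illustration.

```python
def constituent_spans(tree, start=0, spans=None):
    """Return (next_position, spans) for a parse tree.

    tree is either a word (str) or a tuple (label, child, ...).
    spans collects the inclusive (start, end) word span of every subtree.
    """
    if spans is None:
        spans = set()
    if isinstance(tree, str):
        spans.add((start, start))
        return start + 1, spans
    pos = start
    for child in tree[1:]:
        pos, _ = constituent_spans(child, pos, spans)
    spans.add((start, pos - 1))
    return pos, spans


def syntactic_only(phrase_spans, e_tree, f_tree):
    """Keep phrase pairs whose spans are subtrees in both parses."""
    _, e_spans = constituent_spans(e_tree)
    _, f_spans = constituent_spans(f_tree)
    return [(e, f) for (e, f) in phrase_spans if e in e_spans and f in f_spans]


e_tree = ("S", ("NP", "the", "house"), ("VP", "is", "big"))
f_tree = ("S", ("NP", "das", "Haus"), ("VP", "ist", "gross"))

# Candidate phrase pairs as ((e_start, e_end), (f_start, f_end)) spans.
candidates = [((0, 1), (0, 1)),   # "the house" / "das Haus"
              ((1, 2), (1, 2)),   # "house is"  / "Haus ist"
              ((2, 3), (2, 3))]   # "is big"    / "ist gross"
kept = syntactic_only(candidates, e_tree, f_tree)
print(kept)
```

"house is" crosses the NP/VP boundary, so it is pruned even though it is a perfectly consistent phrase pair under the word alignment.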

12 Baseline: Phrase Alignment Marcu and Wong, 2002 Yes, this is the paper Ping just presented.

13 Experiment Background Europarl and BLEU Training corpus of 10, 20, 40, 80, 160 and 320 kilo-sentence pairs

14 Baseline Results Notice the bottom row there? Comparing these models is like taking a 5-year-old to a chess tournament.

15 Baseline Results

16 More Experiments Weighting Syntactic Phrases Maximum Phrase Length Lexical Weighting Phrase Extraction Heuristic Simpler Underlying Word-Based Models Other Languages

17 Experiments and Results Weighting Syntactic Phrases Double the count on syntactic phrases Is that sufficient? Insufficient post-analysis on this one The same BLEU score Were the translations in better syntax? Did the translations at least use more syntactic phrases?
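
A small sketch of what "double the count" does to the phrase table, assuming the table is estimated by relative frequency, $\phi(f \mid e) = \mathrm{count}(f, e) / \sum_{f'} \mathrm{count}(f', e)$. The function name, the `syntactic_weight` parameter, and the toy observations are illustrative assumptions.

```python
from collections import Counter


def phrase_table(observations, syntactic_weight=2.0):
    """Relative-frequency phrase table with extra weight on syntactic pairs.

    observations: iterable of (f_phrase, e_phrase, is_syntactic) tuples,
    one per extracted phrase pair occurrence.
    Returns {(f_phrase, e_phrase): phi(f | e)}.
    """
    counts = Counter()
    totals = Counter()
    for f, e, is_syntactic in observations:
        weight = syntactic_weight if is_syntactic else 1.0
        counts[(f, e)] += weight
        totals[e] += weight
    return {(f, e): c / totals[e] for (f, e), c in counts.items()}


obs = [("das Haus", "the house", True),    # syntactic pair, seen once
       ("Haus", "the house", False)]       # non-syntactic pair, seen once

plain = phrase_table(obs, syntactic_weight=1.0)
doubled = phrase_table(obs, syntactic_weight=2.0)
print(plain[("das Haus", "the house")], doubled[("das Haus", "the house")])
```

Doubling shifts probability mass toward the syntactic pair without removing the alternative, which is one plausible reason the BLEU score could stay the same while the choice of phrases changes.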

18 Experiments and Results Maximum Phrase Length

19 Experiments and Results Lexical Weighting Lexical probability distribution Lexical Weight
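
The two quantities named on this slide can be written out as follows. This is a restatement from memory of the paper's definitions, not text from the slide: $w$ is estimated by relative frequency from the word alignments, and the lexical weight of a phrase pair under an alignment $a$ averages $w$ over the English words linked to each foreign word, with unaligned foreign words treated as linked to NULL.

```latex
w(f \mid e) = \frac{\mathrm{count}(f, e)}{\sum_{f'} \mathrm{count}(f', e)},
\qquad
p_w(\bar{f} \mid \bar{e}, a) = \prod_{i=1}^{n} \frac{1}{|\{\, j \mid (i, j) \in a \,\}|} \sum_{(i, j) \in a} w(f_i \mid e_j)
```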

20 Experiments and Results Lexical Weighting Example
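
A worked example in code, under the assumption that the lexical weight is a product over foreign words, where each factor is the average of $w(f \mid e)$ over the English words that foreign word is aligned to (NULL if it is unaligned). The alignment pattern follows the example I recall from the paper (f1 to e1, f2 to both e2 and e3, f3 unaligned); the probability values and the function name are made up for illustration.

```python
def lexical_weight(f_words, e_words, alignment, w):
    """Lexical weight of a phrase pair under one word alignment.

    alignment: set of (i, j) linking f_words[i] to e_words[j], 0-based.
    w: dict mapping (foreign_word, english_word) to a lexical probability;
       unaligned foreign words are looked up against the token "NULL".
    """
    weight = 1.0
    for i, f in enumerate(f_words):
        linked = [e_words[j] for (fi, j) in alignment if fi == i] or ["NULL"]
        # Average w(f | e) over all English words linked to this foreign word.
        weight *= sum(w.get((f, e), 0.0) for e in linked) / len(linked)
    return weight


f_phrase = ["f1", "f2", "f3"]
e_phrase = ["e1", "e2", "e3"]
links = {(0, 0), (1, 1), (1, 2)}          # f3 is left unaligned
w = {("f1", "e1"): 0.5,
     ("f2", "e2"): 0.25,
     ("f2", "e3"): 0.75,
     ("f3", "NULL"): 0.5}

# w(f1|e1) * 1/2 * (w(f2|e2) + w(f2|e3)) * w(f3|NULL)
print(lexical_weight(f_phrase, e_phrase, links, w))
```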

21 Experiments and Results Lexical Weighting With multiple alignments Extended to fit this model
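
Both bullets can be written out, again as a restatement from memory of the paper rather than text from the slide. With several candidate alignments for a phrase pair, the one with the highest lexical weight is used; the translation model is then extended with the lexical weight as an extra factor, where the exponent $\lambda$ controls its strength.

```latex
p_w(\bar{f} \mid \bar{e}) = \max_{a}\, p_w(\bar{f} \mid \bar{e}, a),
\qquad
p(\bar{f}_1^I \mid \bar{e}_1^I) = \prod_{i=1}^{I} \phi(\bar{f}_i \mid \bar{e}_i)\, d(a_i - b_{i-1})\, p_w(\bar{f}_i \mid \bar{e}_i, a)^{\lambda}
```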

22 Experiments and Results Lexical Weighting Improvement: 0.01 BLEU

23 Experiments and Results Phrase Extraction Heuristic Align Bidirectionally Note this gives two different word alignment sets Start with the intersection of the two sets Add possible alignments Only if they’re in the union of the sets Only if they connect at least one previously unaligned word

24 Experiments and Results Phrase Extraction Heuristic Algorithm Start with the first English word Expand only directly adjacent alignment points Move to the next English word, repeat. Finally add non-adjacent alignment points which meet the heuristic criteria.
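
The heuristic described on these two slides can be sketched as below. This is a simplified reading, not the paper's exact procedure: it grows from the intersection using only directly adjacent (horizontal or vertical) neighbours, visits points in English-word order, and interprets "connect at least one previously unaligned word" as "the English word or the foreign word is not yet aligned". The function name and the toy alignments are assumptions.

```python
def symmetrize(e2f, f2e):
    """Combine two directional word alignments.

    e2f, f2e: sets of (e_index, f_index) points from the two directions.
    """
    union = e2f | f2e
    alignment = set(e2f & f2e)          # start with the intersection
    neighbours = [(-1, 0), (1, 0), (0, -1), (0, 1)]

    def e_aligned(i):
        return any(p[0] == i for p in alignment)

    def f_aligned(j):
        return any(p[1] == j for p in alignment)

    def connects_new_word(point):
        return not e_aligned(point[0]) or not f_aligned(point[1])

    # Grow: repeatedly add directly adjacent points that are in the union.
    added = True
    while added:
        added = False
        for (i, j) in sorted(alignment):          # English-word order
            for di, dj in neighbours:
                cand = (i + di, j + dj)
                if (cand in union and cand not in alignment
                        and connects_new_word(cand)):
                    alignment.add(cand)
                    added = True

    # Final step: add non-adjacent points that still meet the criteria.
    for cand in sorted(union - alignment):
        if connects_new_word(cand):
            alignment.add(cand)
    return alignment


e2f = {(0, 0), (1, 1), (2, 2)}
f2e = {(0, 0), (1, 1), (0, 1), (1, 2), (3, 3)}
combined = symmetrize(e2f, f2e)
print(sorted(combined))
```

In this toy case the point (0, 1) is rejected because both of its words are already aligned, while (3, 3) is only picked up by the final non-adjacent step.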

25 Experiments and Results

26 Simpler Underlying Word-Based Models IBM models 1-4

27 Experiments and Results Other Languages

