Statistical Phrase-Based Translation Authors: Koehn, Och, Marcu Presented by Albert Bertram Titles, charts, graphs, figures and tables were extracted from the paper. Critique and snarky remarks, however, are original.
Motivation We have a new way of learning phrase translations, but… “What is the best method to extract phrase translation pairs?”
What to do, what to do Compose a framework for consistent comparison Implement each algorithm Compare the results
Evaluation Framework
Phrases
Models involved: language model, statistical model for translation, distortion model
Decoder
Evaluation Framework: Phrases We all know what phrases are, right? NP, VP… wait, what? Oh. Here, they’re generic spanning and non-overlapping subsequences of words. Are these guys really linguists?
Evaluation Framework: Models
Language model: trigram, usually p(e_n | e_{n-1}, e_{n-2})
Translation model: argmax_e p(e|f) = argmax_e p(f|e) p(e)
e_best = argmax_e p(f|e) p_LM(e) ω^length(e)
p(f|e) is decomposed into phrase translation and distortion probabilities: p(f|e) = Π_i φ(f̄_i | ē_i) d(a_i − b_{i-1})
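The argmax above, e_best = argmax_e p(f|e) p_LM(e) ω^length(e), is easiest to compute in log space. A minimal sketch; the function names and candidate format are my own assumptions, not from the paper:

```python
import math

def score(trans_logprob, lm_logprob, length, omega=1.0):
    # log of p(f|e) * p_LM(e) * omega^length(e):
    # the word-count weight omega turns into length * log(omega)
    return trans_logprob + lm_logprob + length * math.log(omega)

def best_translation(candidates, omega=1.0):
    # candidates: (english_string, log p(f|e), log p_LM(e)) triples
    return max(candidates,
               key=lambda c: score(c[1], c[2], len(c[0].split()), omega))
```

Setting ω < 1 penalizes longer outputs, which is the usual role of the length weight.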
Evaluation Framework: Models
Distortion model: d(a_i − b_{i-1})
a_i = start position of the foreign phrase translated into the i-th English phrase
b_{i-1} = end position of the foreign phrase translated into the (i−1)-th English phrase
Learned from the joint probability model Ping told us about
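The paper instantiates d with a single parameter, d(a_i − b_{i-1}) = α^|a_i − b_{i-1} − 1|. A minimal sketch; the α value here is an arbitrary assumption:

```python
def distortion_cost(a_i, b_prev, alpha=0.5):
    # d(a_i - b_{i-1}) = alpha^|a_i - b_{i-1} - 1|:
    # monotone phrase order (a_i directly follows b_{i-1}) costs 1.0;
    # reordering decays exponentially with the size of the jump.
    return alpha ** abs(a_i - b_prev - 1)
```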
Evaluation Framework: Decoder Left-to-right incremental Stack-based beam search Estimates future costs Same decoder used in all experiments
Baseline Experiments Word-based alignment Syntactic phrases Phrase alignments
Baseline: Word-based Alignment Learn the phrases from word alignments
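The extraction criterion can be sketched as: keep every phrase pair whose alignment links stay inside the pair's bounding box. This is a sketch under my own naming and a small max_len, not the paper's code:

```python
def extract_phrases(alignment, e_len, max_len=3):
    # alignment: set of (f_pos, e_pos) word-alignment links.
    # A phrase pair is consistent with the alignment if no link
    # connects a word inside the pair to a word outside it.
    pairs = []
    for e1 in range(e_len):
        for e2 in range(e1, min(e1 + max_len, e_len)):
            # foreign positions linked to the English span [e1, e2]
            fs = [f for (f, e) in alignment if e1 <= e <= e2]
            if not fs:
                continue
            f1, f2 = min(fs), max(fs)
            if f2 - f1 + 1 > max_len:
                continue
            # no link may leave the box [f1, f2] x [e1, e2]
            if all(e1 <= e <= e2 for (f, e) in alignment if f1 <= f <= f2):
                pairs.append(((f1, f2), (e1, e2)))
    return pairs
```

For a two-word swap alignment this yields both single-word pairs plus the spans that cover the whole swap, which is the intended behavior.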
Baseline: Syntactic Phrases Learn only syntactically correct phrases Start with the word-based alignment Prune out the phrase pairs which aren’t subtrees in the parsed sentences for either language.
Baseline: Phrase Alignment Marcu and Wong, 2002 Yes, this is the paper Ping just presented.
Experiment Background Europarl and BLEU Training corpora of 10, 20, 40, 80, 160, and 320 thousand sentence pairs
Baseline Results Notice the bottom row there? Comparing these models is like taking a 5-year-old to a chess tournament.
Baseline Results
More Experiments
Weighting Syntactic Phrases
Maximum Phrase Length
Lexical Weighting
Phrase Extraction Heuristic
Simpler Underlying Word-Based Models
Other Languages
Experiments and Results: Weighting Syntactic Phrases Double the count on syntactic phrases. Is that sufficient? Insufficient post-analysis on this one: the BLEU score was the same. Were the translations in better syntax? Did the translations at least use more syntactic phrases?
Experiments and Results Maximum Phrase Length
Experiments and Results Lexical Weighting Lexical probability distribution Lexical Weight
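The lexical weight scores a phrase pair word-by-word under a word alignment a, averaging the word translation probabilities w for each foreign word over its links (unaligned words are scored against NULL). Reconstructed here from the paper's definition, so verify against the original:

```latex
p_w(\bar{f} \mid \bar{e}, a) \;=\;
  \prod_{i=1}^{n} \frac{1}{\left|\{\, j : (i,j) \in a \,\}\right|}
  \sum_{(i,j) \in a} w(f_i \mid e_j)
```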
Experiments and Results Lexical Weighting Example
Experiments and Results Lexical Weighting With multiple alignments Extended to fit this model
Experiments and Results Lexical Weighting Improvement: 0.01 BLEU
Experiments and Results: Phrase Extraction Heuristic
Align bidirectionally; note this gives two different word alignment sets.
Start with the intersection of the two sets.
Add possible alignments only if they’re in the union of the sets and only if they connect at least one previously unaligned word.
Experiments and Results: Phrase Extraction Heuristic Algorithm
Start with the first English word.
Expand only directly adjacent alignment points.
Move to the next English word, repeat.
Finally add non-adjacent alignment points which meet the heuristic criteria.
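The growing loop above can be sketched roughly as follows. All names are mine, and the final non-adjacent pass is omitted, so this is an approximation of the heuristic, not the paper's exact procedure:

```python
def grow_alignment(intersection, union_links):
    # Start from the intersection of the two directional alignments,
    # then repeatedly add union links that are adjacent (incl.
    # diagonally) to an existing link and connect at least one
    # previously unaligned word.
    a = set(intersection)
    neighbors = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
                 (0, 1), (1, -1), (1, 0), (1, 1)]
    changed = True
    while changed:
        changed = False
        for (f, e) in sorted(union_links - a):
            aligned_f = {x for (x, _) in a}
            aligned_e = {y for (_, y) in a}
            adjacent = any((f + df, e + de) in a for df, de in neighbors)
            if adjacent and (f not in aligned_f or e not in aligned_e):
                a.add((f, e))
                changed = True
    return a
```

A union link that is never adjacent to the growing set (e.g. an isolated far-away point) is left out, which is exactly why the intersection/union split makes the extracted alignment conservative.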
Experiments and Results
Simpler Underlying Word-Based Models: IBM models 1-4
Experiments and Results Other Languages