Alignment Entropy as an Automated Predictor of Bitext Fidelity for Statistical Machine Translation
Shankar Ananthakrishnan, Rohit Prasad, Prem Natarajan

Presentation transcript:

1 Alignment Entropy as an Automated Predictor of Bitext Fidelity for Statistical Machine Translation
Shankar Ananthakrishnan, Rohit Prasad, Prem Natarajan
Speech and Language Processing Unit, BBN Technologies, Cambridge, MA

2 Talk progress
Statistical machine translation
Word alignment
Alignment entropy
Alignment error analysis
Bitext translation quality
Translation quality analysis
Conclusion and future directions

3 Statistical machine translation (SMT)
Start with a large bitext
  Parallel corpora or “sentence pairs”
  Lots (thousands/millions) of translation pairs!
Align sentence pairs at the word level
Extract phrase pairs or translation rules
  Constrained by word alignments
Decode source with extracted phrases/rules
  In conjunction with a language model

5 Word alignment
Link corresponding words in sentence pairs
  Forms basis of almost all SMT architectures
Statistical word alignment [Brown93, Vogel96]
  Probabilistic noisy-channel-based translation model T_m
  Estimated using expectation-maximization (EM)
  Choose most likely (Viterbi) alignment A_v
[Figure: example word alignment between “NULL tjAr bDAEp” and “commodity traders”]
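
As a rough illustration of the EM-then-Viterbi recipe on this slide, here is a minimal sketch using IBM Model 1, the simplest of the [Brown93] models. It is a stand-in chosen for brevity and not the IBM-4 model the talk trains with GIZA++. The function names `train_ibm1` and `viterbi_alignment` and the toy data layout are assumptions made for this example.

```python
from collections import defaultdict

NULL = "NULL"  # empty target word that any source word may align to


def train_ibm1(bitext, iterations=10):
    """Estimate t[(f, e)] = p(f | e) with EM (IBM Model 1, illustrative only).

    bitext: list of (source_words, target_words) pairs.
    """
    src_vocab = {f for fs, _ in bitext for f in fs}
    uniform = 1.0 / len(src_vocab)
    t = defaultdict(lambda: uniform)  # uniform initialisation
    for _ in range(iterations):
        count = defaultdict(float)  # expected counts of (f, e) links
        total = defaultdict(float)  # expected counts of e
        for fs, es in bitext:
            es = [NULL] + list(es)
            for f in fs:
                # E-step: posterior over the target word that generated f
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    c = t[(f, e)] / z
                    count[(f, e)] += c
                    total[e] += c
        # M-step: renormalise expected counts
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return t


def viterbi_alignment(fs, es, t):
    """Return, for each source word, the index of its most likely target
    word (0 denotes NULL, 1..len(es) the real target words)."""
    es = [NULL] + list(es)
    return [max(range(len(es)), key=lambda k: t[(f, es[k])]) for f in fs]
```

Under Model 1 each source word picks its link independently, so the Viterbi alignment is a per-word argmax. Richer models such as IBM-4 need a search procedure instead.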

6 Word alignment quality
Errors in alignment are caused by
  Data sparsity (low-resource languages)
  Translation errors
  Paraphrasing, non-literal translations
Alignment errors affect translation quality [Fraser07]
  Correcting or discarding bad alignments may help
  How do we identify poorly aligned constituents?
Need automated alignment quality metric
  Unsupervised: no manual intervention
  Correlates with supervised measures (e.g. AER)
  Scales up from the word level to the corpus level

7 An obvious candidate metric
Length-normalized Viterbi alignment score
  Monotonic function of p(A_v | T_m)
  By-product of alignment process
Benefits
  Readily available unsupervised metric
  Intuitive, easy to understand
Drawbacks
  A low-probability alignment need not be incorrect
  Poor granularity: only sentence-level alignment quality
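
To make the baseline concrete, here is a minimal sketch of one possible length normalization. The slide does not give the exact form, so two things are assumed here: the Viterbi alignment probability factorizes into one probability per source-word link, and the log-probability is divided by the number of source words. `normalized_viterbi_score` is a hypothetical name.

```python
import math


def normalized_viterbi_score(link_probs):
    """Average log-probability per source word of a Viterbi alignment.

    link_probs: probability of the chosen link for each source word
    (assumed factorization; real IBM-4 scores include further terms).
    """
    if not link_probs:
        raise ValueError("need at least one link probability")
    return sum(math.log(p) for p in link_probs) / len(link_probs)
```

The score is a single number per sentence pair, which is the granularity drawback noted above. A correct alignment made of rare words can still score poorly.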

9 Alignment entropy
Uncertainty of a link in the Viterbi alignment
  Higher uncertainty implies poorer alignment?
  Basis for automated alignment quality metric
Need a probability distribution over alignments
  Different contexts for a given sentence pair
  Estimate a multinomial distribution over word alignments
Bootstrapping simulates different contexts
  Resample original bitext with replacement
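
The resampling step can be sketched as follows. This is a minimal illustration and not the authors' code: `resample_bitext` and `bags_containing` are hypothetical names, sentence pairs are represented only by their indices, and treating the original bitext as bag 0 is a layout assumption.

```python
import random


def resample_bitext(num_pairs, num_bags, seed=0):
    """Build training contexts ("bags") as lists of sentence-pair indices.

    Bag 0 is the original bitext; each remaining bag draws num_pairs
    indices uniformly with replacement (a bootstrap sample).
    """
    rng = random.Random(seed)
    bags = [list(range(num_pairs))]
    for _ in range(num_bags - 1):
        bags.append([rng.randrange(num_pairs) for _ in range(num_pairs)])
    return bags


def bags_containing(bags, i):
    """Indices of the bags in which sentence pair i occurs at least once."""
    return [l for l, bag in enumerate(bags) if i in bag]
```

Each bag would then be word-aligned separately. For a large bitext, a given sentence pair appears in roughly 63% of the resampled bags (1 - 1/e), so the alignment distribution for a pair is estimated only from the bags that contain it.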

10 Defining alignment entropy
[Equation not preserved in the transcript; its annotations follow]
f_ij: j-th word of the i-th source sentence
Summation: iterates over all target words (including NULL)
Alignment variable: index of the target word to which f_ij is aligned
Indicator: = 1 iff f_ij is aligned to e_ik in the l-th bag, = 0 otherwise
Bag set: the set of resampled bitexts in which the i-th sentence pair occurs
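
The annotations above are consistent with a standard entropy over a relative-frequency estimate of the link distribution. The following is a reconstruction under that assumption, not a copy of the slide's equation. The symbols H, p, a_ij, delta_l and B_i are chosen here for illustration; only f_ij and e_ik appear in the transcript.

```latex
% a_{ij}: index of the target word to which f_{ij} is aligned
% B_i:    set of resampled bitexts (bags) containing the i-th sentence pair
% \delta_l(f_{ij}, e_{ik}) = 1 iff f_{ij} is aligned to e_{ik} in bag l, else 0
\begin{align}
  p(a_{ij} = k) &= \frac{1}{|B_i|} \sum_{l \in B_i} \delta_l(f_{ij}, e_{ik}) \\
  H(f_{ij})     &= -\sum_{k} p(a_{ij} = k)\, \log p(a_{ij} = k)
\end{align}
```

Here k ranges over all target words of the i-th sentence pair, including NULL, and terms with p(a_ij = k) = 0 contribute zero.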

11 Evaluating alignment entropy

12 Notes on alignment entropy
Measures variability of alignments across bags
Defined only for IBM model alignments
  Each source word linked to exactly one target word
Unidirectional: defined for source-target links
  Reverse alignment for target-source alignment entropy
  Combine the two for bidirectional alignment entropy
Sentence-pair specific
  Not fixed for a given source vocabulary word
  Defined for each source word in every sentence pair
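
A word-level computation matching these notes might look like the sketch below. It assumes the entropy is taken over the relative frequencies of a source word's links across the bags that contain its sentence pair, and it uses the natural logarithm. Averaging over words to get a sentence-level value is also an assumption, since the transcript does not state the aggregation rule. Both function names are hypothetical.

```python
import math
from collections import Counter


def alignment_entropy(links):
    """Entropy of one source word's links across bags.

    links: the target index chosen for this word in each bag containing
    its sentence pair (0 = NULL). One index per bag, reflecting the
    one-link-per-source-word property of IBM model alignments.
    """
    n = len(links)
    if n == 0:
        raise ValueError("sentence pair must occur in at least one bag")
    counts = Counter(links)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def sentence_entropy(links_per_word):
    """Mean word-level entropy over a sentence (assumed aggregation)."""
    return sum(alignment_entropy(l) for l in links_per_word) / len(links_per_word)
```

A word that is linked to the same target position in every bag has zero entropy. A word whose link is split evenly between two positions has entropy log 2.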

14 Alignment error analysis
IBM-4 alignments using GIZA++ [Al-Onaizan99]
  English/Arabic: 129,126 pairs (ca. 1.5M words)
  100 training contexts (1 original, 99 resampled)
Bidirectional sentence-level alignment entropy
  Bin into (H)ighest, (L)ow, and (Z)ero entropy sets
  Select ca. 250 sentence pairs from each set
Length-normalized Viterbi alignment score
  Pool sentence pair sets selected above
  Re-rank by normalized Viterbi alignment score
  Pick ca. 250 pairs with worst scores (A)
Gold-standard manual alignments for each set
  Precision, recall, AER, balanced F-measure

15 Alignment error analysis
Measure       Z-set   L-set   H-set   A-set
Align. Ent.   (values not preserved in the transcript)
Precision     94.3%   82.7%   55.0%   61.8%
Recall        82.3%   73.0%   54.1%   61.0%
AER           12.0%   22.2%   45.4%   38.6%
Balanced F    87.9%   77.6%   54.5%   61.4%
Table 1: Alignment entropy vs. alignment quality

16 Notes on alignment error analysis
Results support our hypothesis
  Higher alignment entropy indicates poorer quality
Superior to normalized Viterbi alignment score
  AER(H-set) > AER(A-set) by 6.8% absolute

18 Bitext translation quality
Human translations often contain errors
  Non-native speakers of one language
  Some constructs difficult to translate (e.g. idioms)
  Oversight, inadequate quality control
Predicting problems in human translations
  Semantic errors, missing chunks, etc.
  Non-literality (paraphrasing)
Use alignment entropy to identify problems?
  Is it correlated with translation quality?

19 Measuring bitext translation quality
TER/HTER analysis of existing translations [Snover06]
  Against carefully prepared gold-standard translations
Translation Edit Rate (TER)
  Number of insertions, deletions, substitutions, and shifts, normalized by reference length
  Lexically based, no notion of semantic equivalence
Human-targeted Translation Edit Rate (HTER)
  Human expert produces targeted references
  Minimally edit hypotheses for semantic equivalence to untargeted gold-standard references
  HTER = TER evaluated against targeted references
  Minimizes impact of lexical choice
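
The edit-rate idea can be illustrated with a simplified sketch that counts only insertions, deletions, and substitutions at the word level. It deliberately omits the shift operation, so it is an upper bound on true TER and not the metric of [Snover06] itself. The function names are hypothetical.

```python
def word_edit_distance(hyp, ref):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, 1):
        cur = [i]
        for j, r in enumerate(ref, 1):
            cur.append(min(prev[j] + 1,               # delete h
                           cur[j - 1] + 1,            # insert r
                           prev[j - 1] + (h != r)))   # substitute or match
        prev = cur
    return prev[-1]


def ter_no_shifts(hypothesis, reference):
    """Edits per reference word, ignoring shifts (simplified TER)."""
    hyp, ref = hypothesis.split(), reference.split()
    if not ref:
        raise ValueError("reference must not be empty")
    return word_edit_distance(hyp, ref) / len(ref)
```

Under this simplification, an HTER-style score is the same function applied with the human-targeted reference in place of the untargeted one.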

21 Translation quality analysis
Existing translations are the “hypotheses”
New gold-standard references for H-, L-, and Z-sets
  Minimal paraphrasing, as literal as possible
  Thoroughly checked for quality
Evaluate TER between “hypotheses” and gold-standard
  Measure of translation literality
HTER evaluation
  Targeted references from hypotheses and gold-standard
  HTER = TER of hypotheses w.r.t. targeted references
  Measure of semantic translation correctness

22 Translation quality analysis
Eval set   AER     TER     HTER
Z-set      12.0%   23.7%   1.7%
L-set      22.2%   48.3%   0.9%
H-set      45.4%   63.0%   8.6%
Table 2: Alignment entropy vs. translation quality

23 Notes on translation quality analysis
Predicting translation literality
  Higher alignment entropy corresponds to higher TER
  Indicative of paraphrasing
Semantic correctness of translation pairs
  Excellent equivalence in zero/low-entropy pairs
  Significant errors in highest alignment entropy pairs

25 Conclusion
Excellent predictor of alignment quality
  Fine-grained, extensible, word-level measure
  Superior to normalized Viterbi alignment score
Serves as measure of translation literality
Identifies translation pairs with gross errors
  Useful tool for validating human translations

26 Future directions
Bootstrapped phrase confidence for SMT [ongoing]
  Consistency of phrase pairs across resampled bitexts
  Integrated as a phrase-level feature (tuned with MERT)
  Modest BLEU improvements ( point)
Online human translation validation [planned]
  Identify potential translation errors on the fly
  Assist human translators for rapid SMT development
Enriched machine translation [planned]
  Project features across high-confidence alignments
  Availability of a fine-grained measure is key

27 References
[Brown93] Peter F. Brown, Vincent J. Della Pietra, Stephen A. Della Pietra, and Robert L. Mercer. 1993. The Mathematics of Statistical Machine Translation: Parameter Estimation. Computational Linguistics, 19.
[Vogel96] Stephan Vogel, Hermann Ney, and Christoph Tillmann. 1996. HMM-based Word Alignment in Statistical Translation. In Proceedings of the 16th Conference on Computational Linguistics, Morristown, NJ.
[Fraser07] Alexander Fraser and Daniel Marcu. 2007. Measuring Word Alignment Quality for Statistical Machine Translation. Computational Linguistics, 33(3).
[Al-Onaizan99] Yaser Al-Onaizan, Jan Curin, Michael Jahr, Kevin Knight, John Lafferty, Dan Melamed, Franz Josef Och, David Purdy, Noah A. Smith, and David Yarowsky. 1999. Statistical Machine Translation: Final Report. Technical Report, JHU Summer Workshop.
[Snover06] Matthew Snover, Bonnie Dorr, Richard Schwartz, Linnea Micciulla, and John Makhoul. 2006. A Study of Translation Edit Rate with Targeted Human Annotation. In Proceedings of AMTA.

28 Thank you!

29 Supervised alignment quality
Annotate “sure” (S) and “possible” (P) links
Evaluate against hypothesis alignment A
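
The formulas on this slide did not survive extraction. The sketch below uses the widely used definitions of precision, recall, and alignment error rate over sure and possible links, with balanced F as the harmonic mean of precision and recall. It is an assumption that the talk uses exactly these forms. Links are represented as (source index, target index) tuples, and `alignment_scores` is a hypothetical name.

```python
def alignment_scores(hypothesis, sure, possible):
    """Score hypothesis links A against sure (S) and possible (P) links.

    precision = |A & P| / |A|
    recall    = |A & S| / |S|
    AER       = 1 - (|A & S| + |A & P|) / (|A| + |S|)
    """
    a, s = set(hypothesis), set(sure)
    p = set(possible) | s  # every sure link also counts as possible
    if not a or not s:
        raise ValueError("hypothesis and sure link sets must be non-empty")
    precision = len(a & p) / len(a)
    recall = len(a & s) / len(s)
    aer = 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall,
            "aer": aer, "f": f_measure}
```

When every gold link is annotated as sure (S = P), AER reduces to one minus the balanced F-measure.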