“Applying Morphology Generation Models to Machine Translation” by Kristina Toutanova, Hisami Suzuki, Achim Ruopp (Microsoft Research). UW Machine Translation Reading Group.

“Applying Morphology Generation Models to Machine Translation” by Kristina Toutanova, Hisami Suzuki, Achim Ruopp (Microsoft Research). UW Machine Translation Reading Group, 19 May 2008

Meta-Motivation Machine Translation is a collection of subproblems: alignment (corpus, sentence, word/phrase), reordering, phrase extraction, language modeling, transliteration, capitalization, etc. It’s hard to work on just one sub-problem in Machine Translation and have those gains translate (har!) to overall system performance. A side goal is therefore to work on independent, portable modules in the MT system.

Motivation Many languages use morphological inflection to express agreement, gender, case, etc. English… not so much. Inflection shows up in the surface form of a word: prefix + stem + suffix (more or less; let’s not talk about infixes and circumfixes). The standard difficulty: data sparseness (you see fewer occurrences of each token).
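A toy sketch of the sparseness point: the same stem surfaces in many inflected forms, so each surface form is rarer than the stem that underlies it. The word forms and the naive stem split below are purely illustrative.

```python
from collections import Counter

# Hypothetical toy "corpus": inflected forms of the Russian stem книг- ("book").
corpus = ["книга", "книги", "книгу", "книги", "книге", "книга"]

def strip_suffix(form, stem="книг"):
    # Naive stem+suffix split for this one toy stem; real morphology is messier.
    return stem if form.startswith(stem) else form

form_counts = Counter(corpus)
stem_counts = Counter(strip_suffix(w) for w in corpus)

print(max(form_counts.values()))  # 2 -- each surface form is seen only a couple of times
print(stem_counts["книг"])        # 6 -- the stem pools all the evidence
```

Counting stems instead of surface forms is exactly the kind of pooling that motivates stemming the target side and predicting inflection separately.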

Morphology in MT It’s problematic when morphological information in one half of a language pair is not present in the other half. Depending on the translation direction, you either have “extra” information that you need to learn to ignore (easy), or you need to generate this extra information somehow (hard).

Too much morphology Easy hack: PRE-PROCESS! Strip out gender, split compounds, segment clitics; use as much Perl as it takes.
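A minimal pre-processing sketch in that “as much Perl as it takes” spirit: peel a hypothetical conjunction proclitic off Arabic-like tokens and split a German-style compound at a known boundary. Both the clitic inventory and the compound table are made-up stand-ins, not a real segmenter.

```python
# Hypothetical inventories for illustration only.
PROCLITICS = ["w"]                                   # stand-in for Arabic "and"
COMPOUND_SPLITS = {"datenbank": ("daten", "bank")}   # stand-in compound entry

def preprocess(token):
    # Compound splitting first, then clitic segmentation.
    if token in COMPOUND_SPLITS:
        return list(COMPOUND_SPLITS[token])
    for cl in PROCLITICS:
        # Require a non-trivial remainder so short words survive intact.
        if token.startswith(cl) and len(token) > len(cl) + 1:
            return [cl + "+", token[len(cl):]]
    return [token]

print(preprocess("wktab"))      # ['w+', 'ktab']
print(preprocess("datenbank"))  # ['daten', 'bank']
print(preprocess("we"))         # ['we']
```

Real systems use morphological analyzers rather than lookup tables, but the shape of the hack is the same: rewrite the morphologically rich side into smaller, more frequent units before training.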

Not enough morphology Current approaches mostly use a rich language model on the target side. Downside: that just rescores MT system output, without actually affecting the options considered. Factored translation: fold the morphological generation into the translation model and do it all during decoding. Downside: computationally expensive, so the search space has to be pruned heavily; perhaps too heavily?

Was gibt es Neues? (What’s new?) The approach in this paper is to treat morphological inflection as a standalone (post)process: first, decode the input; then, for the sequence of word stems in the output, generate the most likely sequence of inflections given the original input. Experiments: English → Russian (1.6M sentence pairs), English → Arabic (460k sentence pairs).

Inflection prediction The lexicon determines three operations: Stemming: produce the set of possible stems S_w = {s^1, …, s^x} for a word w. Inflection: produce the set of surface word forms I_w = {i^1, …, i^y} for the set of stems S_w. Morphological analysis: produce the set of morphological analyses A_w = {a^1, …, a^z} for a word w; each a is a vector of categorical values (POS, gender, etc.).
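The three operations can be sketched over a toy lexicon. The lexicon entries below (English, for readability) and the feature names are hypothetical; the paper’s actual lexicons come from the Buckwalter analyzer and a Russian lexicon.

```python
# Toy lexicon: surface form -> list of (stem, analysis) pairs.
# Each analysis is a vector (tuple) of categorical feature values.
LEXICON = {
    "books": [("book", ("NOUN", "plural")), ("book", ("VERB", "3sg"))],
    "book":  [("book", ("NOUN", "singular")), ("book", ("VERB", "base"))],
}

def stems(word):
    # Stemming: S_w, the set of possible stems for w.
    return {stem for stem, _ in LEXICON.get(word, [])}

def inflections(word):
    # Inflection: I_w, all surface forms sharing a stem with w.
    target = stems(word)
    return {form for form, entries in LEXICON.items()
            if any(stem in target for stem, _ in entries)}

def analyses(word):
    # Morphological analysis: A_w, the set of feature vectors for w.
    return {a for _, a in LEXICON.get(word, [])}

print(stems("books"))        # {'book'}
print(sorted(inflections("books")))  # ['book', 'books']
```

Note that I_w is what the inflection model chooses among at prediction time: given a stem in the MT output, the candidates are exactly its possible surface forms.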

Morphological analysis Morphological features: 7 for Russian (including capitalization!), 12 for Arabic. Each word can be factored into a stem plus a subset of the morphological features. On average: 14 inflections per stem in Russian, 24 per stem in Arabic (!).

How do you get them? Arabic: the Buckwalter analyzer. Russian: an off-the-shelf lexicon. Neither is exact, and neither is domain-specific; there could be errors here. (Curse of MT: error propagation.)

Models The probability of an inflection sequence is the product of local probabilities, one per word, each conditioned on a context window of prior predictions: 5-gram for Russian, 3-gram for Arabic. Unlike the morphological analyzer, which is just word-based, the inflection model can use arbitrary features/dependencies (such as projected treelet syntactic information).
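In symbols (my reconstruction of the slide’s description, with Φ_t as a stand-in for whatever context features the model consults at position t):

```latex
p(\mathbf{y} \mid \mathbf{x}) \;=\; \prod_{t=1}^{n} p\bigl(y_t \mid y_{t-1}, \ldots, y_{t-k+1}, \Phi_t(\mathbf{x})\bigr)
```

Here y_t is the inflection chosen for the t-th stem, x is the stemmed output sentence (plus any projected syntactic information), and k = 5 for Russian, k = 3 for Arabic.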

Inflection Prediction Features Binary features pair up the context (x, y_{t-1}, y_{t-2}, …) with the target label y_t. Features can be anything!
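A sketch of what such binary indicator features look like in code. The feature templates and names below are hypothetical, but each one follows the slide’s recipe: some piece of context paired with the candidate label y_t.

```python
def extract_features(stem, prev_labels, label):
    # Binary indicator features: (context, candidate label) pairs.
    prev1 = prev_labels[-1] if prev_labels else "<s>"
    return {
        f"stem={stem}&y={label}": 1,          # lexical context + label
        f"prev={prev1}&y={label}": 1,         # previous prediction + label
        f"suffix2={stem[-2:]}&y={label}": 1,  # "features can be anything!"
    }

feats = extract_features("книг", ["Nom.Sg"], "Acc.Sg")
print(sorted(feats))
```

A log-linear model then scores each candidate label as a weighted sum over the features that fire, which is what lets the model mix word-level, n-gram, and syntactic evidence freely.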

Baseline experiments Stemmed the reference translations, then tried to predict the inflections. Done on 5k Russian sentences, 1k Arabic (why?). Very good accuracy (91%+). Better than a trigram LM (but how about a 5-gram for Russian?).

MT systems used 1. The Microsoft treelet translation system. 2. A Pharaoh reimplementation. Both trained on the MS corpus of technical manuals.

Experiments: Translations are selected by the translation model, language model, and inflection model as follows: for each hypothesis in the n-best MT system output, select the best inflection; then, for each input sentence, select the best inflected hypothesis.
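The two-step selection can be sketched as below. The scoring functions, toy hypotheses, and weights are hypothetical stand-ins; the real system combines translation, language, and inflection model scores with tuned interpolation weights.

```python
def rerank(nbest, inflect_options, inflection_score,
           base_weight=1.0, infl_weight=1.0):
    best = None
    for hyp_stems, base_score in nbest:
        # Step 1: best inflection for this hypothesis.
        inflected = max(inflect_options(hyp_stems), key=inflection_score)
        total = base_weight * base_score + infl_weight * inflection_score(inflected)
        # Step 2: best inflected hypothesis over the whole n-best list.
        if best is None or total > best[1]:
            best = (inflected, total)
    return best[0]

# Toy example: two stemmed hypotheses, each with two candidate inflections.
nbest = [("stem seq A", 1.0), ("stem seq B", 0.9)]
options = lambda h: [h + "+infl1", h + "+infl2"]
score = lambda s: 0.5 if s.endswith("infl1") else 0.2
print(rerank(nbest, options, score))  # stem seq A+infl1
```

Because inflection is chosen per hypothesis before hypotheses compete, a lower-ranked stem sequence can still win if it inflects much more fluently.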

Nbest lists Only tested up to n = 100; optimized n and the interpolation weights via grid search. Optimal size of the n-best list: Russian: 32; Arabic: 2 (!!).
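A grid search of this kind is simple to sketch. The `dev_score` function below is a hypothetical stand-in for evaluating BLEU on a dev set at a given n-best size and interpolation weight; the grids are illustrative.

```python
def grid_search(dev_score, sizes=(1, 2, 4, 8, 16, 32, 64, 100),
                weights=(0.0, 0.25, 0.5, 0.75, 1.0)):
    # Jointly pick the (n, weight) pair that maximizes the dev-set metric.
    return max(((n, w) for n in sizes for w in weights),
               key=lambda nw: dev_score(*nw))

# Toy score surface peaking at n=32, weight=0.5 (illustrative only).
score = lambda n, w: -abs(n - 32) - 10 * abs(w - 0.5)
print(grid_search(score))  # (32, 0.5)
```

That Arabic peaks at n = 2 suggests the inflection model helps little beyond fixing the top hypothesis there, whereas Russian benefits from a much deeper list.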

Experiments 1. Train a regular MT system; stem the output and re-inflect it. 2. Train the MT system, but stem the target language after alignment; the system output is now just word stems, so inflect them. 3. Stem the parallel corpus first, then train the MT system on stems.