Slide 1: An Integrated Phrase Segmentation/Alignment Algorithm for Statistical Machine Translation
LTI Student Research Symposium, 9/12/2003
Joy
Advisors: Stephan Vogel and Alex Waibel

Slide 2: Outline
Background
Phrase Alignment Algorithms in SMT
Segmentation Approaches
Integrated Segmentation and Alignment Algorithm (ISA)
Experiments
Discussions

Slide 3: Statistical Machine Translation
Statistical Machine Translation (Brown et al., 93)
  – Noisy Channel Model
    » Translating from F to E
    » Given a testing sentence f, generate the translation e*, which is
      $$e^* = \arg\max_e \Pr(e \mid f) = \arg\max_e \Pr(e)\,\Pr(f \mid e)$$
    » Pr(e): Language Model (LM)
    » Pr(f|e): Translation Model (TM)
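The noisy-channel decision rule can be illustrated with a minimal Python sketch. The candidate list, the probability values, and the function name `noisy_channel_best` are made up for illustration; a real decoder searches a much larger hypothesis space rather than a fixed candidate list.

```python
import math

def noisy_channel_best(f, candidates, lm_logprob, tm_logprob):
    """Return the candidate e that maximizes log Pr(e) + log Pr(f|e)."""
    return max(candidates, key=lambda e: lm_logprob(e) + tm_logprob(f, e))

# Toy model scores (hypothetical values, not from the talk).
lm = {"the house": math.log(0.6), "house the": math.log(0.1)}
tm = {("das Haus", "the house"): math.log(0.5),
      ("das Haus", "house the"): math.log(0.5)}

best = noisy_channel_best(
    "das Haus",
    list(lm),
    lambda e: lm[e],           # language model: log Pr(e)
    lambda f, e: tm[(f, e)],   # translation model: log Pr(f|e)
)
print(best)
```

Here the translation model cannot distinguish the two word orders, so the language model decides between them.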

Slide 4: Training
  – Training
    Using large English corpora (e.g. Wall Street Journal) to train the LM
    Using bilingual corpora (e.g. Canadian Hansard) to train the TM
  – To get the building blocks for Pr(f|e)
    » Word-to-word translations or phrase-to-phrase translations
    » Reordering information
    » Other features

Slide 5: Alignment
Alignment for one sentence pair (e, f):
  – Suppose e has l words, $e = e_1 e_2 \ldots e_l$, and f has m words, $f = f_1 f_2 \ldots f_m$
  – Then an alignment a can be represented as $a = a_1 a_2 \ldots a_m$, a sequence of m values, each between 0 and l. $a_j = i$ means $f_j$ is "aligned" to $e_i$, where $e_0$ stands for the NULL word
  – In short: the alignment tells us which word in e is translated into which word in f
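As a small illustration of this representation, the sketch below stores an alignment as a list of m integers and reads off the aligned word pairs. The sentence pair and the helper name `aligned_pairs` are hypothetical examples, not taken from the talk.

```python
# e[0] is the NULL word, so e_1..e_l sit at indices 1..l.
e = ["NULL", "the", "house", "is", "small"]
f = ["das", "Haus", "ist", "klein"]

# a[j-1] = i means f_j is aligned to e_i (Python lists are 0-based).
a = [1, 2, 3, 4]

def aligned_pairs(e, f, a):
    """Pair every word of f with the word of e it is aligned to."""
    assert len(a) == len(f), "one alignment value per word of f"
    assert all(0 <= i < len(e) for i in a), "each value must lie in 0..l"
    return [(f_word, e[i]) for f_word, i in zip(f, a)]

pairs = aligned_pairs(e, f, a)
print(pairs)
```

Because each f word carries exactly one value, this representation can link several f words to one e word but never one f word to several e words.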

Slide 6: Alignment Example [figure]

Slide 7: Alignment Models
Alignment algorithms:
  – IBM models 1 to 5 (Brown et al.)
  – HMM model, similar to IBM2 (Vogel)
  – Competitive linking (Melamed)
  – Flow network (Gaussier)
  – Others

Slide 8: IBM Model 1
IBM model 1
  – Easy to train
  – Simple to understand
  – Used very often in MT research
  – One serious problem for IBM models: the word-to-word alignment assumption
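To make "easy to train" concrete, here is a minimal sketch of the standard EM procedure for IBM Model 1 lexical probabilities. It is a textbook-style illustration rather than the code used in the talk; the function name `train_ibm1` and the toy corpus are assumptions.

```python
from collections import defaultdict

def train_ibm1(pairs, iterations=10):
    """EM for IBM Model 1. pairs: list of (e_words, f_words).
    Returns t with t[(f, e)] approximating Pr(f | e)."""
    f_vocab = {f for _, fs in pairs for f in fs}
    t = defaultdict(lambda: 1.0 / len(f_vocab))  # uniform initialization
    for _ in range(iterations):
        count = defaultdict(float)   # expected counts c(f, e)
        total = defaultdict(float)   # expected counts c(e)
        for es, fs in pairs:
            es = ["NULL"] + es       # every f word may also align to NULL
            for f in fs:
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    c = t[(f, e)] / z    # E-step: posterior that f aligns to e
                    count[(f, e)] += c
                    total[e] += c
        # M-step: renormalize the expected counts for each e
        t = defaultdict(float, {(f, e): c / total[e] for (f, e), c in count.items()})
    return t

toy_pairs = [
    (["the", "house"], ["das", "Haus"]),
    (["the", "book"], ["das", "Buch"]),
    (["a", "book"], ["ein", "Buch"]),
]
t = train_ibm1(toy_pairs)
print(round(t[("das", "the")], 3), round(t[("Haus", "the")], 3))
```

Every expected count links exactly one f word to one e word, which is the word-to-word assumption the slide identifies as the serious problem.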

Slide 9: Phrase-to-phrase Alignment
Phrase-to-phrase alignment is better
  – Mismatch between languages
  – Phrases encapsulate the context of words
  – Phrases encapsulate local reordering

Slide 11: Alignment Algorithms
Based on initial word alignment
  – Train word alignment
  – Read off phrase-to-phrase alignments from the Viterbi path
  – Examples:
    » HMM phrase alignment (Vogel)
    » Alignment templates from IBM 4 (Och)
    » Bilingual bracketing (Wu, B. Zhao)
Popular in SMT research

Slide 13: Segmentation Approaches
Identify monolingual phrases and segment/bracket phrases into one unit (super-word) (Zhang 2000)
Train the regular word-to-word alignment

Slide 14: Problems in Segmentation Approaches
Segmentation uses only monolingual information
Good segmentations may make alignment even harder

Slide 16: Integrated Segmentation and Alignment
Let's look at an example first

Slide 17: Integrated Segmentation and Alignment
Represent a sentence pair (e, f) as a matrix D
$D(i,j) = I'(e_i, f_j)$, where I' is a modified point-wise mutual information
A partition over D is a series of non-overlapping rectangular regions $d_1, d_2, \ldots, d_m$
Region $d_k(r_s, r_e, c_s, c_e)$ indicates that $e_{r_s} \ldots e_{r_e}$ are aligned to $f_{c_s} \ldots f_{c_e}$
Segmentation and alignment are achieved at the same time
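The matrix construction can be sketched as follows. The slides do not define the modification behind I', so this sketch substitutes plain point-wise mutual information estimated from sentence-level co-occurrence counts; that substitution, the function names, and the toy corpus are all assumptions.

```python
import math
from collections import Counter

def pmi_table(pairs):
    """Plain PMI from sentence-level co-occurrence (a stand-in for I')."""
    n = len(pairs)
    e_cnt, f_cnt, ef_cnt = Counter(), Counter(), Counter()
    for es, fs in pairs:
        for e in set(es):
            e_cnt[e] += 1
        for f in set(fs):
            f_cnt[f] += 1
        for e in set(es):
            for f in set(fs):
                ef_cnt[(e, f)] += 1
    # PMI(e, f) = log( P(e, f) / (P(e) P(f)) ) with relative-frequency estimates
    return {(e, f): math.log(c * n / (e_cnt[e] * f_cnt[f]))
            for (e, f), c in ef_cnt.items()}

def build_matrix(es, fs, pmi):
    """D[i][j] scores e_i against f_j: rows index e, columns index f."""
    return [[pmi.get((e, f), float("-inf")) for f in fs] for e in es]

toy_pairs = [
    (["the", "house"], ["das", "Haus"]),
    (["the", "book"], ["das", "Buch"]),
    (["a", "book"], ["ein", "Buch"]),
]
D = build_matrix(["the", "house"], ["das", "Haus"], pmi_table(toy_pairs))
for row in D:
    print([round(v, 3) for v in row])
```

A rectangular region of D then corresponds to a contiguous span of e words paired with a contiguous span of f words.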

Slide 18: Integrated Segmentation and Alignment
Best partition should yield maximum [formula not preserved]
Computationally intractable to search all possible partitions
  – Exponential in sentence length
  – DP: not a good idea
    "An optimal policy has the property that whatever the initial state and initial decision are, the remaining decisions must constitute an optimal policy with regard to the state resulting from the first decision." (Richard Bellman, Principle of Optimality)
    But here, the decision of how to expand the first cell changes the search space for the rest of the cells
Use a computationally cheap algorithm to find "good" partitions

Slide 19: An Example [figure]

Slide 20: Computationally Cheap Algorithm
Assumption:
  – If the translation for $e_1 e_2$ is f, then $I'(e_1, f)$ should be very "similar" to $I'(e_2, f)$
  – Example: [not preserved]
Algorithm
  – Step 1: find the cell in D with the maximum value of I'
  – Step 2: expand this cell to a rectangular region in which all cells have an I' similar to this cell's
  – Repeat Step 1 and Step 2 until no more regions can be found
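The two steps can be sketched as a greedy loop. The slides do not say how "similar" is measured or what happens to a region's rows and columns once it is fixed, so this sketch assumes (a) similarity means lying within an absolute tolerance `tol` of the seed cell, and (b) each word is aligned at most once, so a finished region blocks its rows and columns. The function name `isa_partition` and the toy matrix are hypothetical.

```python
def isa_partition(D, tol=0.2):
    """Greedy partition of D into regions (r_s, r_e, c_s, c_e), inclusive bounds."""
    n_rows, n_cols = len(D), len(D[0])
    row_free = [True] * n_rows
    col_free = [True] * n_cols
    regions = []

    def similar(r, c, seed):
        return abs(D[r][c] - seed) <= tol

    while any(row_free) and any(col_free):
        # Step 1: seed at the best cell whose row and column are both free.
        seed, r0, c0 = max((D[r][c], r, c)
                           for r in range(n_rows) if row_free[r]
                           for c in range(n_cols) if col_free[c])
        rs, re, cs, ce = r0, r0, c0, c0
        # Step 2: grow by one row or column at a time while the whole
        # new strip is free and similar to the seed.
        grew = True
        while grew:
            grew = False
            if rs > 0 and row_free[rs - 1] and all(
                    similar(rs - 1, c, seed) for c in range(cs, ce + 1)):
                rs, grew = rs - 1, True
            if re < n_rows - 1 and row_free[re + 1] and all(
                    similar(re + 1, c, seed) for c in range(cs, ce + 1)):
                re, grew = re + 1, True
            if cs > 0 and col_free[cs - 1] and all(
                    similar(r, cs - 1, seed) for r in range(rs, re + 1)):
                cs, grew = cs - 1, True
            if ce < n_cols - 1 and col_free[ce + 1] and all(
                    similar(r, ce + 1, seed) for r in range(rs, re + 1)):
                ce, grew = ce + 1, True
        for r in range(rs, re + 1):
            row_free[r] = False
        for c in range(cs, ce + 1):
            col_free[c] = False
        regions.append((rs, re, cs, ce))
    return regions

# Rows are e words, columns are f words (made-up scores).
D = [
    [2.0, 0.1, 0.0],
    [0.2, 1.5, 1.4],
    [0.1, 0.0, 0.3],
]
regions = isa_partition(D)
print(regions)
```

On this matrix the first seed stays a single cell, while the second seed grows one column to the right, so the second e word is aligned to a two-word f phrase. Blocking rows and columns after each region is also why the first expansion changes the search space for all later ones.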

Slide 21: Example: Apply the Algorithm [figure]

Slide 22: Estimate the probabilities for phrase translations
The decoder needs the conditional probabilities P(f|e)
These cannot be estimated directly because of data sparseness
Convert I'(f,e) to P(f|e)
  – IBM model 1 style: [formula not preserved]
  – Context-dependent style: [formulas not preserved]

Slide 24: Experiments
Chinese-English small data track
Evaluation: NIST score against 4 human references

            Sentences     Chinese Words   English Words
Training    3540 pairs    90 K            115 K
Testing     993           26 K

Slide 25: Results
Baseline: IBM model 1 + HMM phrase
Compared to using ISA only, and ISA+Baseline
Table columns: Prec, Length Penalty, Final Score. Rows: Baseline, ISA, ISA+Baseline. [values not preserved]

Slide 26: T-test
Student's t-test at the sentence level
Table columns: t-value and confidence level (%) for Precision Scores, t-value and confidence level (%) for Final Scores. Rows: ISA vs. Baseline, ISA+Baseline vs. Baseline. [values not preserved]
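A sentence-level t-test of this kind could be computed as in the sketch below. The slides do not state whether the test was paired, so the paired form (both systems scored on the same test sentences) is an assumption, and the scores are invented for illustration.

```python
import math
from statistics import mean, stdev

def paired_t(scores_a, scores_b):
    """t statistic for the mean per-sentence score difference (paired test)."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    # Mean difference divided by its standard error; stdev is the sample
    # standard deviation (n - 1 in the denominator).
    return mean(diffs) / (stdev(diffs) / math.sqrt(n))

# Hypothetical per-sentence scores for two systems on the same four sentences.
system_a = [0.5, 0.6, 0.7, 0.9]
system_b = [0.4, 0.4, 0.6, 0.7]
t_value = paired_t(system_a, system_b)
print(round(t_value, 3))
```

The t-value is then compared against a Student's t distribution with n - 1 degrees of freedom to obtain the confidence level.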

Slide 27: Compared to IBM1
Large data track (2.6M English words, 414K Chinese words)
Using 20M words LM
LDC+IBM: NIST and LenPenalty [values not preserved]
LDC+IBM+ISA: NIST and LenPenalty [values not preserved]
Table columns: #Type and %Contrib. for each system, plus Incr. Rows: 1-gram to 5-gram, Sum. [values not preserved]

Slide 28: No IBM1 is Better
Small data track (LDC+IBM1+ISA)
ISA is better than IBM1 even on unigram match
Table columns: W IBM1, NIST, 1-gram score, 2-gram score, 3-gram score, including a row for "(no IBM1)". [values not preserved]

Slide 29: Summary
Integrated Alignment and Segmentation
Simple algorithm
Enhanced translation quality
  – Better than IBM models
  – Higher quality than HMM alignment
A major component in the CMU SMT system

Slide 30: ISA Toolkit
Location:
  – /afs/cs.cmu.edu/user/joy/Release/PhraseAlign
Documentation:
  – /afs/cs.cmu.edu/user/joy/Release/PhraseAlign/documentation/readme.txt
Speed
  – Example: 4172 sentence pairs (133K English words, 20K Chinese words)
  – About 160 seconds for the alignment (10 loops for each sentence pair)

Slide 31: Selected References
Franz Josef Och, Christoph Tillmann, Hermann Ney, "Improved Alignment Models for Statistical Machine Translation," Proceedings of the Joint Conference on Empirical Methods in Natural Language Processing and Very Large Corpora, University of Maryland, College Park, MD, June 1999.
Stephan Vogel, Hermann Ney, and Christoph Tillmann, "HMM-based Word Alignment in Statistical Translation," Proceedings of COLING '96: The 16th International Conference on Computational Linguistics, Copenhagen, August 1996.
Stephan Vogel, Ying Zhang, Fei Huang, Alicia Tribble, Ashish Venugopal, Bing Zhao, Alex Waibel, "The CMU Statistical Translation System," to appear in the Proceedings of MT Summit IX, New Orleans, LA, U.S.A., September 2003.
Ying Zhang, Ralf D. Brown, Robert E. Frederking and Alon Lavie, "Pre-processing of Bilingual Corpora for Mandarin-English EBMT," Proceedings of MT Summit VIII, Santiago de Compostela, Spain, September 2001.
Ying Zhang, Stephan Vogel, Alex Waibel, "Integrated Phrase Segmentation and Alignment Algorithm for Statistical Machine Translation," in the Proceedings of the International Conference on Natural Language Processing and Knowledge Engineering (NLP-KE'03), Beijing, China, October 2003.