The MITLL/AFRL IWSLT-2007 MT System
Wade Shen, Brian Delaney, Tim Anderson and Ray Slyh
MIT Lincoln Laboratory
27 November 2006

This work is sponsored by the United States Air Force Research Laboratory under Air Force Contract FA C. Opinions, interpretations, conclusions and recommendations are those of the authors and are not necessarily endorsed by the United States Government.

Statistical Translation System: Experimental Architecture

Standard statistical architecture. New this year:
- Light morphology for Arabic
- Better speech decoders
  – Lattice-based decoder
  – Better confusion-network decoding with Moses
- Rescoring features

Participated in:
- Chinese → English
- Arabic → English
- Italian → English

[Architecture diagram: training bitext → GIZA++/CLA word alignment → alignment expansion → phrase extraction; minimum error rate training on the dev set; the test set is decoded and rescored to produce the translated output.]

The MITLL/AFRL MT System: Overview

- Light Arabic Morphology
- Improved Confusion Network Decoding
- Direct Lattice Decoding
  – Reordering constraints
  – Higher-order LMs
- Experiments
  – Lattice vs. confusion network decoding
  – Arabic preprocessing
- Summary

Light Arabic Morphology: The AP5 Process

AP5 applies five deterministic steps (a code sketch follows this list):
1. Hamza normalization
2. Tanween normalization
3. Proclitic/conjunction/article splitting
4. Stemming and normalization of enclitics
5. Normalization of tense markers (sa)
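
The following is a minimal, hypothetical sketch of AP5-style rules in Python; the slides do not give the actual rule inventory or orthographic details, so every pattern below is an assumption for illustration only.

    import re

    # Hypothetical AP5-style rule cascade; NOT the authors' implementation.
    AP5_RULES = [
        (r"[أإآ]", "ا"),                        # 1. hamza normalization: alif variants -> bare alif
        (r"[\u064B-\u064D]", ""),               # 2. tanween normalization: strip tanween diacritics
        (r"\b([وف])(ال\S+)", r"\1 \2"),         # 3a. split conjunction proclitics w-/f- before al-
        (r"\b(ال)(\S{3,})", r"\1 \2"),          # 3b. split the definite article al-
        (r"(\S{3,})(ني|ها|هم|كم)\b", r"\1 \2"), # 4. split common enclitic pronouns (approximate)
        # 5. The slide's normalization of the tense marker (sa) is not
        #    specified precisely enough to reproduce here.
    ]

    def ap5(sentence: str) -> str:
        """Apply the rule cascade left to right to a whitespace-tokenized sentence."""
        for pattern, repl in AP5_RULES:
            sentence = re.sub(pattern, repl, sentence)
        return sentence

    # Example: ap5("الحمام في مؤخرة الطائرة") -> "ال حمام في مؤخرة ال طائرة"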

Light Arabic Morphology: Effect

- A marker (post) is used to disambiguate suffixes from prefixes
- Regularized affix and suffix forms
- Reduced OOV rate: 12.3% → 7.26%

No Processing:
  ستحط الطائرة خلال ساعة تقريبا ("The plane will land in about an hour")
  سنقدم وجبة الغذاء بعد ثلاثين دقيقة من الإقلاع ("We will serve the meal thirty minutes after takeoff")
  الحمام في مؤخرة الطائرة ("The bathroom is at the rear of the plane")
  اتبعني رجاء ("Follow me, please")

AP5 Processing:
  ستحط ال طائرة خلال ساعة تقريبا
  سنقدم و جبة ال غذاء ب عد ثلاثين دقيقة من ال إقلاع
  ال حمام ف ي مؤخرة ال طائرة
  اتبع ني رجاء

Confusion Network Repunctuation

- No longer relies on the 1-best repunctuation alone
- Process (sketched below):
  – Convert the lattice to a confusion network
  – Insert punctuation between columns, using all possible n-gram contexts surrounding the current column
  – Sum the posteriors of the different contexts for each punctuation mark
- Significantly higher processing requirements: on the order of k^n contexts per insertion point, where n is the n-gram order of the punctuation model and k is the average column depth
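
A minimal sketch of this repunctuation step, assuming a confusion network stored as a list of columns of (word, posterior) pairs; `punct_lm` is a hypothetical n-gram punctuation scorer, and only a one-word context on each side is enumerated for brevity.

    from collections import defaultdict
    from itertools import product

    def repunctuate_gaps(columns, punct_lm, marks=(",", ".", "?")):
        """For each gap between columns, accumulate posterior-weighted
        scores for each punctuation mark (or none) over all contexts."""
        gap_scores = []
        for i in range(1, len(columns)):
            left, right = columns[i - 1], columns[i]
            scores = defaultdict(float)
            for (lw, lp), (rw, rp) in product(left, right):
                for mark in marks + ("",):      # "" means no punctuation
                    # punct_lm(lw, mark, rw) ~ P(mark | surrounding n-gram context)
                    scores[mark] += lp * rp * punct_lm(lw, mark, rw)
            gap_scores.append(dict(scores))
        return gap_scores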

Improved Confusion Network Decoding

- Use of component ASR scores
  – No longer relies on the ASR posterior with a fixed scaling factor
  – Exposes the source LM and acoustic model scores as separate features
- MER training with ASR scores
  – Handles the interaction of the source and target word penalties
  – Uses the ASR path posterior to optimize the source word penalty, i.e. to optimize the expected source length
- Results in improved performance in all languages (a score-combination sketch follows)
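
A hedged sketch of the resulting log-linear model, with the acoustic and source LM scores exposed as separate features; the feature names and values below are illustrative, not the system's actual feature set.

    def path_score(feats, weights):
        """Log-linear combination of feature scores under MER-trained weights."""
        return sum(weights[name] * value for name, value in feats.items())

    feats = {
        "acoustic":  -142.7,   # ASR acoustic model log-score (no longer folded into one posterior)
        "source_lm":  -38.2,   # source-language LM log-score (separate feature)
        "source_wp":    9.0,   # source word penalty: controls E(source length)
        "tm":         -25.1,   # translation model log-score
        "target_lm":  -31.4,   # target-language LM log-score
        "target_wp":    8.0,   # target word penalty
    }
    weights = {name: 1.0 for name in feats}   # jointly tuned by MER training in practice
    print(path_score(feats, weights))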

Direct ASR Lattice Decoding Using Finite-State Transducers

As an alternative to decoding on confusion networks, we perform direct decoding of ASR lattices using finite-state transducers. The target-language hypothesis is the best path through the composition

  I ∘ P ∘ D ∘ T ∘ L

where:
- I = weighted source-language input acceptor
- P = phrase segmentation transducer
- D = weighted phrase-swapping transducer
- T = weighted phrase translation transducer (source phrases to target words)
- L = weighted target language model acceptor
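
A minimal sketch of this cascade using pynini (OpenFst's Python wrapper) as a stand-in for the MIT FST toolkit that the system actually uses; the construction of the five transducers is assumed.

    import pynini

    def decode_lattice(I, P, D, T, L):
        """Return the best target-language path through I ∘ P ∘ D ∘ T ∘ L."""
        cascade = pynini.compose(I, P)        # segment the lattice into source phrases
        cascade = pynini.compose(cascade, D)  # apply weighted phrase swapping
        cascade = pynini.compose(cascade, T)  # translate source phrases to target words
        cascade = pynini.compose(cascade, L)  # score with the target language model
        return pynini.shortestpath(cascade).string()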

FST Decoder Implementation

- Based on the MIT FST toolkit
- The phrase-swapping transducer can be applied twice for long-distance reordering: inefficient but simple
- Pruning strategy (sketched below):
  – Apply a wide beam on full path scores after composition with T
  – Viterbi search with a narrow beam during the language model search
- OOV words are detected and added as parallel paths to the P, T, and L transducers; an OOV penalty discourages OOV words when multiple paths exist
- Minimum error rate training requires some extra work to recover the individual model parameters for weight optimization
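
A hedged illustration of the two-beam pruning idea; hypotheses are simply (score, complete) pairs here, and `expand` is a hypothetical successor function, so this shows only the beam logic, not the decoder's actual search.

    def beam_prune(hyps, beam):
        """Keep hypotheses whose score is within `beam` of the current best."""
        best = max(score for score, _ in hyps)
        return [(s, c) for s, c in hyps if s >= best - beam]

    def lm_search(initial, expand, wide_beam=10.0, narrow_beam=4.0):
        # Wide beam on full path scores after composition with T ...
        frontier = beam_prune(initial, wide_beam)
        finished = []
        # ... then Viterbi-style expansion with a narrow beam during the LM search.
        while frontier:
            successors = [h for hyp in frontier for h in expand(hyp)]
            finished += [h for h in successors if h[1]]
            pending = [h for h in successors if not h[1]]
            frontier = beam_prune(pending, narrow_beam) if pending else []
        return max(finished) if finished else None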

Model Parameters for ASR Lattice Decoding

[Table of model parameters not preserved in the transcript.]

Minimum Error Rate Training with the FST Decoder

[Flowchart:] ASR lattices (SLF) are converted to FSAs and repunctuated with a source LM with punctuation; after pruning, the FST decoder (built from the phrase table, vocabulary, and translation transducer) generates target-language N-best lists. Individual model parameters are retrieved, rescoring features are added, and the feature lists are merged for MERT, which produces new weights (lattice weights, source LM weight, translation model weights, distortion and OOV penalties). The loop repeats until the weights converge.
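
A compact sketch of this training loop; `fst_decode`, `extract_features`, `mert_optimize`, and `converged` are hypothetical helpers standing in for the boxes in the flowchart.

    def mert_loop(lattices, weights, max_iters=10):
        nbest_pool = []
        for _ in range(max_iters):
            for lat in lattices:
                nbest = fst_decode(lat, weights)           # prune, decode, N-best generation
                nbest_pool.extend(extract_features(nbest)) # retrieve model params, add rescoring features
            new_weights = mert_optimize(nbest_pool)        # minimize error (e.g. 1 - BLEU) over merged lists
            if converged(new_weights, weights):
                break
            weights = new_weights
        return weights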

ASR Lattice Decoding Results

Language (dev/test)      Condition                          BLEU
CE (dev4/dev5)           Fixed ASR lattice weights          16.98
CE (dev4/dev5)           + Optimized ASR lattice weights    17.23
CE (dev4/dev5)           + Rescoring features               18.27
AE (dev4/dev5)           Fixed ASR lattice weights          21.55
AE (dev4/dev5)           + Optimized ASR lattice weights    22.70
AE (dev4/dev5)           + Rescoring features               23.73
IE (dev4/dev5)           Fixed ASR lattice weights          29.45
IE (dev4/dev5)           + Optimized ASR lattice weights    29.42
IE (dev4/dev5)           + Rescoring features               30.15
IE (dev5bp1/dev5bp2)     Fixed ASR lattice weights          16.25
IE (dev5bp1/dev5bp2)     + Optimized ASR lattice weights    17.06
IE (dev5bp1/dev5bp2)     + Rescoring features               17.90

Confusion Network Results

[Chart: repunctuation, and posterior vs. separate AM and LM scores; numbers not preserved in the transcript.]

Confusion Network vs. Lattice Decoding

Language (dev/test)      Condition           BLEU
CE (dev4/dev5)           Confusion network   18.30
CE (dev4/dev5)           Lattice decoding    18.27
AE (dev4/dev5)           Confusion network   22.92
AE (dev4/dev5)           Lattice decoding    23.73
IE (dev5bp1/dev5bp2)     Confusion network   18.20
IE (dev5bp1/dev5bp2)     Lattice decoding    17.90

- All configurations use rescoring features
- Different distortion models: phrase-based vs. word-based
- Similar performance for CE and IE
- Arabic improves with lattice decoding
  – Possibly confusion network posterior issues due to ASR mismatch

Arabic Morphology Results

[Chart: improvement from each AP5 processing step, compared with ASVM and with no morphology; numbers not preserved in the transcript.]

Summary

- Significant improvement for Arabic with light morphology
  – Five deterministic rules
- Improved confusion network decoding with separate source LM and ASR scores
- Lattice-based decoding is comparable to confusion network decoding
  – With an improvement on the Arabic task