FROM TRANSLATION MACHINE THEORY TO MACHINE TRANSLATION THEORY – SOME INITIAL THOUGHTS
Oliver Čulo, Universität Mainz
MT AS TRANSLATION MACHINE THEORY
TOPICS OF (EARLY) SMT
- calculating translation models (Brown et al. 1993)
- sentence alignment (Gale & Church 1991)
- word alignment (Och & Ney 2003)
- …and a plethora of papers on how to improve these
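To make the first and third topics concrete, here is a minimal sketch of how lexical translation probabilities can be estimated with expectation-maximisation, in the spirit of IBM Model 1 from Brown et al. (1993). The function name `train_ibm_model1`, the toy corpus and the omission of the NULL word are assumptions made for illustration; this is not the authors' implementation.

```python
from collections import defaultdict


def train_ibm_model1(corpus, iterations=10):
    """Estimate t(target_word | source_word) with EM.

    corpus: list of (source_tokens, target_tokens) sentence pairs.
    The NULL source word of the full model is left out to keep the sketch short.
    """
    # Start with the same value for every co-occurring word pair; the first
    # E-step normalises per sentence, so this behaves like a uniform start.
    t = {}
    for src, tgt in corpus:
        for tw in tgt:
            for sw in src:
                t[(tw, sw)] = 1.0

    for _ in range(iterations):
        count = defaultdict(float)  # expected (target, source) link counts
        total = defaultdict(float)  # expected link counts per source word
        # E-step: distribute each target word over the source words of its sentence.
        for src, tgt in corpus:
            for tw in tgt:
                norm = sum(t[(tw, sw)] for sw in src)
                for sw in src:
                    frac = t[(tw, sw)] / norm
                    count[(tw, sw)] += frac
                    total[sw] += frac
        # M-step: re-estimate the probabilities from the expected counts.
        for (tw, sw), c in count.items():
            t[(tw, sw)] = c / total[sw]
    return t


# Hypothetical three-sentence toy corpus (German source, English target).
toy_corpus = [
    (["das", "haus"], ["the", "house"]),
    (["das", "buch"], ["the", "book"]),
    (["ein", "buch"], ["a", "book"]),
]
t = train_ibm_model1(toy_corpus, iterations=20)
# Most probable English word for "haus" among the words it co-occurs with.
best = max(["the", "house"], key=lambda w: t[(w, "haus")])
print(best)
```

A word alignment can then be read off by linking each target word to the source word with the highest probability in the same sentence pair; the later alignment models compared by Och & Ney (2003) add distortion and fertility on top of this.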
RECENT RESURGENCE OF LINGUISTICS
- MT and the phrase (Fox 2002, Koehn et al. 2003, Eisele 2006)
- MT and dependency (Ding & Palmer 2005, Quirk et al. 2005, Žabokrtský et al. 2008)
- hybrid architectures (Eisele et al. 2008)
- domain adaptation (Koehn & Schroeder 2007, Bertoldi & Federico 2009)
- factored models (Koehn & Hoang 2007)
- …
TRANSLATION-THEORETIC MODELLING OF MT
MT AND FUNCTIONAL TRANSLATION THEORY (1)
- Skopos theory (Reiss & Vermeer 1984)
- pragmalinguistic model (House 1997), function and loyalty (Nord 1997, 2006)
- functional equivalence vs. change in function
- documentary vs. instrumental
- overt vs. covert
MT AND FUNCTIONAL TRANSLATION THEORY (2)
- aimed at functional equivalence (but does a machine or a Google Translate (GT) user know?)
- aimed at instrumental translation (but in fact rather documentary; ethical dimensions?)
MT AND FUNCTIONAL TRANSLATION THEORY (3)
- MT lacks translation-functional considerations in system design (Schmidt in print)
- the conception of translation as human, purposeful action hinders the acceptance of MT (Rozmyslowicz in print)
KNOWLEDGE TRANSFER TS -> MT
CroCo
CroCo STRUCTURE: MULTILINGUAL
- Register-controlled corpus
- Translation corpus
- Alignment layers: word layer, chunk layer, clause layer, sentence layer
- + Metainformation, + PoS tagging, + Morphology, + Sense relations, + Phrase structure, + Grammatical functions
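FUNCTION SHIFTS (TYPOLOGICAL DIFFERENCES)
EN: Tray 1 holds up to 125 sheets
DE: In Fach 1 können bis zu 125 Blatt Papier eingelegt werden
[Figure: the sentence pair annotated with grammatical functions (PROBJ, SUBJ, FIN, PRED, DOBJ) and numbered markers 1 and 2; "up to 125 sheets" is labelled DOBJ]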
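A small sketch of how function shifts could be counted once aligned units carry grammatical-function labels. The pairing used here (English SUBJ "Tray 1" aligned with German PROBJ "In Fach 1", English DOBJ "up to 125 sheets" aligned with German SUBJ "bis zu 125 Blatt Papier") is one reading of the example on this slide, and the function name `count_function_shifts` is hypothetical; this is not the CroCo query tooling.

```python
from collections import Counter


def count_function_shifts(aligned_functions):
    """Count aligned units whose grammatical function differs across languages.

    aligned_functions: iterable of (source_function, target_function) labels,
    one pair per aligned unit.
    """
    shifts = Counter()
    for src_func, tgt_func in aligned_functions:
        if src_func != tgt_func:  # same label on both sides is not a shift
            shifts[(src_func, tgt_func)] += 1
    return shifts


# Assumed reading of the slide example (English -> German).
example = [
    ("SUBJ", "PROBJ"),  # "Tray 1" <-> "In Fach 1"
    ("DOBJ", "SUBJ"),   # "up to 125 sheets" <-> "bis zu 125 Blatt Papier"
]
shifts = count_function_shifts(example)
print(shifts)
```

Keeping one such counter per register and per translation direction would give the kind of breakdown the following slides refer to.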
FUNCTION SHIFTS PER REGISTER AND TRANSLATION DIRECTION
GRAMMATICAL FUNCTIONS IN THEME POSITION
MT AND TRANSLATION FACTORS: REGISTER AND TRANSLATION DIRECTION
- people often speak of domains, but that term is too vague
- Kurokawa et al. (2009)
  - trained translation models that take translation direction into account (A) and models that do not (B)
  - for (A) to match the performance of (B), only ca. 1/5 of the data size was needed
- feature selection problem: which features for which register and translation direction (e.g. Diwersy et al. 2013; see also the overview in Oakes & Ji 2012)
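As an illustration of what training "according to translation direction" presupposes, the sketch below groups parallel segments by register and by the language they were originally written in, so that a model for a given direction can be trained on matching data only. The field names `register` and `original_lang`, the register labels and the placeholder segments are all assumptions; this is not the setup of Kurokawa et al. (2009).

```python
from collections import defaultdict


def group_training_data(segments):
    """Group parallel segments by (register, original language)."""
    groups = defaultdict(list)
    for seg in segments:
        groups[(seg["register"], seg["original_lang"])].append(seg)
    return dict(groups)


# Hypothetical placeholder data; real segments would carry this metadata
# from corpus annotation.
segments = [
    {"en": "en text 1", "de": "de text 1", "register": "manual", "original_lang": "en"},
    {"en": "en text 2", "de": "de text 2", "register": "manual", "original_lang": "de"},
    {"en": "en text 3", "de": "de text 3", "register": "speech", "original_lang": "en"},
]
groups = group_training_data(segments)
# Training data for an English-to-German system on manuals:
# only segments that were originally written in English.
en_de_manuals = groups[("manual", "en")]
print(len(en_de_manuals))
```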
POST-EDITING
INCREASING ROLE OF MT IN TRANSLATION
- MT is integrated into translation memories and many translation workflows (SDL 2011, Bajon et al. 2012, O'Brien 2012)
- since MT output needs to be post-edited, post-editing is becoming an increasingly important component of the translator's job profile
CRITT TPR DATABASE
- project coordinator: Copenhagen Business School
- English-German data collection at FTSK in Germersheim
- translation vs. post-editing vs. (blind) editing
- 6 source texts (ST) with different complexity levels (Hvelplund 2011)
- 12 professional translators, 12 semi-professional translators
- MT system: Google Translate
- eye-tracking (Tobii TX 300), key-logging (Translog II), retrospective questionnaires
EYE-TRACKING AND KEY-LOGGING POST-EDITING
PROCESSING TIMES
(cf. Carl, Gutermuth & Hansen-Schirra in print)
PROCESSING STYLES
[Figure: two plots; axes: Time, Word number]
PROCESSING PATTERNS
[Figure: two plots; axes: Time, Word number]
INTERFERENCE
ST: In a gesture sure to rattle the Chinese Government, Steven Spielberg pulled out of the Beijing Olympics to protest against China's backing for Sudan's policy in Darfur.
HT: Als Zeichen des Widerstands gegen die Chinesische Regierung...
Gloss: As sign the-GEN resistance against the Chinese government…
LACK OF CONSISTENCY
ST: Killer nurse receives four life sentences. Hospital nurse C.N. was imprisoned for life today for the killing of four of his patients.
PE: Killer-Krankenschwester zu viermal lebenslanger Haft verurteilt. Der Krankenpfleger C.N. wurde heute auf Lebenszeit eingesperrt für die Tötung von vier seiner Patienten.
Gloss: Killer nurse.FEM to four times lifetime imprisonment sentenced. The nurse.MASC C.N. was today on lifetime imprisoned for the killing of four his.GEN patients.
CONCLUSIONS AND SUGGESTIONS
FUTURE WORK
Entrenchment of MT in TS (theory):
- common ground
- more acceptance
- improved description of the MT workflow for the translator
- improved task descriptions for PE
SOME TENTATIVE SUGGESTIONS TO OURSELVES FOR BETTER TASK DESCRIPTION BASED ON TRANSLATOR CONCEPTS

| Task description | Function of the text (e.g. Nord 2006, House 1997) | terminological | idiomaticity |
| --- | --- | --- | --- |
| As little as possible (rapid PE) | documentary | Conceptually equivalent, non-terms but also dispreferred or deprecated terms may be used | Unidiomatic, but understandable wording may remain (disambiguated at word level!) |
| As much as possible (full PE) | Covert instrumental | Only allowed terms can be used | Phraseology according to the domain |
| Intermediate levels | Overt instrumental (usable, but identifiable as translation) | Only terms, but also dispreferred and maybe deprecated | Idiomatic, but also non-standard phraseology |
THANK YOU FOR YOUR ATTENTION! ...AND YOUR QUESTIONS, COMMENTS, ...
REFERENCES (1)
Bertoldi, Nicola, and Marcello Federico. 2009. Domain Adaptation for Statistical Machine Translation with Monolingual Resources. In Proceedings of the Fourth Workshop on Statistical Machine Translation, 182–189. Athens, Greece: Association for Computational Linguistics.
Brown, Peter F., Stephen A. Della Pietra, Vincent J. Della Pietra, and Robert L. Mercer. 1993. The Mathematics of Statistical Machine Translation: Parameter Estimation. Computational Linguistics 19 (2): 263–311.
Eisele, Andreas. 2006. Parallel Corpora and Phrase-based Statistical Machine Translation for New Language Pairs via Multiple Intermediaries. In 5th International Conference on Language Resources and Evaluation (LREC).
Eisele, Andreas, Christian Federmann, Hans Uszkoreit, Hervé Saint-Amand, Martin Kay, Michael Jellinghaus, Sabine Hunsicker, Teresa Herrmann, and Yu Chen. 2008. Hybrid Architectures for Multi-Engine Machine Translation. In Translating and the Computer 30. London, UK.
Fox, Heidi J. 2002. Phrasal Cohesion and Statistical Machine Translation. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 304–11. Philadelphia: ACL.
Gale, William A., and Kenneth W. Church. 1993. A Program for Aligning Sentences in Bilingual Corpora. Computational Linguistics 19 (1): 75–102.
House, Juliane. 1997. Translation Quality Assessment. A Model Revisited. Tübingen: Gunter Narr Verlag.
Koehn, Philipp, Franz Josef Och, and Daniel Marcu. 2003. Statistical Phrase-Based Translation. In Proceedings of HLT-NAACL 2003, 127–133.
Koehn, Philipp, and Josh Schroeder. 2007. Experiments in Domain Adaptation for Statistical Machine Translation. In ACL Workshop on Machine Translation 2007.
REFERENCES (2)
Kurokawa, David, Cyril Goutte, and Pierre Isabelle. 2009. Automatic Detection of Translated Text and Its Impact on Machine Translation. In Proceedings of MT Summit XII, The Twelfth Machine Translation Summit, International Association for Machine Translation, hosted by the Association for Machine Translation in the Americas.
Lapshinova-Koltunski, Ekaterina. 2013. VARTRA: A Comparable Corpus for the Analysis of Translation Variation. In Proceedings of the 6th Workshop on Building and Using Comparable Corpora, 77–86. Sofia, Bulgaria.
Lembersky, Gennadi, Noam Ordan, and Shuly Wintner. 2012. Language Models for Machine Translation: Original vs. Translated Texts. Computational Linguistics 38 (4): 799–825.
Nord, Christiane. 1997. Translating as a Purposeful Activity. Functionalist Approaches Explained. Translation Theories Explained 1. Manchester: St. Jerome.
Nord, Christiane. 2006. Translating for Communicative Purposes Across Culture Boundaries. Journal of Translation Studies 9 (1): 43–60.
Och, Franz Josef, and Hermann Ney. 2003. A Systematic Comparison of Various Statistical Alignment Models. Computational Linguistics 29 (1): 19–51.
Reiss, Katharina, and Hans J. Vermeer. 1984. Grundlegung einer allgemeinen Translationstheorie. Linguistische Arbeiten 147. Tübingen: M. Niemeyer.