Machine Translation. Diana Trandabăț. Academic Year 2015-2016.

1 Machine Translation. Diana Trandabăț. Academic Year 2015-2016

2 Course overview
– Approaches to MT
– Language model
– Translation model
– Statistical modeling and IBM Models
– EM algorithm
– Word alignment
– Phrase-based translation
– Syntax-based translation
– Reordering
– Decoding
– Evaluation

3 Prerequisites WILL TO LEARN!!!

4 Minimum expectations
LEARN something
– Use machine translation terminology adequately
– Create language models
– Develop and/or implement translation models
– Better presentation skills
DO something
– Assignments
– Project
TEACH me something

5 Evaluation
Laboratory: 100 points
– Attendance (10%)
– Homework (90%)
Project: 100 points
Exam: 100 points
– Midterm
– Final exam

6 Homework
~ Weekly
In-class delivery
– 50% of the points for delivery in class; 50% for email-submitted homework
Late delivery for email submissions
– 100% of the points for delivery on time, 80% for delivery 1 day late, 60% for delivery 2 days late, …
Naming convention: MT_HomeworkNO_StudentName_ProgrammingLanguage
Each implementation task is submitted with short documentation (max. 1 page) covering implementation details, challenges, methods/solutions, errors, problems, etc.

7 Projects We’ll get to that later…

8 What I expect you to know after today What is machine translation What is statistical machine translation Problems of machine translation

9 What I expect you to know after today What is machine translation What is statistical machine translation Problems of machine translation We are not alone in the universe!?

10 How do humans translate?

11 Spend years learning a new language
– memorizing words
– learning syntactic patterns
– exercising
– …
Use dictionaries and detailed world knowledge to:
– Identify meaning
– Find proper words to use in the new language
– Produce a syntactically correct text
– Preserve meaning
– …

12 What is machine translation? Translation performed using a machine/computer

13 How do machines translate? Flowers are lovely!

14 How do machines translate?
Using available resources:
– Electronic bilingual dictionary
– Templates, transfer rules
– Thesaurus, WordNet, FrameNet, …
– Parallel data, comparable data
Using available NLP tools:
– tokenizer, morphological analyzer, syntactic parser, …
More resources for major languages, fewer for “minor” languages.

15 How do machines translate?

16 Statistical machine translation

18 Statistical machine translation
– Start from a very large data set of good translations
– Automatically infer a statistical model of translation
– Apply the translation model to new texts to guess a reasonable translation

19 Noisy channel

20

21 Noisy channel
Language Model P(e)
– Takes care of fluency in the target language
– Data: corpora in the target language
Translation Model P(f|e)
– Faithful lexical correspondence between languages
– Data: aligned corpora in source and target languages
argmax
– Search done by the decoder
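
The three components of the noisy channel combine into a single decision rule: choose the target sentence e that maximizes P(e) · P(f|e). The following is a minimal sketch of that rule, not a real decoder. The source sentence, the candidate list and every probability are invented for illustration, and the names `noisy_channel_score` and `decode` are hypothetical. A real decoder searches a huge space of candidates instead of scoring a fixed list.

```python
import math

# All probabilities below are invented for illustration only.
SOURCE = "florile sunt minunate"  # hypothetical source sentence f

# Toy language model P(e): how fluent each candidate is in the target language.
LANGUAGE_MODEL = {
    "flowers are lovely": 0.60,
    "flowers is lovely": 0.10,
    "lovely are flowers": 0.05,
}

# Toy translation model P(f|e): how well each candidate explains the source.
TRANSLATION_MODEL = {
    (SOURCE, "flowers are lovely"): 0.40,
    (SOURCE, "flowers is lovely"): 0.45,
    (SOURCE, "lovely are flowers"): 0.50,
}


def noisy_channel_score(f, e, lm, tm):
    """Return log P(e) + log P(f|e), or -inf if either model gives zero."""
    p_e = lm.get(e, 0.0)
    p_f_given_e = tm.get((f, e), 0.0)
    if p_e == 0.0 or p_f_given_e == 0.0:
        return float("-inf")
    return math.log(p_e) + math.log(p_f_given_e)


def decode(f, candidates, lm, tm):
    """The argmax step: pick the candidate with the highest combined score."""
    return max(candidates, key=lambda e: noisy_channel_score(f, e, lm, tm))


CANDIDATES = list(LANGUAGE_MODEL)
best = decode(SOURCE, CANDIDATES, LANGUAGE_MODEL, TRANSLATION_MODEL)
# For comparison: the choice made by the translation model alone.
tm_only = max(CANDIDATES, key=lambda e: TRANSLATION_MODEL[(SOURCE, e)])

print("noisy channel:", best)
print("translation model only:", tm_only)
```

With these made-up numbers the translation model alone prefers a disfluent word order, while the product with the language model selects the fluent candidate. Working in log space is a common design choice because products of many small probabilities underflow quickly.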

22 Accurate vs. Fluent
Often impossible to have a true translation, one that is both:
– Faithful to the source language, and
– Fluent in the target language
Japanese: “fukaku hansei shite orimasu”
Fluent translation: “we apologize”
Faithful translation: “we are deeply reflecting (on our past behaviour, and what we did wrong, and how to avoid the problem next time)”
Need to compromise between faithfulness & fluency

25 Question What is your input on clients which sell pharmaceuticals in Europe?

26 Group activity

27 Centauri | Arcturan
Ok-voon ororok sprok. | At-voon bichat dat.
Ok-drubel ok-voon anok plok sprok. | At-drubel at-voon pippat rrat dat.
Erok sprok izok hihok ghirok. | Totat dat arrat vat hilat.
Ok-voon anok drok brok jok. | At-voon krat pippat sat lat.
Wiwok farok izok stok. | Totat jjat quat cat.
Lalok sprok izok jok stok. | Wat dat krat quat cat.
Lalok farok ororok lalok sprok izok enemok. | Wat jjat bichat wat dat vat eneat.
Lalok brok anok plok nok. | Iat lat pippat rrat nnat.
Wiwok nok izok kantok ok-yurp. | Totat nnat quat oloat at-yurp.
Lalok mok nok yorok ghirok clok. | Wat nnat gat mat bat hilat.
Lalok nok crrrok hihok yorok zanzanok. | Wat nnat arrat mat zanzanat.
Lalok rarok nok izok hihok mok. | Wat nnat forat arrat vat gat.

28 What we’ve learned
– Direct (word-by-word) translation
– Reordering
– Different word alignments: 1:1, 0:1, 1:0, etc.
– Translation model
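
The reasoning used in the group activity (words that keep appearing together in paired sentences probably translate each other) can be mechanised. The sketch below applies a simplified IBM Model 1 with EM training to the twelve sentence pairs of the puzzle. It is an illustration under stated assumptions, not the course's reference implementation: there is no NULL word, the iteration count is arbitrary, and the function names `train_ibm_model1` and `best_translation` are hypothetical.

```python
from collections import defaultdict

# The twelve Centauri/Arcturan sentence pairs, lowercased, punctuation removed.
PAIRS = [
    ("ok-voon ororok sprok", "at-voon bichat dat"),
    ("ok-drubel ok-voon anok plok sprok", "at-drubel at-voon pippat rrat dat"),
    ("erok sprok izok hihok ghirok", "totat dat arrat vat hilat"),
    ("ok-voon anok drok brok jok", "at-voon krat pippat sat lat"),
    ("wiwok farok izok stok", "totat jjat quat cat"),
    ("lalok sprok izok jok stok", "wat dat krat quat cat"),
    ("lalok farok ororok lalok sprok izok enemok",
     "wat jjat bichat wat dat vat eneat"),
    ("lalok brok anok plok nok", "iat lat pippat rrat nnat"),
    ("wiwok nok izok kantok ok-yurp", "totat nnat quat oloat at-yurp"),
    ("lalok mok nok yorok ghirok clok", "wat nnat gat mat bat hilat"),
    ("lalok nok crrrok hihok yorok zanzanok", "wat nnat arrat mat zanzanat"),
    ("lalok rarok nok izok hihok mok", "wat nnat forat arrat vat gat"),
]


def train_ibm_model1(pairs, iterations=20):
    """Estimate t[(arcturan_word, centauri_word)] = P(arcturan | centauri) by EM."""
    corpus = [(c.split(), a.split()) for c, a in pairs]
    arcturan_vocab = {w for _, a_words in corpus for w in a_words}
    uniform = 1.0 / len(arcturan_vocab)
    t = defaultdict(lambda: uniform)  # start with no preference at all

    for _ in range(iterations):
        count = defaultdict(float)  # expected co-translation counts
        total = defaultdict(float)  # expected counts per Centauri word
        # E-step: share each Arcturan word among the Centauri words of its pair.
        for c_words, a_words in corpus:
            for a in a_words:
                norm = sum(t[(a, c)] for c in c_words)
                for c in c_words:
                    fraction = t[(a, c)] / norm
                    count[(a, c)] += fraction
                    total[c] += fraction
        # M-step: renormalize the expected counts into probabilities.
        t = defaultdict(float,
                        {(a, c): count[(a, c)] / total[c] for (a, c) in count})
    return t


def best_translation(t, centauri_word):
    """Return the Arcturan word with the highest probability for a Centauri word."""
    candidates = {a: p for (a, c), p in t.items() if c == centauri_word}
    return max(candidates, key=candidates.get)


t = train_ibm_model1(PAIRS)
for word in ("sprok", "ok-voon"):
    print(word, "->", best_translation(t, word))
```

In this corpus "sprok" and "dat" occur in exactly the same five sentence pairs, which is the kind of evidence the model rewards. Because the sketch has no NULL word, it cannot represent a word with no counterpart on the other side directly, so cases like "crrrok" are handled poorly.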

29 Question What is your input on clients which sell pharmaceuticals in Europe?

30 References
– Philipp Koehn: Statistical machine translation. Cambridge University Press, 2009. xii, 433 pp.
– Yorick Wilks: Machine translation: its scope and limits. New York: Springer, 2009. x, 252 pp.
– John Hutchins: “Machine translation: general overview”. Chapter 27 of R. Mitkov (ed.), The Oxford Handbook of Computational Linguistics. Oxford, 2004.
– Harold Somers: “Machine Translation”. Chapter 13 of R. Dale, H. Moisl & H. Somers (eds), Handbook of Natural Language Processing. New York: Marcel Dekker, 2000.
– Nico Weber (ed.): Machine translation: theory, applications, and evaluation. An assessment of the state-of-the-art. St. Augustin: Gardez! Verlag, 1998.
– Kishore Papineni et al.: “Bleu: a Method for Automatic Evaluation of Machine Translation”. ACL '02: Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, pp. 311-318, 2002.

31 “One naturally wonders if the problem of translation could conceivably be treated as a problem in cryptography. When I look at an article in Russian, I say: ‘This is really written in English, but it has been coded in some strange symbols. I will now proceed to decode.’ ” Warren Weaver (1947)

32 See you next time!

33 Noisy channel

