@ Anna Sågvall Hein 2005 An example of a good translation En inbyggd oljepump levererar olja under tryck både till hydraulsystemet och växellådans oljesystem.

Presentation transcript:

@ Anna Sågvall Hein 2005 An example of a good translation
En inbyggd oljepump levererar olja under tryck både till hydraulsystemet och växellådans oljesystem.
-> An integrated oil pump delivers pressurised fluid both to the hydraulic system and to the lubrication system of the gearbox.

@ Anna Sågvall Hein 2005 An example of a bad translation Stackars Kalle var rädd. -> Wretched cold each cautious.

@ Anna Sågvall Hein 2005 Fundamental problems in MT
lexical ambiguity in SL
translation ambiguity
grammatical differences between SL and TL

@ Anna Sågvall Hein 2005 Lexical ambiguity in SL
form
–hus (sg/pl, basic/genitive case)
part-of-speech
–var (verb, pronoun, adverb, noun)
polysemy
–fil (milk, tool, traffic lane)

@ Anna Sågvall Hein 2005 Handling form and part-of-speech ambiguity in SL
Syntactic analysis of the input sentence
Han köpte ett nytt hus (sg, basic case).
Stackars Kalle var (verb) rädd.

@ Anna Sågvall Hein 2005 Handling polysemy
domain
context
–rules based on grammatical analysis
anta en elev (obj) -> admit a student
anta, att (compl) -> suppose that
examples

@ Anna Sågvall Hein 2005 Handling translation ambiguity
rules based on grammatical analysis
bilen på gatan -> the car on the street (locative attribute)
taket på huset -> the roof of the house (part-whole attribute)
examples

@ Anna Sågvall Hein 2005 Grammatical differences
morphology
–Hon köpte en liten hund. -> Sie hat einen kleinen Hund gekauft.
syntax
–Genom att svänga till vänster hittar du huset. -> Turning left you will find the house.
word order
–Sedan gick han hem. -> Then he went home.

@ Anna Sågvall Hein 2005 Handling grammatical differences
syntactic-semantic analysis + transfer rules
deep analysis (interlingua) + generation according to TL grammar
examples

@ Anna Sågvall Hein 2005 Re-use techniques
sentence alignment
–linking source and target sentences pairwise
–success rate close to 100 %
–translation memories
–basis for word alignment

@ Anna Sågvall Hein 2005 Sentence alignment
I oljefilterhållaren sitter en överströmningsventil.
The oil filter retainer has an overflow valve. (sventscan )
Undvik hudkontakt med kylvätska. Hudkontakt kan medföra irritation.
Avoid contact with the skin as this may cause irritation. (sventscan )

@ Anna Sågvall Hein 2005 Sentence alignment, cont.
Skruvarna sträcks vid varje åtdragning, därför får skruvarna i en del förband återanvändas endast ett visst antal gånger.
Bolts are stretched each time they are tightened. For this reason, the bolts in some joints should only be reused a certain number of times. (sventscan )
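
Sentence-aligned pairs like these are what a translation memory stores. The following is a minimal sketch of a fuzzy lookup, not any particular product's algorithm. It assumes that character-level similarity from Python's standard difflib module is an acceptable stand-in for a real matcher. The names MEMORY and lookup and the cutoff value are hypothetical.

```python
import difflib

# Toy translation memory built from two of the aligned pairs shown above.
MEMORY = {
    "I oljefilterhållaren sitter en överströmningsventil.":
        "The oil filter retainer has an overflow valve.",
    "Undvik hudkontakt med kylvätska. Hudkontakt kan medföra irritation.":
        "Avoid contact with the skin as this may cause irritation.",
}


def lookup(segment, memory=MEMORY, cutoff=0.8):
    """Return (source, target, similarity) for the closest stored segment,
    or None when nothing reaches the (assumed) similarity cutoff."""
    matches = difflib.get_close_matches(segment, list(memory), n=1, cutoff=cutoff)
    if not matches:
        return None
    source = matches[0]
    score = difflib.SequenceMatcher(None, segment, source).ratio()
    return source, memory[source], score


# A near-identical input (final full stop missing) still retrieves the stored translation.
hit = lookup("I oljefilterhållaren sitter en överströmningsventil")
print(hit[1])
```

Input that resembles no stored segment returns None and has to be translated by other means.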

@ Anna Sågvall Hein 2005 Re-use techniques, cont.
word alignment
–linking sub-sentence segments, typically source and target words and phrases, pairwise
–co-occurrence, word similarity, dictionary
–large-scale processing
–success rate close to 80 %
–translation dictionaries
–bi- or multi-lingual term databases
–data-driven machine translation

@ Anna Sågvall Hein 2005 A word alignment example
Jag tar mittplatsen, som jag inte tycker om.
I take the middle seat, which I dislike.
jag – I
tar – take
mittplatsen – the middle seat
som – which
jag – I
inte tycker om – dislike
(from Tiedemann 2003)
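
Co-occurrence is one of the clues for word alignment named earlier, and it can be illustrated with the Dice coefficient. This is a sketch under stated assumptions, not the method of the cited work. The three-sentence corpus is a made-up toy, and the names dice_scores and best_link are hypothetical. It also links single words only, so multi-word units such as "inte tycker om – dislike" are out of its reach.

```python
from collections import Counter
from itertools import product

# Hypothetical toy corpus of sentence-aligned, lower-cased pairs.
PAIRS = [
    ("jag tar mittplatsen", "i take the middle seat"),
    ("jag tycker om huset", "i like the house"),
    ("han tar huset", "he takes the house"),
]


def dice_scores(pairs):
    """Dice coefficient 2*c(s,t) / (c(s) + c(t)), counting each word once per sentence."""
    src_count, tgt_count, co_count = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        s_words, t_words = set(src.split()), set(tgt.split())
        src_count.update(s_words)
        tgt_count.update(t_words)
        co_count.update(product(s_words, t_words))
    return {
        (s, t): 2 * c / (src_count[s] + tgt_count[t])
        for (s, t), c in co_count.items()
    }


def best_link(scores, src_word):
    """Target word with the highest Dice score for the given source word."""
    candidates = {t: v for (s, t), v in scores.items() if s == src_word}
    return max(candidates, key=candidates.get)


scores = dice_scores(PAIRS)
print(best_link(scores, "jag"))    # i
print(best_link(scores, "huset"))  # house
```

With so little data most other links are unreliable, which is consistent with the slides listing large-scale processing as a requirement.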

@ Anna Sågvall Hein 2005 Evaluation of MT
human
–adequacy
–acceptance
automatic comparison with a gold standard
n-gram technique: e.g. BLEU, NEVA
edit distance
See further (OH-presentation by Eva Forsbom)

@ Anna Sågvall Hein 2005 Automatic evaluation, ex. 1
SL: Framställningsmetod och särskild beredningsmetod: En hög kvalitet på råvaran komjölk är viktig för tillverkningen.
MT: Manufacturing method and special manufacturing method: A high quality of the raw material cow's milk is important to the production.
Ref: Specific production or manufacturing method: High-quality cow's milk is important to production.
NEVA: 0.27

@ Anna Sågvall Hein 2005 Automatic evaluation, ex. 2
SL: Mjölkråvaran som används för ystning pastöriseras till 72 ºC i 15 sekunder.
MT: The milk that is used for coagulation is pasteurised to 72 ºC for 15 seconds.
Ref: The milk used for coagulation is pasteurised at 72 ºC for 15 seconds.
NEVA: 0.59
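
The two families of automatic measures, edit distance and n-gram overlap, can be sketched on the MT and Ref sentences of ex. 2. This illustrates the ideas only and implements neither BLEU nor NEVA, so it will not reproduce the NEVA scores above. Whitespace tokenisation and the function names are assumptions.

```python
from collections import Counter

MT = "The milk that is used for coagulation is pasteurised to 72 ºC for 15 seconds."
REF = "The milk used for coagulation is pasteurised at 72 ºC for 15 seconds."


def word_edit_distance(hyp, ref):
    """Word-level Levenshtein distance (insertions, deletions, substitutions)."""
    h, r = hyp.split(), ref.split()
    prev = list(range(len(r) + 1))
    for i, hw in enumerate(h, start=1):
        cur = [i]
        for j, rw in enumerate(r, start=1):
            cost = 0 if hw == rw else 1
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost))
        prev = cur
    return prev[-1]


def ngram_precision(hyp, ref, n):
    """Share of hypothesis n-grams also found in the reference (clipped counts)."""
    h, r = hyp.split(), ref.split()
    hyp_ngrams = Counter(tuple(h[i:i + n]) for i in range(len(h) - n + 1))
    ref_ngrams = Counter(tuple(r[i:i + n]) for i in range(len(r) - n + 1))
    total = sum(hyp_ngrams.values())
    if total == 0:
        return 0.0
    matched = sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
    return matched / total


print(word_edit_distance(MT, REF))   # 3: drop "that", drop "is", replace "to" with "at"
print(ngram_precision(MT, REF, 1))   # 12 of the 15 MT words are matched
```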

@ Anna Sågvall Hein 2005 Basic translation strategies
rule-based translation
–direct translation
–transfer-based translation
–interlingua translation
data-driven translation
–statistical translation
–example-based translation
hybrids

@ Anna Sågvall Hein 2005 Direct translation
translation proceeds word by word, or phrase by phrase
no intermediary sentence structure
the most important language component is a translation dictionary
translation problems are handled more or less ad hoc by means of specific rules

@ Anna Sågvall Hein 2005 Simplistic direct approach
sentence splitting
tokenisation
handling capital letters
dictionary look-up and lexical substitution incl. heuristics for handling ambiguities
copying unknown words, digits, signs of punctuation etc.
formal editing
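
The steps of the simplistic direct approach can be sketched as a small pipeline. This is a toy under stated assumptions. The four-entry dictionary TOY_DICT is hypothetical, the ambiguity heuristics are left out, and the function names are illustrative only.

```python
import re

# Hypothetical toy translation dictionary (Swedish -> English), lower-cased keys.
TOY_DICT = {"sedan": "then", "gick": "went", "han": "he", "hem": "home"}


def split_sentences(text):
    """Sentence splitting on whitespace that follows ., ! or ?"""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenise(sentence):
    """Tokenisation into words/digits and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def translate_direct(text, dictionary=TOY_DICT):
    output = []
    for sentence in split_sentences(text):
        # Capital letters are folded for look-up; unknown words, digits and
        # punctuation are copied unchanged.
        words = [dictionary.get(tok.lower(), tok) for tok in tokenise(sentence)]
        # Formal editing: no space before punctuation, capital sentence start.
        result = ""
        for w in words:
            if not result or re.fullmatch(r"[^\w\s]", w):
                result += w
            else:
                result += " " + w
        output.append(result[:1].upper() + result[1:])
    return " ".join(output)


print(translate_direct("Sedan gick han hem."))  # Then went he home.
```

The output keeps the Swedish verb-second order, so it shows the word order problem that a translation without an intermediary sentence structure runs into.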

@ Anna Sågvall Hein 2005 Advanced direct approach (Tucker 1987)
source text dictionary look-up and morphological analysis
identification of homographs
identification of compound nouns
identification of nouns and verb phrases
processing of idioms

@ Anna Sågvall Hein 2005 Advanced approach, cont.
processing of prepositions
subject-predicate identification
syntactic ambiguity identification
synthesis and morphological processing of TL
rearrangement of words and phrases in TL

@ Anna Sågvall Hein 2005 Feasibility of direct translation
quality
–typically browsing quality
–depends on the quality of the translation dictionary and the coverage of the translation rules
–editing quality may be achieved
problems with
–ambiguity
–inflection
–word order
–other structural differences

@ Anna Sågvall Hein 2005 SYSTRAN
SYStem TRANslation
advanced direct translation (moving towards transfer-based translation)

@ Anna Sågvall Hein 2005 EC Systran
1,600,000 dictionary units
–20 domain dictionaries
daily use by EC translators, administrators of the European institutions

@ Anna Sågvall Hein 2005 Ex. 1: fairly good translation
"Enskilda företagare som inte bildat bolag klassificeras hit."
"Individual entrepreneurs that have not formed companies are classified here."
The system recognises bildat as a perfect form and correctly translates it as have formed, even though the auxiliary verb is omitted. The negation not is placed in the correct position.

@ Anna Sågvall Hein 2005 Ex. 2: word order problem / Systran sv-en
"När byarna kontaktades hade de inte ens utsatts för influensa."
"When the villages were contacted had they not even been exposed to flu."
The system fails to identify the subject and the predicate and therefore produces the wrong word order.

@ Anna Sågvall Hein 2005 Ex. 3: ambiguity problem
"Vad kan vi lära av Arrawetestammen?"
"What can we faith of the Arawete?"
The system does not find the connection between kan and lära and therefore does not see that lära is a verb.

@ Anna Sågvall Hein 2005 Ex. 4: ambiguity problem
"Extrapoleringen går till så här."
"The extrapolation goes to so here."
The system does not know the particle verb gå till and therefore translates incorrectly, word for word.

@ Anna Sågvall Hein 2005 Transfer-based translation
intermediary sentence structure
provides a basis for the systematic handling of grammatical problems and some types of lexical choices
basic processes
–analysis
–transfer
–generation (synthesis)

@ Anna Sågvall Hein 2005 Transfer-based translation, cont.
knowledge-intensive
language modules
–dictionary and grammar of SL
–transfer dictionary and transfer rules
–dictionary and grammar of TL

@ Anna Sågvall Hein 2005 Multra
transfer-based translation engine
transfer via grammatical relations
–TL word order not inherited from SL
modular
unification-based
focus on restricted domains
developed at Uppsala University

@ Anna Sågvall Hein 2005 An example
Sv. I oljefilterhållaren sitter en överströmningsventil.
-> En. The oil filter retainer has an overflow valve. (from the Scania corpus)
transfer rule: sitter -> has, adv -> subj, subj -> obj
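
The transfer rule in this example can be illustrated with plain dictionaries standing in for the analysed sentence. This is only a sketch of transfer via grammatical relations, not Multra's unification-based formalism. The data layout, the toy lexicon and the function names are all assumptions.

```python
# Assumed result of SL analysis: grammatical relations, not word order.
# The adverbial is stored as its noun phrase (the preposition "i" is dropped).
SOURCE = {
    "pred": "sitter",
    "adv": "oljefilterhållaren",
    "subj": "en överströmningsventil",
}

# The rule from the slide: sitter -> has, adv -> subj, subj -> obj.
RULE = {"pred": ("sitter", "has"), "roles": {"adv": "subj", "subj": "obj"}}

# Hypothetical transfer dictionary for the constituents.
LEXICON = {
    "oljefilterhållaren": "the oil filter retainer",
    "en överströmningsventil": "an overflow valve",
}


def transfer(source, rule, lexicon):
    """Map an SL relation structure to a TL relation structure."""
    src_pred, tgt_pred = rule["pred"]
    if source["pred"] != src_pred:
        raise ValueError("rule does not apply to this predicate")
    target = {"pred": tgt_pred}
    for src_role, tgt_role in rule["roles"].items():
        target[tgt_role] = lexicon[source[src_role]]
    return target


def generate(target):
    """TL generation: English subject-verb-object order, independent of SL order."""
    sentence = f'{target["subj"]} {target["pred"]} {target["obj"]}.'
    return sentence[0].upper() + sentence[1:]


print(generate(transfer(SOURCE, RULE, LEXICON)))
# The oil filter retainer has an overflow valve.
```

Because generation works from the relations alone, the target word order is not inherited from the Swedish sentence.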

@ Anna Sågvall Hein 2005 Interlingua translation
analysis of SL sentence into a language-independent meaning representation, an interlingua
–ideally, no trace of the SL structure in the interlingua
generation of TL sentence from the interlingua

@ Anna Sågvall Hein 2005 Statistical machine translation
translation model based on word alignment
language model based on n-grams
decoding algorithm
–selecting the most probable combination of alternatives in the translation model and the language model
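
The interplay of the three components is commonly written in the noisy-channel form below. The symbols are assumptions not used on the slide: f is the source sentence, e is a candidate target sentence of length m with words e_1 ... e_m, and n is the n-gram order.

```latex
\hat{e} \;=\; \arg\max_{e} P(e \mid f)
        \;=\; \arg\max_{e} \underbrace{P(f \mid e)}_{\text{translation model}}
              \;\underbrace{P(e)}_{\text{language model}},
\qquad
P(e) \;\approx\; \prod_{i=1}^{m} P\left(e_i \mid e_{i-n+1}, \ldots, e_{i-1}\right)
```

The decoding algorithm is the search for the maximising e.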

@ Anna Sågvall Hein 2005 Statistical MT on the market Language Weaver

@ Anna Sågvall Hein 2005 Example-based machine translation
non-trivial use of translation examples in the translation process
preliminary definition
–alignment of texts
–matching of input sentences against phrases (examples)
–selection and extraction of equivalent TL phrases
–adaptation and combination of TL phrases as acceptable output sentences
(from Hutchins, J., Towards a definition of example-based machine translation. Proc. of Workshop: Example-Based Machine Translation. MT SUMMIT X. Phuket. Thailand. 2005)