Human Language Technology
Machine Translation Architectures
Direct MT, Transfer MT, Interlingual MT
Jan 2012



History – Pre-ALPAC
– 1952: First MT Conference (MIT).
– 1954: Georgetown System (word-for-word based) successfully translated 49 Russian sentences.
– 1954–1965: Much investment in a brute-force empirical approach: crude word-for-word techniques with limited reshuffling of output.
– 1966: The ALPAC (Automatic Language Processing Advisory Committee) Report concludes that research funds should be directed into more fundamental linguistic research.

History – three eras
– Operational systems approach: SYSTRAN (eventually became the basis for Babel Fish).
– University centres established in Grenoble (CETA), Montreal and Saarbrücken.
– Systems developed on the basis of linguistic and non-linguistic representations:
   – Ariane (Dependency Grammar)
   – TAUM METEO (Metamorphosis Grammars)
   – EUROTRA (multilingual intermediate representations)
   – ROSETTA (Landsbergen): interlingua based
   – BSO (Witkam): Esperanto
– Data-driven translation systems.

MT Methods
MT
– Direct MT
– Rule-Based MT
   – Transfer
   – Interlingua
– Data-Driven MT
   – EBMT (example-based MT)
   – SMT (statistical MT)

Basic Architecture: Direct Translation
source text → target text
Basic idea:
– language-pair specific
– no intermediate representation
– pipeline architecture

Staged Direct MT (En/Jp)

Direct Translation: Advantages
– Exploits the fact that certain potential ambiguities can be left unresolved if you know the language pair, e.g. wall: Wand/Mauer (German), parete/muro (Italian).
– Designers can concentrate more on the special cases where the languages differ.
– Minimal resources are necessary: a cheap bilingual dictionary and a rudimentary knowledge of the target language suffice.
– Translation memories are a (successful and much used) development of this approach.

Direct Translation: Disadvantages
– Computationally naive:
   – Basic model: word-for-word translation plus local reordering (e.g. to handle adjective + noun order).
– Linguistically naive:
   – No analysis of the internal structure of the input, especially with respect to the grammatical relationships between the main parts of sentences.
   – No generalisation; everything is handled on a case-by-case basis.
– Generally poor translation, except in simple cases where there is a lot of isomorphism between sentences.
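The basic direct model (word-for-word substitution plus local reordering) can be sketched in a few lines of Python. This is a toy illustration only: the English-to-French lexicon, the word-class sets and the function name `direct_translate` are assumptions invented for the example and do not come from the slides.

```python
# Toy English -> French direct translation: one local reordering rule
# followed by dictionary lookup. All data below is illustrative.
LEXICON = {"the": "le", "black": "noir", "cat": "chat", "sleeps": "dort"}
ADJECTIVES = {"black"}
NOUNS = {"cat"}


def direct_translate(sentence: str) -> str:
    words = sentence.lower().split()
    # Local reordering: swap each adjacent adjective + noun pair.
    i = 0
    while i < len(words) - 1:
        if words[i] in ADJECTIVES and words[i + 1] in NOUNS:
            words[i], words[i + 1] = words[i + 1], words[i]
            i += 2
        else:
            i += 1
    # Word-for-word substitution; unknown words pass through unchanged.
    return " ".join(LEXICON.get(word, word) for word in words)


print(direct_translate("the black cat sleeps"))  # le chat noir dort
```

Nothing here analyses sentence structure, so every new construction needs its own special case, which is the lack of generalisation noted above.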

Transfer Model of MT
– To overcome language differences, first build a more abstract representation of the input.
– The translation process as such (called transfer) operates at the level of this representation.
– This architecture assumes:
   – analysis via some kind of parsing process;
   – synthesis via some kind of generation.

Basic Architecture: Transfer Model
source text → (analysis) → source representation → (transfer) → target representation → (generation) → target text

Transfer Rules
In general there are two kinds of transfer rule:
– Structural transfer rules: these deal with differences in the syntactic structures.
– Lexical transfer rules: these deal with cross-lingual mappings at the level of words and fixed phrases.

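Structural Transfer Rule
$\mathrm{NP}_s(\mathrm{Adj}_s, \mathrm{Noun}_s) \Rightarrow \mathrm{NP}_t(\mathrm{Noun}_t, \mathrm{Adj}_t)$
(subscript s marks source-language constituents, subscript t target-language constituents)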
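A structural rule of this kind can be read as a rewrite on parse trees. The sketch below assumes a hypothetical tree encoding, nested tuples of the form `(label, child, ...)` with plain strings as leaf words, and applies the adjective-noun swap wherever an NP has exactly those two children. It leaves the words untranslated, since that is the job of lexical transfer.

```python
# Assumed tree encoding: (label, child, child, ...); leaves are plain strings.
def transfer_np(tree):
    """Apply NP(Adj, Noun) => NP(Noun, Adj) throughout a parse tree."""
    if isinstance(tree, str):
        return tree  # a leaf word: nothing to restructure
    label, *children = tree
    children = [transfer_np(child) for child in children]
    is_adj_noun_np = (
        label == "NP"
        and len(children) == 2
        and all(isinstance(child, tuple) for child in children)
        and children[0][0] == "Adj"
        and children[1][0] == "Noun"
    )
    if is_adj_noun_np:
        children = [children[1], children[0]]
    return (label, *children)


source = ("S",
          ("NP", ("Adj", "black"), ("Noun", "cats")),
          ("VP", ("V", "sleep")))
print(transfer_np(source))
```

Because the rule is stated over categories rather than individual words, it covers every adjective-noun pair at once, unlike the case-by-case reordering of the direct approach.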

Lexical Transfer
– Easy cases are based on bilingual dictionary lookup.
– Resolution of ambiguities may require further knowledge:
   – know → savoir
   – know → connaître
– Not necessarily word for word:
   – schimmel → white horse
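One way to picture lexical transfer is as a rule table in which each source word has a list of (condition, target) pairs. In the sketch below, the `object` context feature used to choose between savoir and connaître is a hypothetical simplification, as are the table layout and the function name. A real system would also keep one table per language pair, whereas the slide's examples are mixed into a single table here for brevity.

```python
# Each source word maps to (conditions, target) pairs, tried in order.
# The "object" feature is an invented stand-in for the "further knowledge"
# needed to resolve the ambiguity.
LEXICAL_RULES = {
    "know": [
        ({"object": "clause"}, "savoir"),     # know that something is the case
        ({"object": "entity"}, "connaître"),  # know a person or a place
    ],
    # One source word may map to a multi-word target.
    "schimmel": [({}, "white horse")],
}


def lexical_transfer(word, context):
    for conditions, target in LEXICAL_RULES.get(word, []):
        if all(context.get(key) == value for key, value in conditions.items()):
            return target
    return word  # no applicable rule: leave the word untranslated


print(lexical_transfer("know", {"object": "clause"}))  # savoir
print(lexical_transfer("know", {"object": "entity"}))  # connaître
print(lexical_transfer("schimmel", {}))                # white horse
```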

Transfer Model
– The degree of generalisation depends upon the depth of representation:
   – The deeper the representation, the harder it is to do analysis or generation.
   – The shallower the representation, the larger the transfer component.
– Where does ambiguity get resolved?
– The number of bilingual components can get large.

Interlingual Translation: The Vauquois Triangle
source text → (analysis) → interlingua → (generation) → target text
Depth of representation increases towards the interlingua at the apex of the triangle.

Interlingual Translation
– The transfer model requires different transfer rules for each language pair, which means much work for a multilingual system.
– The interlingual approach eliminates transfer altogether by creating a language-independent canonical form known as an interlingua.
– Various logic-based schemes have been used to represent such forms.
– Other approaches include attribute/value matrices called feature structures.

Possible Feature Structure for “There was an old man gardening”
event: gardening
agent:
   type: man
   number: sg
   definiteness: indef
aspect: progressive
tense: past

Ontological Issues
– The designer of an interlingua has a very difficult task.
– What is the appropriate inventory of attributes and values?
– Clearly, the choice has radical effects on the ability of the system to translate faithfully.
– For instance, to handle the muro/parete distinction, the internal/external characteristic of the wall would have to be encoded.

Feature Structure for “muro”
word: muro
syntax:
   POS:
      class: noun
      type: count
semantics:
   field: buildings
   type: structural
   position: external
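Feature structures of this kind map naturally onto nested dictionaries, and choosing a target word then becomes a matter of checking which lexical entries are compatible with the interlingual concept. The sketch below is illustrative only: the entry for "parete" (with `position: internal`) is an assumption inferred from the internal/external distinction mentioned in the slides, and the helper names `subsumes` and `realise` are invented for the example.

```python
MURO = {
    "word": "muro",
    "syntax": {"pos": {"class": "noun", "type": "count"}},
    "semantics": {"field": "buildings", "type": "structural",
                  "position": "external"},
}
# Assumed entry: only "muro" is spelled out in the slides.
PARETE = {
    "word": "parete",
    "syntax": {"pos": {"class": "noun", "type": "count"}},
    "semantics": {"field": "buildings", "position": "internal"},
}
ITALIAN_LEXICON = [MURO, PARETE]


def subsumes(pattern, structure):
    """True if every attribute in pattern occurs in structure with the same value."""
    for key, value in pattern.items():
        if key not in structure:
            return False
        if isinstance(value, dict):
            if not isinstance(structure[key], dict):
                return False
            if not subsumes(value, structure[key]):
                return False
        elif structure[key] != value:
            return False
    return True


def realise(concept, lexicon):
    """Return the words whose semantics are compatible with the concept."""
    return [entry["word"] for entry in lexicon
            if subsumes(concept, entry["semantics"])]


print(realise({"field": "buildings", "position": "external"}, ITALIAN_LEXICON))
print(realise({"field": "buildings"}, ITALIAN_LEXICON))
```

The second call returns both words: if the interlingua does not record the internal/external feature, generation into Italian cannot choose between them.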

Interlingual Approach: Pros and Cons
Pros
– Portable (avoids the N² problem).
– Because the representation is normalised, structural transformations are simpler to state.
– Explanatory adequacy.
Cons
– Difficult to deal with terms on a primitive level: universals?
– Must decompose and reassemble concepts.
– Useful information lost (paraphrase).
In practice, works best in small domains.
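The N² problem can be made concrete with a little arithmetic. The sketch below counts components under a simplifying assumption that is not spelled out in the slides: each language needs one analyser and one generator, and a transfer system additionally needs one transfer module per ordered language pair. The function names are illustrative.

```python
def transfer_architecture_components(n: int) -> int:
    # n analysers + n generators + one transfer module per ordered pair.
    return 2 * n + n * (n - 1)


def interlingua_architecture_components(n: int) -> int:
    # n analysers into the interlingua + n generators out of it.
    return 2 * n


for n in (3, 5, 10):
    print(n,
          transfer_architecture_components(n),
          interlingua_architecture_components(n))
# 3 12 6
# 5 30 10
# 10 110 20
```

Under these assumptions the transfer count grows quadratically with the number of languages while the interlingua count grows linearly, which is the sense in which the interlingual approach is more portable.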