Machine Translation II How MT works Modes of use.


Machine Translation II How MT works Modes of use

2/20 How MT works
Distinguish between generic “translation software” (algorithms) and language-pair-specific linguistic data
– Software engineers ~ linguists
Idea (from computer science) of modularity
– Break down problem into manageable subproblems, essentially independent though linked to each other
– Modules usually linguistically motivated
Linguistic formalisms for lexicons and grammars
– May be more or less like formal linguistic theories
– Usually “less”!
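The split between generic software and language-pair-specific data can be illustrated with a minimal sketch. The `LanguagePairData` class, the `translate_word_for_word` function and the three-word English-French lexicon are all hypothetical, invented here for illustration; real systems rely on far richer lexicons and grammars.

```python
from dataclasses import dataclass, field


@dataclass
class LanguagePairData:
    """Language-pair-specific data (hypothetical format), the linguists' side."""
    source: str
    target: str
    lexicon: dict = field(default_factory=dict)


def translate_word_for_word(text: str, data: LanguagePairData) -> str:
    """Generic engine: contains no knowledge of any particular language."""
    words = text.lower().split()
    # Words missing from the lexicon are passed through unchanged.
    return " ".join(data.lexicon.get(w, w) for w in words)


# Toy data for one language pair. Swapping this object changes the pair
# without touching the engine above.
EN_FR = LanguagePairData("en", "fr", {"the": "le", "cat": "chat", "sleeps": "dort"})

print(translate_word_for_word("The cat sleeps", EN_FR))
```

The engine never mentions English or French, so a new language pair only needs a new data object.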

3/20 Modularity
Possible sequence of modules (fictitious):
– Morphological analysis
– Dictionary lookup
– Syntactic parse
– Attachment disambiguation
– Semantic roles
– TL syntax
– TL lexical choice
– Text normalisation
– TL morphology
– Text reconstitution
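A modular design of this kind can be sketched as a chain of small functions that each read and extend a shared state. The sketch below is only a toy: it implements five of the modules named on the slide, the ordering is an assumption, and the lexicon and the "strip a final s" morphology are invented for illustration.

```python
from functools import reduce

# Toy English-to-French lexicon of base forms (hypothetical data).
LEXICON = {"cat": "chat", "dog": "chien"}


def normalise(state):
    """Text normalisation: lower-case and split on whitespace."""
    state["tokens"] = state["text"].lower().split()
    return state


def analyse_morphology(state):
    """Morphological analysis: naive split into (stem, is_plural)."""
    state["analysed"] = [
        (tok[:-1], True) if tok.endswith("s") and len(tok) > 1 else (tok, False)
        for tok in state["tokens"]
    ]
    return state


def look_up(state):
    """Dictionary lookup on stems; unknown stems pass through unchanged."""
    state["transferred"] = [
        (LEXICON.get(stem, stem), plural) for stem, plural in state["analysed"]
    ]
    return state


def generate_morphology(state):
    """TL morphology: re-attach a plural marker in the target language."""
    state["tl_tokens"] = [
        stem + "s" if plural else stem for stem, plural in state["transferred"]
    ]
    return state


def reconstitute(state):
    """Text reconstitution: join the tokens back into a string."""
    state["output"] = " ".join(state["tl_tokens"])
    return state


# The module order is an assumption made for this sketch.
PIPELINE = [normalise, analyse_morphology, look_up, generate_morphology, reconstitute]


def run(text):
    """Feed the text through every module in order."""
    state = reduce(lambda s, module: module(s), PIPELINE, {"text": text})
    return state["output"]


print(run("Cats dogs"))
```

Because each module only depends on the keys written by earlier ones, a module can be replaced or improved without rewriting the rest.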

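4/20 Depth of analysis
The “Vauquois triangle”
[Diagram: source text and target text at the base, interlingua at the apex; analysis runs up the source side and generation down the target side; direct translation and transfer cross between the two sides.]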

5/20 Depth of analysis
The “Vauquois triangle”
[Same diagram, with each level labelled:]
– direct translation: word-for-word
– transfer: some syntactic awareness
– interlingua: full meaning representation

6/20 Modes of use
[Diagram labels: fully automatic; unrestricted texts; high quality; restricted input; low quality; impractical; interactive]

7/20 Different scenarios for MT
Assimilation
– many SLs, one TL
– any style
– any topic
– partial analysis
– post-editing
– user is reader
Dissemination
– one SL, many TLs
– controlled style
– single topic
– full analysis
– no post-editing
– user is author

8/20 Restricted input
Restrictions may be natural (sublanguage) or imposed (controlled language)
Related terms: special language, jargon, register, LSP
For human: (usually) more readable, less ambiguous, more “focussed”
For MT:
– fewer syntactic constructions
– closed vocabulary with fewer homonyms
– greater certainty about interpretation

9/20 Features of sublanguage
Lexicon
– smaller size: fewer concepts to cover
– finite/closed: innovation is controlled
– nature: less homonymy, some synonyms (dis)favoured
– grammatical use: fewer category ambiguities
Syntax
– reduced range of structures
– some structures (dis)favoured
– less flexibility in choice of structure
– some deviance from “standard” grammar

10/20 Controlled languages
Widely used in technical authoring
Promotes consistency and readability
Similar features to sublanguage
Can be coupled with grammar checker
Permits “multilingual authoring”
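Coupling a controlled language with a checker might look roughly like the sketch below. The two rules (a word limit and an approved vocabulary), the limit of 10 words and the word list are all hypothetical, chosen only to show the idea; real controlled languages define their own rule sets.

```python
import re

# Hypothetical controlled-language rules, invented for this sketch.
MAX_WORDS = 10
APPROVED = {"remove", "the", "cover", "and", "press", "red", "button", "then", "switch"}


def check_sentence(sentence, max_words=MAX_WORDS, approved=APPROVED):
    """Return a list of rule violations for one sentence (empty if compliant)."""
    words = re.findall(r"[a-z]+", sentence.lower())
    problems = []
    if len(words) > max_words:
        problems.append(f"too long: {len(words)} words (limit {max_words})")
    unknown = sorted(set(words) - approved)
    if unknown:
        problems.append("unapproved words: " + ", ".join(unknown))
    return problems


print(check_sentence("Remove the cover and press the red button."))
print(check_sentence("Detach the cover."))
```

A checker like this gives the author feedback at writing time, so the text reaching the MT system already has a closed vocabulary and short sentences.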

11/20 Use of low-quality output
To get a rough idea of content, and to identify which parts need to be translated “properly” … especially with “exotic” languages
Widely used on the Internet for browsing, chat-rooms and
Despite low quality, users seem satisfied
Task is especially difficult due to odd grammar, spelling, punctuation (GIGO), and wide variety of subject matter, often mixed
Most MT systems now customized for web-page translation (take HTML mark-up into account)

12/20 Interactive translation
Tools for translators
“Translator’s workstation”
Humans and computers cooperate
Which takes the initiative?
– MAHT (machine-aided human translation): human translation using translation tools
– HAMT (human-aided machine translation): MT with human assistance

13/20 Machine-readable version of dictionary for human users

14/20 Pre-translation: terminology look-up

15/20 Translation Memory
Database of previous translations
More or less sophisticated matching algorithm (“fuzzy match”, simple pattern-matching which may incorporate “linguistic knowledge”)
But user must decide what to do with the matches
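A very simple fuzzy match can be sketched with Python's standard `difflib.SequenceMatcher`, which scores character-level similarity between two strings. This is only one possible matching method, with no linguistic knowledge at all; the memory contents, the `fuzzy_matches` function and the 0.7 threshold are assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical translation memory: (source segment, stored translation).
MEMORY = [
    ("Press the red button.", "Appuyez sur le bouton rouge."),
    ("Close the cover.", "Fermez le couvercle."),
]


def fuzzy_matches(segment, memory=MEMORY, threshold=0.7):
    """Return (score, source, translation) for entries at or above the threshold, best first."""
    scored = [
        (SequenceMatcher(None, segment.lower(), src.lower()).ratio(), src, tgt)
        for src, tgt in memory
    ]
    return sorted((m for m in scored if m[0] >= threshold), reverse=True)


for score, src, tgt in fuzzy_matches("Press the green button."):
    print(f"{score:.0%}  {src}  ->  {tgt}")
```

The function only retrieves and ranks candidates. The stored translation still says “rouge”, so the translator has to decide whether to adapt it or discard it.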

16/20 MT system’s dictionary

17/20 Bilingual concordance
Source: TransSearch, Laboratoire de Recherche Appliquée en Linguistique Informatique, Université de Montréal
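The idea behind a bilingual concordance can be sketched as a search over sentence-aligned text that returns each hit together with its translation. The sketch below is not how TransSearch is implemented; the four-sentence corpus and the `concordance` function are invented for illustration.

```python
import re

# Hypothetical sentence-aligned parallel corpus (English, French).
CORPUS = [
    ("The committee adopted the report.", "Le comité a adopté le rapport."),
    ("The report was published in May.", "Le rapport a été publié en mai."),
    ("The minister will report tomorrow.", "Le ministre fera rapport demain."),
    ("The vote was postponed.", "Le vote a été reporté."),
]


def concordance(term, corpus=CORPUS):
    """Return every aligned pair whose source side contains the term as a whole word."""
    pattern = re.compile(rf"\b{re.escape(term)}\b", re.IGNORECASE)
    return [(src, tgt) for src, tgt in corpus if pattern.search(src)]


for src, tgt in concordance("report"):
    print(src, "|", tgt)
```

Seeing the term in several aligned contexts lets the translator compare how it was rendered before choosing a translation.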

18/20 Parallel scrolling screens

19/20 Interactive translation

20/20 Conclusion
Translation is really hard, but lay-people don’t understand this
– Example: evaluating systems by use of round-trip translation, often of idioms, jokes, or set phrases
Current MT systems are quite crude, and likely to remain so
But useful nevertheless in appropriate scenarios, under certain conditions of use