12/18/20141 Machine Translation ICS 482 Natural Language Processing Lecture 29-2: Machine Translation Husni Al-Muhtaseb.

Slides:



Advertisements
Similar presentations
Artificial Intelligence: Natural Language and Prolog
Advertisements

Advanced Piloting Cruise Plot.
Kapitel 6. Copyright © Houghton Mifflin Company. All rights reserved.6 | 2 1. Hin and her.
Copyright © 2003 Pearson Education, Inc. Slide 1 Computer Systems Organization & Architecture Chapters 8-12 John D. Carpinelli.
Chapter 1 The Study of Body Function Image PowerPoint
Special Topics in Computer Science Advanced Topics in Information Retrieval Lecture 10: Natural Language Processing and IR. Syntax and structural disambiguation.
OLIF V2 Gr. Thurmair April OLIF April 2000 OLIF: Overview Rationale Principles Entries Descriptions Header Examples Status.
SWG Strategy (C) Copyright IBM Corp. 2006, All Rights Reserved. v1 ACITA 2011 demonstration of ongoing NLP work Dave Braines, David Mott, ETS, Hursley,
© 2010 Pearson Addison-Wesley. All rights reserved. Addison Wesley is an imprint of Chapter 11: Structure and Union Types Problem Solving & Program Design.
Electronic Resources in the EUI Library
Multilinguality & Semantic Search Eelco Mossel (University of Hamburg) Review Meeting, January 2008, Zürich.
Jeopardy Q 1 Q 6 Q 11 Q 16 Q 21 Q 2 Q 7 Q 12 Q 17 Q 22 Q 3 Q 8 Q 13
Jeopardy Q 1 Q 6 Q 11 Q 16 Q 21 Q 2 Q 7 Q 12 Q 17 Q 22 Q 3 Q 8 Q 13
Determine Eligibility Chapter 4. Determine Eligibility 4-2 Objectives Search for Customer on database Enter application signed date and eligibility determination.
FACTORING ax2 + bx + c Think “unfoil” Work down, Show all steps.
Year 6 mental test 5 second questions
Information Systems Today: Managing in the Digital World
ABC Technology Project
VOORBLAD.
1 Breadth First Search s s Undiscovered Discovered Finished Queue: s Top of queue 2 1 Shortest path from s.
Machine Translation II How MT works Modes of use.
© 2012 National Heart Foundation of Australia. Slide 2.
25 seconds left…...
1 Minimally Supervised Morphological Analysis by Multimodal Alignment David Yarowsky and Richard Wicentowski.
Chapter 10: The Traditional Approach to Design
Systems Analysis and Design in a Changing World, Fifth Edition
Parsing for XML Developers Roger L. Costello 28 September 2014.
We will resume in: 25 Minutes.
©Brooks/Cole, 2001 Chapter 12 Derived Types-- Enumerated, Structure and Union.
PSSA Preparation.
©Ian Sommerville 2004Software Engineering, 7th edition. Chapter 13 Slide 1 Application architectures.
CpSc 3220 Designing a Database
1 Programming Languages (CS 550) Mini Language Interpreter Jeremy R. Johnson.
4/14/20151 REGULAR EXPRESSIONS AND AUTOMATA Lecture 3: REGULAR EXPRESSIONS AND AUTOMATA Husni Al-Muhtaseb.
5/12/20151 N-Gram: Part 1 ICS 482 Natural Language Processing Lecture 7: N-Gram: Part 1 Husni Al-Muhtaseb.
1 Lexicalized and Probabilistic Parsing – Part 1 ICS 482 Natural Language Processing Lecture 14: Lexicalized and Probabilistic Parsing – Part 1 Husni Al-Muhtaseb.
6/11/20151 Parsing Context-free grammars – Part 1 ICS 482 Natural Language Processing Lecture 12: Parsing Context-free grammars – Part 1 Husni Al-Muhtaseb.
NLP Credits and Acknowledgment These slides were adapted from presentations of the Authors of the book SPEECH and LANGUAGE PROCESSING: An Introduction.
Semi-Automatic Learning of Transfer Rules for Machine Translation of Low-Density Languages Katharina Probst April 5, 2002.
Creation of a Russian-English Translation Program Karen Shiells.
Machine Translation History of Machine Translation Difficulties in Machine Translation Structure of Machine Translation System Research methods for Machine.
1 Natural Language Processing INTRODUCTION Husni Al-Muhtaseb Tuesday, February 20, 2007.
March 1, 2009 Dr. Muhammed Al-Mulhem 1 ICS 482 Natural Language Processing INTRODUCTION Muhammed Al-Mulhem March 1, 2009.
Semantics: Representing Meaning ICS 482 Natural Language Processing
8/19/20151 بسم الله الرحمن الرحيم ICS 482 Natural Language Processing Lecture 24: Project Ideas + Students Presentations Husni Al-Muhtaseb.
Lecture 1, 7/21/2005Natural Language Processing1 CS60057 Speech &Natural Language Processing Autumn 2005 Lecture 1 21 July 2005.
1 N-Gram – Part 2 ICS 482 Natural Language Processing Lecture 8: N-Gram – Part 2 Husni Al-Muhtaseb.
9/11/ Morphology and Finite-state Transducers: Part 1 ICS 482: Natural Language Processing Lecture 5 Husni Al-Muhtaseb.
Lecture 12: 22/6/1435 Natural language processing Lecturer/ Kawther Abas 363CS – Artificial Intelligence.
10/13/20151 Representing Meaning Part 4 & Semantic Analysis ICS 482 Natural Language Processing Lecture 21: Representing Meaning Part 4 & Semantic Analysis.
Approaches to Machine Translation CSC 5930 Machine Translation Fall 2012 Dr. Tom Way.
5/31/20161 Lexical Semantic + Students Presentations ICS 482 Natural Language Processing Lecture 26: Lexical Semantic + Students Presentations Husni Al-Muhtaseb.
October 2005CSA3180 NLP1 CSA3180 Natural Language Processing Introduction and Course Overview.
ICS 482: Natural language Processing Pre-introduction
1 Parts of Speech Part 1 ICS 482 Natural Language Processing Lecture 9: Parts of Speech Part 1 Husni Al-Muhtaseb.
29 November Parsing Context-free grammars Part 2 ICS 482 Natural Language Processing Lecture 13: Parsing Context-free grammars Part 2 Husni Al-Muhtaseb.
12/22/20151 Representing Meaning Part 2 ICS 482 Natural Language Processing Lecture 18: Representing Meaning Part 2 Husni Al-Muhtaseb.
1/31/20161 Lexical Semantic 2 ICS 482 Natural Language Processing Lecture 29-1: Lexical Semantic 2 Husni Al-Muhtaseb.
3/7/20161 Representing Meaning Part 3 ICS 482 Natural Language Processing Lecture 20: Representing Meaning Part 3 Husni Al-Muhtaseb.
A Simple English-to-Punjabi Translation System By : Shailendra Singh.
Approaches to Machine Translation
Kenneth Baclawski et. al. PSB /11/7 Sa-Im Shin
Approaches to Machine Translation
Information Extraction ICS 482 Natural Language Processing
Lecture 27: Husni Al-Muhtaseb
Context-free grammars ICS 482 Natural Language Processing
Presentation transcript:

12/18/20141 Machine Translation ICS 482 Natural Language Processing Lecture 29-2: Machine Translation Husni Al-Muhtaseb

12/18/20142 بسم الله الرحمن الرحيم ICS 482 Natural Language Processing Lecture 29-2: Machine Translation Husni Al-Muhtaseb

NLP Credits and Acknowledgment These slides were adapted from presentations of the Authors of the book SPEECH and LANGUAGE PROCESSING: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition and some modifications from presentations found in the WEB by several scholars including the following

NLP Credits and Acknowledgment If your name is missing please contact me muhtaseb At Kfupm. Edu. sa

NLP Credits and Acknowledgment Husni Al-Muhtaseb James Martin Jim Martin Dan Jurafsky Sandiway Fong Song young in Paula Matuszek Mary-Angela Papalaskari Dick Crouch Tracy Kin L. Venkata Subramaniam Martin Volk Bruce R. Maxim Jan Hajič Srinath Srinivasa Simeon Ntafos Paolo Pirjanian Ricardo Vilalta Tom Lenaerts Heshaam Feili Björn Gambäck Christian Korthals Thomas G. Dietterich Devika Subramanian Duminda Wijesekera Lee McCluskey David J. Kriegman Kathleen McKeown Michael J. Ciaraldi David Finkel Min-Yen Kan Andreas Geyer-Schulz Franz J. Kurfess Tim Finin Nadjet Bouayad Kathy McCoy Hans Uszkoreit Azadeh Maghsoodi Khurshid Ahmad Staffan Larsson Robert Wilensky Feiyu Xu Jakub Piskorski Rohini Srihari Mark Sanderson Andrew Elks Marc Davis Ray Larson Jimmy Lin Marti Hearst Andrew McCallum Nick Kushmerick Mark Craven Chia-Hui Chang Diana Maynard James Allan Martha Palmer julia hirschberg Elaine Rich Christof Monz Bonnie J. Dorr Nizar Habash Massimo Poesio David Goss-Grubbs Thomas K Harris John Hutchins Alexandros Potamianos Mike Rosner Latifa Al-Sulaiti Giorgio Satta Jerry R. Hobbs Christopher Manning Hinrich Schütze Alexander Gelbukh Gina-Anne Levow Guitao Gao Qing Ma Zeynep Altan

Thursday, December 18, Today's Lecture  Machine Translation (MT) Structure of Machine Translation System A simple English to Arabic Machine Translation

Thursday, December 18, Structure of MT Systems  Generally they all have lexical, morphological, syntactic and semantic components, one for each of the two languages, for treating basic words, complex words, sentences and meanings

Thursday, December 18, Structure of MT Systems (cont.)  “transfer” component: the only one that is specialized for a particular pair of languages, which converts the most abstract source representation that can be achieved into a corresponding abstract target representation

Thursday, December 18, Structure of MT Systems (cont.)  Some systems make use of a so-called “interlingua” or intermediate language The transfer stage is divided into two steps, one translating a source sentence into the interlingua and the other translating the result of this into an abstract representation in the target language

Thursday, December 18, Machine Translation Morphological analysis Syntactic analysis Semantic Interpretation Interlingua inputanalysisgeneration Morphological synthesis Syntactic realization Lexical selection output

Thursday, December 18, Typical NLP System  NL Data-Base Query: Parsing = Question  SQL query Inference/retrieval = DBMS: SQL  table of records Generation = no-operation (just print the retrieved records)  Machine Translation Parsing = Source Language text  Representation Inference/retrieval = no-operation Generation = Representation  Target language Natural Language input parsinggeneration Internal representation Natural Language output Inference/retrieval

Thursday, December 18, Types of Machine Translation Interlingua Syntactic Parsing Semantic Analysis Sentence Planning Text Generation Source (Arabic) Target (English) Transfer Rules Direct: Statistical MT, Example-Based MT

Thursday, December 18, Transfer Grammars L1L1L2L2L3L3L4L4L1L1L2L2L3L3L4L4

Thursday, December 18, Interlingua Paradigm for MT L 1 L 2 L 3 L 4 Semantic Representation “interlingua”

Thursday, December 18, Interlingua-Based MT  Requires an Interlingua - language-neutral Knowledge Representation (KR) Philosophical debate: Is there an interlingua? FOL is not totally language neutral (predicates, functions, expressed in a language) Other near-interlinguas (Conceptual Dependency)  Requires a fully-disambiguating parser Domain model of legal objects, actions, relations  Requires a NL generator (KR  text)  Applicable only to well-defined technical domains  Produces high-quality MT in those domains

Thursday, December 18, Example-Based MT (EMBT)  Can we use previously translated text to learn how to translate new texts? Yes! But, it’s not so easy Two paradigms, statistical MT, and EBMT  Requirements: Aligned large parallel corpus of translated sentences {S source  S target } Bilingual dictionary for intra-S alignment Generalization patterns (names, numbers, dates…)

Thursday, December 18, EBMT Approaches  Simplest: Translation Memory If S new = S source in corpus, output aligned S target  Compositional EBMT If fragment of S new matches fragment of S s, output corresponding fragment of aligned S t Prefer maximal-length fragments Maximize grammatical compositionality  Via a target language grammar,  Or, via an N-gram statistical language model

Thursday, December 18, Multi-Engine Machine Translation  MT Systems have different strengths Rapidly adaptable: Statistical, example-based Good grammar: Rule-Based (linguistic) MT High precision in narrow domains: INTERLINGUA  Combine results of parallel-invoked MT Select best of multiple translations

Thursday, December 18, Our Approach: Structure of Translator  Lexical Module  Syntax Module  Transformation Module

Thursday, December 18, Lexical Module  Pre Processor Detect Proper Nouns Convert short forms (don’t  do not) Detect abbreviations like etc., mr.  Tokenizer Search Database of words and proper nouns and generate all possible interpretations of a word.

Thursday, December 18, Structure of Lexicon  Word  Category Noun, Pronoun, …  Subcategory Auxiliary Verb, Possessive Pronoun, ToPreposition, …  Sense Human, Animate, Unanimate

Thursday, December 18, Structure of Lexicon - Contd.  Form Base, First,Second, … (for Verb Form); First, Second,Third (for Person); Comparative, Superlative, … for Adjectives  Number Singular, Plural  Gender Masculine, Feminine  Object Preposition & Subject Preposition

Thursday, December 18, Structure of Lexicon - Contd.  Object Count Number of objects required with the verb  Arabic Meaning  Meaning for different forms Meaning of Adjective and Noun for different forms of Gender and Number

Thursday, December 18, English to Arabic Machine Translation  Salma came  Lexicon Salma: مفرد،.. سلمى، اسم علم، مؤنث، Came: جاء، فعل، ماض، متعادل...  Word to word: سلمى جاء  Needed Translation: جاءت سلمى  Modification Rules Exchange the positions of subject and verb If the gender is feminine the verb should be the same

Thursday, December 18, A second Example  The students are active  Lexicon The: ال Students: طلاب، اسم جنس، جمع، متعادل،.. Are: فعل، مضارع، يكون، جمع، متعادل.. Active: صفة، نشيط، متعادل،..  Word to Word: ال طلاب يكون نشيط  Needed Translation: الطلاب نشيطون  Modification Rules Insert ال with its successor Omit يكون Change نشيط to proper number (plural) and proper gender (masculine)  What about: Needed Translation: الطالبات نشيطات

Thursday, December 18, More Examples  Lena had recently added a home-theater sound system to the TV لينا قام مؤخرا اضاف منزل - مسرح صوت نظام الى التلفاز قامت لينا مؤخرا بإضافة نظام صوت مسرح - منزلي الى التلفاز.  The fans in the stand were screaming ال مشجعون في ال منصة كانوا صراخ المشجعون في المنصة كانوا يصرخون. كان المشجعون في المنصة يصرخون.

Thursday, December 18, Final Exam - Related NNLP Repeated Concepts Things you should know by now LLectures 12 – Today’s Lecture Related Material from the book FFrom Chapters 10, 12, 14, 15, 16, 21 TTake Home Quiz & Related Material SStudent Presentations Main Concepts Student Questions Your presentation YYour team project NNo Final Exam Sample

Thursday, December 18, Thank you أسأل الله أن يعيننا وإياكم وأن يوفق الجميع إلى كل خير سبحانك اللهم وبحمدك، أشهد أن لا إله إلا أنت، أستغفرك وأتوب إليك السلام عليكم ورحمة الله

Thursday, December 18, Thank you  السلام عليكم ورحمة الله