Agreement matters: Challenges of translating into an MRL. Yoav Goldberg, Ben Gurion University.

Presentation transcript:

Agreement matters: Challenges of translating into an MRL. Yoav Goldberg, Ben Gurion University

Why should we care about syntactic modeling of MRLs (morphologically rich languages)?

A brief summary

What Kevin said.

Example: English → Hebrew

I wash the car → אני רוחץ את המכונית ("I wash the car", masculine verb)
I wash the floor → אני שוטפת את הרצפה ("I wash the floor", feminine verb)

Hebrew Fact 1: Hebrew verbs are morphologically marked for gender (and number, and person, and tense…).

Are these bad translations? No. These are actually quite good: there is no gender information in the source, but the target must indicate gender, so the translator uses world knowledge.

Let’s have some fun

Language Models as Social Indicators

I love her → אני אוהב אותה
I love him → אני אוהבת אותו
I love meat → אני אוהב בשר
I love vegetables → אני אוהבת ירקות
I love to eat → אני אוהב לאכול
I love to cook → אני אוהבת לבשל
I love hash → אני אוהב חשיש
I love marijuana → אני אוהבת מריחואנה

I hate him. → אני שונאת אותו.
I hate her. → אני שונאת אותה.
I hate him → אני שונא אותו
I hate her → אני שונאת אותה

Really? A dot?! Not very stable…
Hmm… is there a message here after all?

I love → אני אוהב
I hate → אני שונאת

Back to Machine Translation

One English → Many Hebrew

love → אוהב, אוהבת, אוהבים, אוהבות

Need to acquire more knowledge:
… use larger parallel corpora
… use dictionaries
… use FSAs to model the inflections

Let's assume this is solved.
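As a rough sketch of what "many possible word forms" means in practice, the toy Python lexicon below maps one English lemma to its Hebrew present-tense forms keyed by gender and number. The data structure and the hebrew_forms helper are invented purely for illustration, not taken from any actual system.

```python
# A toy inflection lexicon for the Hebrew verb "love" (present tense).
# Only an illustration of the one-to-many mapping; a real system would derive
# such forms from dictionaries, parallel data, or an FST over the inflections.
INFLECTIONS = {
    ("love", "masc", "sing"): "אוהב",
    ("love", "fem",  "sing"): "אוהבת",
    ("love", "masc", "plur"): "אוהבים",
    ("love", "fem",  "plur"): "אוהבות",
}

def hebrew_forms(english_lemma):
    """All Hebrew surface forms a single English word may map to."""
    return {form for (lemma, _, _), form in INFLECTIONS.items() if lemma == english_lemma}

print(hebrew_forms("love"))   # four candidate forms for the one English word
```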

One English → Many Hebrew

Hebrew → English: easy
אני אוהבת → I love
אני אוהב → I love
אנו אוהבים → We love
אנו אוהבות → We love
Don't worry about it; just say "love", and the reader will decide.

English → Hebrew: hard
I love → ?
אני אוהב?
אני אוהבת?
אני אוהבים?
אני אוהבות?
Which form to choose? The translator must decide. How??

When translating into an MRL:
– Many possible word forms (hard to acquire, but assume it's solved)
– Need to choose the correct inflection

Which form to choose? The translator must decide. How?? Just choose one at random? In the worst case we'll insult someone…

Hebrew Fact 2: Hebrew verbs agree with their subject on gender and number.

Agreement dictates form

I love → ?
אני אוהב (singular) ✔
אני אוהבת (singular) ✔
אני אוהבים (plural) ✘
אני אוהבות (plural) ✘

"I" is singular, so only the singular forms agree; the gender is still unknown.

The girls love → ?
The subject ("the girls", הבחורות) is plural / feminine, so no information is missing:
הבחורות אוהב (sing / masc) ✘
הבחורות אוהבת (sing / fem) ✘
הבחורות אוהבים (plural / masc) ✘
הבחורות אוהבות (plural / fem) ✔
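A minimal sketch of how agreement prunes the candidate forms: the subject's features, which come from outside the verb itself, filter the inflection table. The feature names and the compatible helper are assumptions made up for this example.

```python
# Minimal sketch: keep only the verb forms whose gender/number agree with the subject.
VERB_FORMS = {
    "אוהב":  {"gender": "masc", "number": "sing"},
    "אוהבת": {"gender": "fem",  "number": "sing"},
    "אוהבים": {"gender": "masc", "number": "plur"},
    "אוהבות": {"gender": "fem",  "number": "plur"},
}

def compatible(subject_feats, verb_feats):
    # A feature left unspecified on the subject (e.g. the gender of English "I")
    # constrains nothing; a specified feature must match exactly.
    return all(subject_feats.get(k) in (None, v) for k, v in verb_feats.items())

def candidate_verbs(subject_feats):
    return [v for v, f in VERB_FORMS.items() if compatible(subject_feats, f)]

print(candidate_verbs({"number": "sing"}))                   # "I love": both singular forms remain
print(candidate_verbs({"number": "plur", "gender": "fem"}))  # "the girls love": only אוהבות
```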

When translating into an MRL:
– Many possible word forms (hard to acquire, but assume it's solved)
– Need to choose the correct inflection
– The inflection is determined by information which is external to the word

Back to Google Translate

The boy washes the car → הנער רוחץ את המכונית ✔
The girl washes the car → הילדה רוחצת את המכונית ✔

Good job, Franz! (Well, הנער is more "the young man" and הילדה more "the child", but the agreement is right.)

Good job, Franz?

The boy with the sunglasses washes the floor → הנער עם משקפי השמש שוטפת את הרצפה ✘ (feminine verb with a masculine subject)
The girl with the sunglasses washes the car → הבחורה עם משקפי השמש שוטף את המכונית ✘ (masculine verb with a feminine subject; הבחורה ≈ "the chick")

What happened? Long-distance agreement:
– Can't be represented in the phrase table
– Can't be represented in an n-gram LM
→ Only local "semantic" information from the LM / phrase table
→ Bad (ungrammatical) translation

It's not Franz's fault, but the system's.
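To make the n-gram limitation concrete, here is a tiny sketch, assuming a plain trigram LM over surface words: the verb's trigram context in the problematic sentence never contains the subject noun, so the LM has nothing to enforce agreement with.

```python
# Sketch: the trigram context of the verb never contains the subject noun,
# so an n-gram LM cannot enforce subject-verb agreement in this sentence.
sentence = "The boy with the sunglasses washes the floor".split()

verb_index = sentence.index("washes")
trigram_context = sentence[verb_index - 2:verb_index]   # what a trigram LM conditions on
print(trigram_context)               # ['the', 'sunglasses'] -- no trace of "boy"

subject_index = sentence.index("boy")
print(verb_index - subject_index)    # 4 words apart: outside any small n-gram window
```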

When translating into an MRL:
– Many possible word forms (hard to acquire, but I assume it's solved)
– Need to choose the correct inflection
– The inflection is determined by information which is external to the word, and frequently far away from it

[Table: distance from the verb to its subject in the Hebrew Dependency Treebank (news domain), by S-V dependency length, count, and percent]

Two words apart is already tough to estimate reliably in an n-gram-based system!

→ A phrase-based system with an n-gram LM can't do it.
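For illustration, a sketch of how such a distance statistic could be computed from a dependency-parsed corpus; the subj_arcs sample below is invented, not the treebank data from the table.

```python
from collections import Counter

# Sketch of computing the subject-verb distance distribution from a dependency
# treebank. Each "subj" arc is assumed to give us the token indices of the verb
# and of its subject; the tiny sample below is purely illustrative.
subj_arcs = [
    (5, 1),   # e.g. "The boy with the sunglasses washes ...": verb at 5, subject at 1
    (1, 0),   # "He left ..."
    (2, 0),   # subject two words before the verb
    (4, 0),
]

distances = Counter(abs(verb_idx - subj_idx) for verb_idx, subj_idx in subj_arcs)
total = sum(distances.values())
for length in sorted(distances):
    print(f"S-V distance {length}: {distances[length]} ({100 * distances[length] / total:.0f}%)")
```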

What if both languages are MRLs?
– Gender/number is marked on both sides
– No need for word-external information
– We can translate word-for-word again
– MRL → MRL is easy!

Wrong!

What if both languages are MRLs?
– Gender/number is marked on both sides
– But: agreement patterns differ between languages, and gender information differs between languages

What if both languages are MRLs?

Example: Spanish and Hebrew both have adjective-noun agreement.
– new shirt: חולצה חדשה / nueva camisa (feminine in both)
– new car: מכונית חדשה / nuevo automóvil (feminine in Hebrew, masculine in Spanish)
– new computer: מחשב חדש / nueva computadora (masculine in Hebrew, feminine in Spanish)

So חדש / חדשה and nuevo / nueva are in a many-to-many mapping:
– the correct form still depends on external information
– more chances for error
– acquiring all the pairs from parallel corpora is harder
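The many-to-many mapping can be made concrete with a toy sketch: gender is assigned per noun, the assignments disagree across the two languages, and the resulting adjective pairs are not one-to-one. The word lists below are illustrative only.

```python
# Sketch of the many-to-many mapping between Hebrew and Spanish forms of "new".
HEBREW_GENDER  = {"shirt": "fem", "car": "fem", "computer": "masc"}
SPANISH_GENDER = {"shirt": "fem", "car": "masc", "computer": "fem"}

HEBREW_NEW  = {"masc": "חדש",  "fem": "חדשה"}
SPANISH_NEW = {"masc": "nuevo", "fem": "nueva"}

# Pair the adjective form each language would use for the same noun.
pairs = {(HEBREW_NEW[HEBREW_GENDER[n]], SPANISH_NEW[SPANISH_GENDER[n]]) for n in HEBREW_GENDER}
print(pairs)   # {('חדשה', 'nueva'), ('חדשה', 'nuevo'), ('חדש', 'nueva')} -- not one-to-one
```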

Must consider (at least) syntax. Phrase tables and n-grams still can't do it.

When translating into an MRL, MT systems:
– must be aware of gender/number
– should have a notion of agreement
– should use syntax to enforce agreement

Source-side Syntax

[Dependency sketch of "The boy with the sunglasses washes the floor": "washes" has a subj arc to "boy" and an obj arc to "floor".]

boy → נער / ילד / בחור (all masc/sing)
washes → שוטף (masc/sing), שוטפת (fem/sing), שוטפים (masc/plural), שוטפות (fem/plural)

The chosen subject and verb forms must agree!

Problems:
– How do we obtain the gender/number information?
– How do we decode efficiently?
– Agreement behavior is not always this simple.
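A minimal sketch of the source-side idea under strong simplifying assumptions: the subject translation has already been chosen, its gender/number comes from a hand-written table, and the English parse tells us which noun is the subject of "washes". Only agreeing Hebrew verb forms survive. All names below are invented for the example.

```python
# Minimal sketch of the source-side-syntax idea: project the subject's features
# (from the chosen Hebrew translation of the English subject) onto the verb.
NOUN_FEATURES = {"נער": ("masc", "sing"), "ילד": ("masc", "sing"), "בחורה": ("fem", "sing")}
VERB_FORMS    = {("masc", "sing"): "שוטף",   ("fem", "sing"): "שוטפת",
                 ("masc", "plur"): "שוטפים", ("fem", "plur"): "שוטפות"}

def inflect_verb(subject_translation):
    """Choose the verb form that agrees with the already-chosen subject translation."""
    return VERB_FORMS[NOUN_FEATURES[subject_translation]]

# "The boy with the sunglasses washes the floor":
# the parse says the subject of "washes" is "boy", translated here as נער.
print(inflect_verb("נער"))    # שוטף (masc/sing), not שוטפת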

Target-side Syntax

xLNT transducers can model agreement:

NN_masc/sing(ילד) → boy
VB_masc/sing(שוטף) → washes
VB_fem/sing(שוטפת) → washes
NP_masc/sing(x0:DT x1:NN_masc/sing x2:PP) → x0 x1 x2
VP_masc/sing(x0:VB_masc/sing x1:NP) → x0 x1
S_masc/sing(x0:NP_masc/sing x1:VP_masc/sing) → x0 x1

Gender and number information is encoded in the lexical rules; agreement information is encoded in the grammar.

Problems:
– How do we obtain the gender/number information?
– The grammar is going to be huge (can we make it smaller?)
– How are we going to obtain the grammar? Efficiently encoding morphological processes in a treebank grammar is an open research question.
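As a toy rendering of the same idea (not the actual xLNT machinery), the sketch below attaches a (gender, number) signature to every nonterminal and licenses a rule application only when the children's signatures match what the rule requires; everything here is an invented simplification.

```python
# Toy feature-annotated rules: S_m/s -> NP_m/s VP_m/s ; VP_m/s -> VB_m/s NP.
# None means "this rule does not constrain that feature".
RULES = [
    (("S",  "masc", "sing"), [("NP", "masc", "sing"), ("VP", "masc", "sing")]),
    (("VP", "masc", "sing"), [("VB", "masc", "sing"), ("NP", None, None)]),
]

def can_build(parent, children):
    """Does some rule build `parent` from exactly these child signatures?"""
    for head, required in RULES:
        if head != parent or len(required) != len(children):
            continue
        ok = all(r[0] == c[0] and all(rf in (None, cf) for rf, cf in zip(r[1:], c[1:]))
                 for r, c in zip(required, children))
        if ok:
            return True
    return False

# The agreeing derivation is licensed, the non-agreeing one is not:
print(can_build(("S", "masc", "sing"), [("NP", "masc", "sing"), ("VP", "masc", "sing")]))  # True
print(can_build(("S", "masc", "sing"), [("NP", "masc", "sing"), ("VP", "fem", "sing")]))   # False
```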

On the Parsing Side of Things

Most work on parsing MRLs:
– considers morphology to be a lexicon-level issue (many inflections → high OOV rate)
– ignores morphology at the syntax level
– PCFG-LA works frustratingly well

Recently:
– smarter modeling of morphology at the syntax level
– using morphological agreement to improve parsing
→ modest benefits to parsing accuracy
→ PCFG-LA is still better :(

And PCFG-LA does not model agreement at all.

Rebels without a cause?

Syntax-based MT:
– Neat!
– Only marginally better than phrase-based

Syntax-level morphology in parsing:
– Neat!
– Only marginally better than ignoring it

Why only marginal? English grammaticality is relatively easy to capture using local information; there are not many agreement mistakes to begin with, and agreement is a generation issue more than a parsing one. (Or maybe I should just work harder.)

But both are crucial for translating into MRLs!

To Conclude

Translating into MRLs brings new challenges, and syntax is crucial:
– If you are not looking into syntax, you should!
– If you are looking into syntax, look deeper!

There is plenty of interesting work to be done.
(I'm finishing up my PhD on parsing, and looking for a postdoc position for next year.)