CMSC 723 / LING 645: Intro to Computational Linguistics September 8, 2004: Dorr MT (continued), MT Evaluation Prof. Bonnie J. Dorr Dr. Christof Monz TA:

1 CMSC 723 / LING 645: Intro to Computational Linguistics September 8, 2004: Dorr MT (continued), MT Evaluation Prof. Bonnie J. Dorr Dr. Christof Monz TA: Adam Lee

2 MT Challenges: Ambiguity
• Syntactic Ambiguity: I saw the man on the hill with the telescope
• Lexical Ambiguity: E: book; S: libro, reservar
• Semantic Ambiguity
  – Homography: ball (E) = pelota, baile (S)
  – Polysemy: kill (E) = matar, acabar (S)
  – Semantic granularity: esperar (S) = wait, expect, hope (E); be (E) = ser, estar (S); fish (E) = pez, pescado (S)

3 MT Challenges: Divergences
• The meaning of two translationally equivalent phrases is distributed differently in the two languages
• Example:
  – English: [RUN INTO ROOM]
  – Spanish: [ENTER IN ROOM RUNNING]

4 Divergence Frequency
• 32% of sentences in UN Spanish/English Corpus (5K)
• 35% of sentences in TREC El Norte Corpus (19K)
• Divergence Types
  – Categorial (X tener hambre ↔ X have hunger) [98%]
  – Conflational (X dar puñaladas a Z ↔ X stab Z) [83%]
  – Structural (X entrar en Y ↔ X enter Y) [35%]
  – Head Swapping (X cruzar Y nadando ↔ X swim across Y) [8%]
  – Thematic (X gustar a Y ↔ Y like X) [6%]

5 Spanish/Arabic Divergences
Divergence | E / E’ (Spanish) | E / E’ (Arabic)
Categorial | be jealous / have jealousy [tener celos] | when he returns / upon his return [ﻋﻧﺩ ﺮﺠﻭﻋﻪ]
Conflational | float / go floating [ir flotando] | come again / return [ﻋﺎﺪ]
Structural | enter the house / enter in the house [entrar en la casa] | seek / search for [ﺒﺣﺙ ﻋﻦ]
Head Swap | run in / enter running [entrar corriendo] | do something quickly / go-quickly in doing something [ﺍﺴﺭﻉ]
Thematic | I have a headache / my-head hurts me [me duele la cabeza] | (none)
Rule: [Arg1 [V]] → [Arg1 [MotionV] Modifier(v)]
Example: “The boat floated” → “The boat went floating”

6 Automatic Divergence Detection (using narrowly defined divergence detection rules)
Language | Detected | Human Confirmed | Sample Size | Corpus Size
Spanish – Total | 11.1% | 10.5% | 19K | 150K
Arabic – Total | 31.9% | 12.5% | 1K | 28K

7 Application of Divergence Detection: Bilingual Alignment for MT
• Word-level alignments of bilingual texts are an integral part of MT models
• Divergences present a great challenge to the alignment task
• Common divergence types can be found in multiple language pairs, systematically identified, and resolved

8 The Problem: Alignment & Projection
E: I began to eat the fish
S: Yo empecé a comer el pescado

9 Why is this a hard problem?
E: I run into the room
S: Yo entro en el cuarto corriendo

10 Divergences! English: [RUN INTO ROOM] Spanish: [ENTER IN ROOM RUNNING]

11 Our Goal: Improved Alignment & Projection
• Induce higher inter-annotator agreement rate
• Increase the number of aligned words
• Decrease multiple alignments

12 DUSTer Approach: Divergence Unraveling
E: I run into the room
E’: I move-in running the room
S: Yo entro en el cuarto corriendo

13 Word-Level Alignment (1): Test Setup
Ex: John ran into the room → John entered the room running
• Divergence Detection: Categorize English sentences into one of 5 divergence types
• Divergence Correction: Apply the appropriate structural transformation [E → E’]

14 Word-Level Alignment (2): Testing Impact of Divergence Correction
• Humans align the English and foreign sentences
• Compare inter-annotator agreement, unaligned units, multiple alignments

15 Word-Level Alignment Results
• Inter-Annotator Agreement:
  – English-Spanish: agreement increased from 80.2% to 82.9%
  – English-Arabic: agreement increased from 69.7% to 75.1%
• Number of aligned words:
  – English-Spanish: aligned words increased from 82.8% to 86%
  – English-Arabic: aligned words increased from 61.5% to 88.1%
• Multiple Alignments:
  – English-Spanish: number of links went from 1.35 to 1.16
  – English-Arabic: number of links increased from 1.48 to 1.72

16 Divergence Unraveling Conclusions
• Divergence handling shows promise for improvement of automatic alignment
• Conservative lower bound on divergence frequency
• Effective solution: syntactic transformation of English
• Validity of solution shown through alignment experiments

17 How do we evaluate MT?
• Human-based Metrics
  – Semantic Invariance
  – Pragmatic Invariance
  – Lexical Invariance
  – Structural Invariance
  – Spatial Invariance
  – Fluency
  – Accuracy
  – “Do you get it?”
• Automatic Metrics: Bleu

18 BiLingual Evaluation Understudy (BLEU; Papineni et al., 2001)
• Automatic technique, but ...
• Requires the pre-existence of human (reference) translations
• Approach:
  – Produce a corpus of high-quality human translations
  – Judge “closeness” numerically (in the spirit of word-error rate)
  – Compare n-gram matches between the candidate translation and 1 or more reference translations
http://www.research.ibm.com/people/k/kishore/RC22176.pdf

19 Bleu Comparison
Chinese-English Translation Example:
Candidate 1: It is a guide to action which ensures that the military always obeys the commands of the party.
Candidate 2: It is to insure the troops forever hearing the activity guidebook that party direct.
Reference 1: It is a guide to action that ensures that the military will forever heed Party commands.
Reference 2: It is the guiding principle which guarantees the military forces always being under the command of the Party.
Reference 3: It is the practical guide for the army always to heed the directions of the party.

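20 How Do We Compute Bleu Scores?
• Key Idea: A reference word should be considered exhausted after a matching candidate word is identified.
• For each distinct word in the candidate, compute:
  (1) the candidate word count (how often the word occurs in the candidate)
  (2) the maximum reference count (the largest number of times it occurs in any single reference)
• Add the counts over all candidate words, using the lower of the two numbers for each word.
• Divide by the total number of candidate words.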
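
The counting rule on this slide can be sketched in a few lines of Python. This is only an illustration, not the reference BLEU implementation: the function names and the tokenizer (lowercasing and keeping word characters only) are assumptions, since the slides do not specify how text is tokenized. The sentences are the candidates and references from slide 19.

```python
import re
from collections import Counter

def tokenize(text):
    # Assumption: lowercase and keep runs of word characters only.
    return re.findall(r"\w+", text.lower())

def modified_unigram_precision(candidate, references):
    """Return (clipped matches, number of candidate words)."""
    cand_counts = Counter(tokenize(candidate))
    ref_counts = [Counter(tokenize(ref)) for ref in references]
    clipped = 0
    for word, count in cand_counts.items():
        # (2) maximum count of this word in any single reference
        max_ref = max(rc[word] for rc in ref_counts)
        # use the lower of candidate count and maximum reference count
        clipped += min(count, max_ref)
    return clipped, sum(cand_counts.values())

REFERENCES = [
    "It is a guide to action that ensures that the military will forever heed Party commands.",
    "It is the guiding principle which guarantees the military forces always being under the command of the Party.",
    "It is the practical guide for the army always to heed the directions of the party.",
]
CANDIDATE_1 = ("It is a guide to action which ensures that the military "
               "always obeys the commands of the party.")
CANDIDATE_2 = ("It is to insure the troops forever hearing the activity "
               "guidebook that party direct.")

print(modified_unigram_precision(CANDIDATE_1, REFERENCES))  # (17, 18)
print(modified_unigram_precision(CANDIDATE_2, REFERENCES))  # (8, 14)
```

Returning the numerator and denominator separately, rather than a float, keeps the result directly comparable with the fractions worked out on the following slides.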

21 Modified Unigram Precision: Candidate #1
Reference 1: It is a guide to action that ensures that the military will forever heed Party commands.
Reference 2: It is the guiding principle which guarantees the military forces always being under the command of the Party.
Reference 3: It is the practical guide for the army always to heed the directions of the party.
Candidate 1 (maximum reference count in parentheses): It(1) is(1) a(1) guide(1) to(1) action(1) which(1) ensures(1) that(2) the(4) military(1) always(1) obeys(0) the commands(1) of(1) the party(1)
What’s the answer? 17/18

22 Modified Unigram Precision: Candidate #2
Reference 1: It is a guide to action that ensures that the military will forever heed Party commands.
Reference 2: It is the guiding principle which guarantees the military forces always being under the command of the Party.
Reference 3: It is the practical guide for the army always to heed the directions of the party.
Candidate 2 (maximum reference count in parentheses): It(1) is(1) to(1) insure(0) the(4) troops(0) forever(1) hearing(0) the activity(0) guidebook(0) that(2) party(1) direct(0)
What’s the answer? 8/14

23 Modified Bigram Precision: Candidate #1
Reference 1: It is a guide to action that ensures that the military will forever heed Party commands.
Reference 2: It is the guiding principle which guarantees the military forces always being under the command of the Party.
Reference 3: It is the practical guide for the army always to heed the directions of the party.
Candidate 1 bigrams: It is(1) is a(1) a guide(1) guide to(1) to action(1) action which(0) which ensures(0) ensures that(1) that the(1) the military(1) military always(0) always obeys(0) obeys the(0) the commands(0) commands of(0) of the(1) the party(1)
What’s the answer? 10/17

24 Modified Bigram Precision: Candidate #2
Reference 1: It is a guide to action that ensures that the military will forever heed Party commands.
Reference 2: It is the guiding principle which guarantees the military forces always being under the command of the Party.
Reference 3: It is the practical guide for the army always to heed the directions of the party.
Candidate 2 bigrams: It is(1) is to(0) to insure(0) insure the(0) the troops(0) troops forever(0) forever hearing(0) hearing the(0) the activity(0) activity guidebook(0) guidebook that(0) that party(0) party direct(0)
What’s the answer? 1/13
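
The bigram counts on slides 23 and 24 follow the same clipping rule as the unigram case, applied to adjacent word pairs. A minimal sketch of that generalization to arbitrary n is shown below; the helper names (`ngrams`, `modified_ngram_precision`) and the lowercasing tokenizer are assumptions made for illustration, not something the slides prescribe.

```python
import re
from collections import Counter

def tokenize(text):
    # Assumption: lowercase and keep runs of word characters only.
    return re.findall(r"\w+", text.lower())

def ngrams(tokens, n):
    # All contiguous n-token windows, as tuples.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_ngram_precision(candidate, references, n):
    """Return (clipped n-gram matches, number of candidate n-grams)."""
    cand_counts = Counter(ngrams(tokenize(candidate), n))
    ref_counts = [Counter(ngrams(tokenize(ref), n)) for ref in references]
    clipped = sum(
        min(count, max(rc[gram] for rc in ref_counts))
        for gram, count in cand_counts.items()
    )
    return clipped, sum(cand_counts.values())

REFERENCES = [
    "It is a guide to action that ensures that the military will forever heed Party commands.",
    "It is the guiding principle which guarantees the military forces always being under the command of the Party.",
    "It is the practical guide for the army always to heed the directions of the party.",
]
CANDIDATE_1 = ("It is a guide to action which ensures that the military "
               "always obeys the commands of the party.")
CANDIDATE_2 = ("It is to insure the troops forever hearing the activity "
               "guidebook that party direct.")

print(modified_ngram_precision(CANDIDATE_1, REFERENCES, 2))  # (10, 17)
print(modified_ngram_precision(CANDIDATE_2, REFERENCES, 2))  # (1, 13)
```

With n = 1 the same function gives the unigram fractions from slides 21 and 22, which is a quick consistency check on the sketch.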

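25 Catching Cheaters
Reference 1: The cat is on the mat
Reference 2: There is a cat on the mat
Candidate: the the the the the the the
Counts: the(2) for the first two occurrences, the(0) for the remaining five
What’s the unigram answer? 2/7
What’s the bigram answer? 0/6 (a seven-word candidate has six bigrams)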
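
To see why clipping matters, the sketch below compares plain precision (every candidate word that occurs anywhere in a reference counts) with the modified, clipped precision on this degenerate candidate. The function names and the lowercasing tokenizer are assumptions for illustration; lowercasing is what makes "The" in Reference 1 count as a second "the", matching the the(2) on the slide.

```python
import re
from collections import Counter

def tokenize(text):
    # Assumption: lowercase and keep runs of word characters only.
    return re.findall(r"\w+", text.lower())

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def plain_precision(candidate, references, n=1):
    """Unclipped precision: every candidate n-gram seen in any reference counts."""
    cand = ngrams(tokenize(candidate), n)
    seen = set()
    for ref in references:
        seen.update(ngrams(tokenize(ref), n))
    return sum(1 for gram in cand if gram in seen), len(cand)

def modified_precision(candidate, references, n=1):
    """Clipped precision: each n-gram counts at most its maximum reference count."""
    cand_counts = Counter(ngrams(tokenize(candidate), n))
    ref_counts = [Counter(ngrams(tokenize(ref), n)) for ref in references]
    clipped = sum(
        min(count, max(rc[gram] for rc in ref_counts))
        for gram, count in cand_counts.items()
    )
    return clipped, sum(cand_counts.values())

REFERENCES = ["The cat is on the mat", "There is a cat on the mat"]
CHEATER = "the the the the the the the"

print(plain_precision(CHEATER, REFERENCES))         # (7, 7): looks perfect
print(modified_precision(CHEATER, REFERENCES))      # (2, 7): clipped
print(modified_precision(CHEATER, REFERENCES, n=2)) # (0, 6): no "the the" in any reference
```

The seven-word candidate contains six bigrams, so the bigram result comes out as 0 of 6; the value is zero either way.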

