Word Sense Disambiguation for Machine Translation Han-Bin Chen 2010.11.24.


1 Word Sense Disambiguation for Machine Translation Han-Bin Chen

2 Reference Papers
– Cabezas and Resnik. Using WSD Techniques for Lexical Selection. (Technical report, 2005)
– Carpuat and Wu. Word Sense Disambiguation vs. Statistical Machine Translation. (ACL 2005)
– Carpuat and Wu. Improving Statistical Machine Translation Using Word Sense Disambiguation. (EMNLP 2007)
– Chan et al. Word Sense Disambiguation Improves Statistical Machine Translation. (ACL 2007)
– Apidianaki. Data-driven Semantic Analysis for Multilingual WSD. (EACL 2009)

3 SMT Workflow
– Input: source language
– Translation model (trained on a bilingual corpus)
– Language model (trained on a monolingual corpus)
– Reordering model
– Decoder
– Output: target language
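The workflow above can be sketched as a toy log-linear scorer. All names, weights, and probabilities here are illustrative stand-ins, not taken from any of the cited systems:

```python
import math

def score_hypothesis(phrase_pairs, lm, weights):
    """Log-linear score of one translation hypothesis:
    weighted sum of translation-model and language-model log-probs."""
    tm_logprob = sum(math.log(p) for _, _, p in phrase_pairs)
    target = " ".join(tgt for _, tgt, _ in phrase_pairs)
    # Toy unigram "language model"; real decoders use n-gram LMs.
    lm_logprob = sum(math.log(lm.get(w, 1e-6)) for w in target.split())
    return weights["tm"] * tm_logprob + weights["lm"] * lm_logprob

# Toy phrase-table entries: (source, target, P(target | source))
hyp = [("這架飛機", "the plane", 0.6), ("將 停留", "will stop over", 0.4)]
lm = {"the": 0.1, "plane": 0.05, "will": 0.1, "stop": 0.04, "over": 0.06}
weights = {"tm": 1.0, "lm": 0.5}
print(score_hypothesis(hyp, lm, weights))
```

The decoder searches over hypotheses and keeps the one with the highest such score; the reordering model would simply add another weighted term.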

4 MT Research Areas
(Same workflow diagram, with research areas highlighted)
– Language model
– Translation model
– Reordering model
– Word alignment
– Evaluation metrics

5 Translation Model (TM)
Research in TM
– Phrase extraction
– Phrase filtering
– Phrase augmentation
– Word Sense Disambiguation (WSD)

6 Traditional WSD
Target word is a single content word
– Nouns, verbs, adjectives
Classification task with predefined senses
– WordNet, HowNet
Modern WSD systems
– Not limited to local context
– Linguistic information: position-sensitive, syntactic, collocational
An intuitive application of WSD is SMT
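As a sketch of the classification view of WSD above, here is a minimal position-sensitive collocation classifier. The data and the simple feature-voting scheme are hypothetical; real systems use far richer features and proper statistical models:

```python
from collections import Counter, defaultdict

def extract_features(tokens, i, window=2):
    """Position-sensitive local-context features for the target tokens[i]."""
    feats = []
    for off in range(-window, window + 1):
        if off and 0 <= i + off < len(tokens):
            feats.append(f"w{off:+d}={tokens[i + off]}")
    return feats

def train(examples):
    """examples: (tokens, target_index, sense) triples.
    Count how often each feature co-occurs with each sense."""
    counts = defaultdict(Counter)
    for tokens, i, sense in examples:
        for f in extract_features(tokens, i):
            counts[f][sense] += 1
    return counts

def disambiguate(counts, tokens, i):
    """Pick the sense with the most feature votes in this context."""
    votes = Counter()
    for f in extract_features(tokens, i):
        votes.update(counts[f])
    return votes.most_common(1)[0][0] if votes else None

train_data = [
    ("stop briefly at the airport".split(), 1, "短暫地"),
    ("speak briefly about the plan".split(), 1, "簡要地"),
]
model = train(train_data)
print(disambiguate(model, "wait briefly at the gate".split(), 1))  # 短暫地
```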

7 WSD in MT
Wrong translations from Google Translate:
– "what is today's special?" → 什麼是今天的特色? (特色 = "characteristic", wrong sense of "special")
– "I would like to reserve a table for three" → 我想保留一表三 (word-for-word garble: 保留 = "retain", 表 = "chart/form", not a dining table)
– "the plane will briefly stop over in the airport" → 這架飛機將簡要地停留在機場 (簡要地 = "concisely", wrong sense of "briefly")

8 WSD in MT: Early Stage
Whether a WSD model can help SMT
– An energetically debated question over the past years
Implicit WSD in SMT
– Local context: phrase table & language model
Dedicated WSD systems
– Wider variety of context features
– Position, sentence-level, document-level features
– WSD should play a role in MT
Publicly available SMT system
– Pharaoh by Philipp Koehn (2003~2004)

9 Small Scale Experiment (1)
Marine Carpuat and Dekai Wu, 2005
– Chinese-to-English translation task
– Chinese lexical sample task with 20 target words
– Trained with a state-of-the-art WSD model
– 37 training instances per target word (manual annotation)

10 Small Scale Experiment (2)
Hard decision
– Force the decoder to choose translations from the WSD glosses
– Decided by the language model
Surprising and frustrating result: WSD did not help
– Possible causes: small data, out-of-domain material, hard decision
– Language model effect

11 Translation Disambiguation (1)
Clara Cabezas and Philip Resnik, 2005
– Address 3 problems of the previous work
Use the aligned target word directly as the "sense"
– 4 senses for "briefly": { 短暫地, 短時間地, 簡潔地, 簡要地 }
– Trained with a state-of-the-art WSD system
– Handles the "small data" and "out-of-domain" problems
Soft decision
– Pharaoh XML markup
– The decoder weighs the specified translations and the translation model together
– Handles the "hard decision" problem
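A minimal sketch of the translation-as-sense idea above, assuming a toy word-aligned corpus (the alignment representation here, a source-index to target-index map, is a simplification of real alignment formats):

```python
from collections import defaultdict

def collect_senses(aligned_pairs):
    """aligned_pairs: (source_tokens, target_tokens, alignment) triples,
    where alignment maps a source index to its aligned target index.
    The aligned target word serves directly as the 'sense' label."""
    senses = defaultdict(set)
    for src, tgt, align in aligned_pairs:
        for s_i, t_i in align.items():
            senses[src[s_i]].add(tgt[t_i])
    return senses

# Toy word-aligned corpus: "briefly" aligns to a different Chinese
# adverb in each sentence pair.
corpus = [
    ("stop briefly".split(), "短暫地 停留".split(), {1: 0}),
    ("speak briefly".split(), "簡要地 發言".split(), {1: 0}),
]
print(sorted(collect_senses(corpus)["briefly"]))
```

Every distinct aligned target word becomes one "sense", so sense-annotated training data falls out of the bilingual corpus for free, which is what sidesteps the small-data and out-of-domain problems.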

12 Translation Disambiguation (2)
Pharaoh XML markup
Experiment & result
– Spanish-to-English test set from Europarl
– WSD: , Baseline:
– Not statistically significant, but at least not a decrease
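For illustration, a hypothetical Pharaoh-style XML-input fragment. The tag and attribute names below are assumptions for the sketch, not the tool's documented syntax; the point is only that the WSD module can suggest a translation and a probability for a source span while the decoder still consults its own models:

```xml
<!-- Hypothetical markup: the WSD module proposes a translation (with a
     probability) for one source word; the decoder may still override it. -->
quisiera hablar <seg translation="briefly" prob="0.6">brevemente</seg> del tema
```

This is the "soft decision": the suggested translation competes with the phrase table instead of replacing it.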

13 Toward Better Integration into SMT
How to better integrate WSD into SMT?
– Phrase-based sense disambiguation (PSD)
Key points
– Phrases, not words
– Integration into the log-linear model: weight tuning

14 Successful Integration (1)
Chan et al., 2007
– Chinese-to-English translation
Sense disambiguation on Chinese phrases
– 1 or 2 consecutive Chinese words
– Extract training examples from the word-aligned corpus
Add WSD features to the log-linear model
– Contextual translation probability from the WSD model
– Reward for using WSD-proposed translations
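One way to picture the added WSD feature is as one more weighted term in the log-linear score. The feature values and weights below are made up for illustration and do not correspond to Chan et al.'s actual feature functions:

```python
import math

def hypothesis_score(features, weights):
    """Log-linear model: score = sum_k w_k * log f_k(hypothesis)."""
    return sum(weights[k] * math.log(v) for k, v in features.items())

# The same hypothesis scored without and with an extra WSD feature.
base = {"tm": 0.4, "lm": 0.01}
with_wsd = dict(base, wsd=0.7)   # hypothetical P_wsd(translation | context)
weights = {"tm": 1.0, "lm": 0.5, "wsd": 0.8}
print(hypothesis_score(base, weights), hypothesis_score(with_wsd, weights))
```

Because the WSD term gets its own weight, minimum-error-rate tuning can decide how much the decoder should trust the WSD model relative to the phrase table and language model.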

15 Successful Integration (2)
Statistically significant improvement
Example: 將 無法 取得 更 多 援助 或 其他 讓步
– Hiero: "will be more aid and other concessions" (drops 無法, "unable to")
– Hiero+WSD: "will be unable to obtain more aid and other concessions"

16 PSD System (1)
Marine Carpuat and Dekai Wu, 2007
A WSD model for every phrase
– Extract training data during phrase extraction
– WSD probability as a new feature
Comments
– Not every phrase needs WSD
– Technical problem (Pharaoh)

17 PSD System (2)
Result: better translations on all test sets
– IWSLT 2006 dataset
– NIST 2004 test set

18 PSD System (3)

19 Recent Issue
Different translations may share the same sense
– 2 senses for "briefly", rather than 4
– Sense 1: { 短暫地, 短時間地 }
– Sense 2: { 簡潔地, 簡要地 }
Automatic sense clustering

20 Sense Clustering (1)
Marianna Apidianaki, 2009
Two translations are semantically related
– if they occur in similar contexts
Translation unit (TU) as context
– a bilingual sentence pair
Source word "briefly"
– Translations: { 短暫地, 短時間地, 簡潔地, 簡要地 } = {t1, t2, t3, t4}

21 Sense Clustering (2)
"briefly-t1" occurs in contexts {TU1, TU4, TU25, TU88, …}
"briefly-t2" occurs in contexts {TU5, TU18, TU92, TU126, …}
Clustering based on pairwise context similarity
– Apidianaki, 2008
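A toy sketch of the idea above: translations are merged when the TU sets they occur in overlap enough. Jaccard similarity and the greedy single-link merge are stand-ins for illustration, not Apidianaki's actual similarity measure or clustering algorithm, and the TU sets are invented:

```python
def jaccard(a, b):
    """Overlap of two context (TU) sets."""
    return len(a & b) / len(a | b)

def cluster_translations(contexts, threshold=0.25):
    """Greedy single-link clustering: merge two clusters whenever any
    pair of their translations has similar enough TU sets."""
    clusters = [{t} for t in contexts]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                if any(jaccard(contexts[a], contexts[b]) >= threshold
                       for a in clusters[i] for b in clusters[j]):
                    clusters[i] |= clusters.pop(j)
                    merged = True
                    break
            if merged:
                break
    return clusters

# Invented TU sets: the two "short duration" adverbs share contexts,
# as do the two "concisely" adverbs.
contexts = {
    "短暫地":  {"TU1", "TU4", "TU25"},
    "短時間地": {"TU4", "TU25", "TU88"},
    "簡潔地":  {"TU5", "TU18"},
    "簡要地":  {"TU18", "TU92"},
}
print(cluster_translations(contexts))
```

On this toy data the four translations collapse into the two sense clusters from the previous slide.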

22 Sense Clustering (3)
Experiment
– English-to-Greek translation
– 150 ambiguous English nouns
Evaluation of lexical selection
– Strict precision: exact match with the answer word
– Enriched precision: match with the cluster of the answer word
Result
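The two precision measures can be sketched as follows (toy data; the Chinese translations stand in for the Greek ones used in the actual experiment):

```python
def precision(predictions, gold, clusters):
    """Strict: prediction equals the reference translation.
    Enriched: prediction falls anywhere in the reference's sense cluster."""
    cluster_of = {t: c for c in clusters for t in c}
    strict = sum(p == g for p, g in zip(predictions, gold))
    enriched = sum(p in cluster_of.get(g, {g}) for p, g in zip(predictions, gold))
    n = len(gold)
    return strict / n, enriched / n

clusters = [frozenset({"短暫地", "短時間地"}), frozenset({"簡潔地", "簡要地"})]
pred = ["短時間地", "簡要地", "短暫地"]
gold = ["短暫地", "簡要地", "簡潔地"]
print(precision(pred, gold, clusters))  # (0.333..., 0.666...)
```

Enriched precision can only be at least as high as strict precision: every exact match also lands in the reference's cluster.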

23 Conclusion
From WSD to PSD
However, semantics is also important
Future work
– Semantic PSD

