Hitting The Right Paraphrases In Good Time
Stanley Kok, Dept. of Comp. Sci. & Eng., Univ. of Washington, Seattle, USA
Chris Brockett, NLP Group, Microsoft Research, Redmond, USA

Overview
- Motivation
- Background
- Hitting Time Paraphraser
- Experiments
- Future Work

What's a paraphrase of "is on good terms with"?
Paraphrase System → "is friendly with", "is a friend of", …
Applications: query expansion, document summarization, natural language generation, question answering, etc.

Where do the paraphrases come from? The paraphrase system learns them from bilingual parallel corpora.

Bilingual Parallel Corpus
…the cost dynamic is under control… / …die kostenentwicklung unter kontrolle…
…keep the cost in check… / …die kosten unter kontrolle…

Phrase Table
English Phrase (E)   German Phrase (G)   P(G|E)   P(E|G)
under control        unter kontrolle     …        …
in check             unter kontrolle     …        …

State of the Art
BCB system [Bannard & Callison-Burch, ACL'05]:
  P(E2|E1) ≈ Σ_C Σ_G P(E2|G) · P(G|E1)
SBP system [Callison-Burch, EMNLP'08]:
  P(E2|E1) ≈ Σ_C Σ_G P(E2|G, syn(E1)) · P(G|E1, syn(E1))
(C ranges over parallel corpora, G over foreign pivot phrases.)
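The pivot computation above can be sketched in a few lines. The phrase-table entries below are toy values for illustration (a single corpus, so the sum over C collapses), not figures from the paper:

```python
# Sketch of the BCB-style pivot computation with a toy phrase table.
# p_f_given_e[(G, E)] = P(G|E); p_e_given_f[(E, G)] = P(E|G). Values are made up.
p_f_given_e = {
    ("unter kontrolle", "under control"): 0.8,
    ("unter kontrolle", "in check"): 0.6,
}
p_e_given_f = {
    ("under control", "unter kontrolle"): 0.5,
    ("in check", "unter kontrolle"): 0.3,
}

def pivot_prob(e2, e1):
    """P(e2|e1) ~= sum over foreign pivot phrases G of P(e2|G) * P(G|e1)."""
    total = 0.0
    for (g, e), p_ge in p_f_given_e.items():
        if e == e1:
            total += p_e_given_f.get((e2, g), 0.0) * p_ge
    return total

print(pivot_prob("in check", "under control"))  # 0.3 * 0.8 = 0.24
```

Each German phrase aligned with the query acts as a pivot; the paraphrase probability is the product of translating out and back, summed over pivots.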

Graphical View
(Figure: a bipartite graph linking English phrase nodes E1-E4, including "under control" and "in check", to German phrase nodes G1-G3, including "unter kontrolle", and French phrase nodes F1, F2; edges are weighted by translation probabilities such as P(G1|E1), P(E2|G1), P(F2|E1), P(E2|F2).)

Going beyond the state of the art:
- Path lengths > 2
- A general graph
- Added nodes that represent domain knowledge
Tools: random walks and hitting times.

Random Walk
- Begin at node A
- Randomly pick a neighbor n
- Move to node n
- Repeat
(Figure: a walk stepping through an example graph with nodes A-F.)

Hitting Time from node i to j
- Expected number of steps, starting from node i, before node j is visited for the first time
- Smaller hitting time → closer to start node i

Truncated Hitting Time [Sarkar & Moore, UAI'07]
- Random walks are limited to T steps
- Can be computed efficiently and with high probability by sampling random walks [Sarkar, Moore & Prakash, ICML'08]

Finding Truncated Hitting Time by Sampling
(Figure: a sampled walk of T = 5 steps from node A over the example graph: A → D → E → D → F → E.)
Resulting estimates: h_AA = 0, h_AD = 1, h_AE = 2, h_AF = 4, h_AB = 5, h_AC = 5 (unvisited nodes B and C receive the truncation value T = 5).
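A minimal sketch of this sampling estimator, using a toy graph (node names and edges are hypothetical, not the deck's exact example):

```python
import random

# Estimate truncated hitting times by sampling m walks of at most T steps
# (after Sarkar, Moore & Prakash, ICML'08). Toy undirected graph for illustration.
graph = {
    "A": ["B", "D"],
    "B": ["A", "C"],
    "C": ["B"],
    "D": ["A", "E"],
    "E": ["D", "F"],
    "F": ["E"],
}

def truncated_hitting_times(graph, start, T, m, seed=0):
    """Average, over m walks from `start`, the step of each node's first visit;
    nodes never reached within T steps contribute the truncation value T."""
    rng = random.Random(seed)
    totals = {node: 0 for node in graph}
    for _ in range(m):
        first_visit = {start: 0}
        node = start
        for step in range(1, T + 1):
            node = rng.choice(graph[node])
            first_visit.setdefault(node, step)  # record first visit only
        for v in graph:
            totals[v] += first_visit.get(v, T)  # unreached nodes count as T
    return {v: totals[v] / m for v in graph}

h = truncated_hitting_times(graph, "A", T=10, m=5000)
# h["A"] is 0; nodes farther from A get larger estimates, capped at T.
```

Nodes whose estimate equals T were essentially never reached, which is what licenses the pruning step described later.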

Hitting Time Paraphraser (HTP)
Input: a phrase (e.g., "is on good terms with") and phrase tables (English-German, English-French, German-French, etc.)
Output: paraphrases ("is friendly with", "is a friend of", …)

Graph Construction
- BFS from the query phrase, up to depth d or up to a maximum number n of nodes
- d = 6, n = 50
(Figures: the graph growing outward from the query phrase, with edges weighted by translation probabilities, e.g., 0.6, 0.5.)
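The BFS expansion might be sketched as follows, assuming a toy `phrase_tables` mapping from each phrase to its translations (hypothetical data; the real system reads alignments from phrase tables):

```python
from collections import deque

# Sketch of HTP-style graph construction: breadth-first expansion from the
# query phrase through phrase tables, stopping at depth d or at n nodes.
phrase_tables = {
    "under control": ["unter kontrolle"],
    "unter kontrolle": ["under control", "in check"],
    "in check": ["unter kontrolle"],
}

def build_graph(query, d=6, n=50):
    nodes = {query: 0}          # phrase -> BFS depth
    edges = []
    queue = deque([query])
    while queue:
        phrase = queue.popleft()
        depth = nodes[phrase]
        if depth >= d:
            continue            # depth limit reached; do not expand further
        for trans in phrase_tables.get(phrase, []):
            edges.append((phrase, trans))
            if trans not in nodes and len(nodes) < n:
                nodes[trans] = depth + 1
                queue.append(trans)
    return nodes, edges

nodes, edges = build_graph("under control")
```

With the toy tables, "in check" is reached at depth 2 via the German pivot, mirroring the two-step pivot paths in the earlier slides.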

Estimate Truncated Hitting Times
- Run m truncated random walks to estimate the truncated hitting time of each node
- T = 10, m = 1,000,000
- Prune nodes with hitting times equal to T

Add Ngram Nodes
Phrase nodes "achieve the goal", "achieve the aim", and "reach the objective" are linked to n-gram feature nodes such as "the", "achieve the", "the aim", "reach", and "objective".

Add "Syntax" Nodes
Phrase nodes "whose goal is", "the aim is", "the objective is", and "what goal" are linked to shallow-syntax feature nodes: start-with-article, end-with-"be", start-with-interrogative.

Add Not-Substring-Of Nodes
Phrase nodes "reach the", "reach the aim", "reach the objective", and "objective" are connected via a not-substring-of node.

Feature Nodes
(Figure: phrase nodes p1-p4 connected to ngram, "syntax", and not-substring-of feature nodes, with edge weights 0.4, 0.1, 0.4, 0.1.)

Re-estimate Truncated Hitting Times
- Run m truncated random walks again
- Rank paraphrases in increasing order of hitting time
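The ranking step itself is just a sort by estimated hitting time; the values below are made-up illustrations, not estimates from the system:

```python
# Rank candidate paraphrases by increasing truncated hitting time
# (smaller = closer to the query phrase). Hypothetical estimates.
hitting_times = {
    "is friendly with": 2.1,
    "is a friend of": 3.4,
    "the friend": 7.9,
}

ranked = sorted(hitting_times, key=hitting_times.get)
print(ranked)  # best paraphrase first
```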

Data
- Europarl dataset [Koehn, MT-Summit'05]
- Used 6 of its 11 languages: English, Danish, German, Spanish, Finnish, Dutch
- About a million sentences per language
- English-Foreign phrasal alignments by giza++ [Callison-Burch, EMNLP'08]
- Foreign-Foreign phrasal alignments by the MSR aligner

Comparison Systems
- SBP system [Callison-Burch, EMNLP'08]
- HTP with no feature nodes
- HTP with a bipartite graph

Evaluation Methodology
- NIST dataset: 4 English translations per Chinese sentence; 33,216 English translations
- Randomly selected 100 English phrases from 1-4-grams appearing in both the NIST and Europarl datasets
- Excluded stop words, numbers, and phrases containing periods or commas

Methodology
- For each phrase, randomly select a sentence from the NIST dataset containing it
- Substitute the top 1 to 10 paraphrases for the phrase

Methodology
Manually evaluated the resulting sentences:
- 0: Clearly wrong; grammatically incorrect or does not preserve meaning
- 1: Minor grammatical errors (e.g., subject-verb disagreement, wrong tense), or meaning largely but not completely preserved
- 2: Totally correct; grammatically correct and meaning preserved
Correct: 1 and 2; Wrong: 0
Two evaluators; Kappa = 0.62 (substantial agreement)
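For reference, the inter-annotator kappa reported above is Cohen's kappa, which can be computed as follows (toy annotator labels on the 0/1/2 scale, not the paper's actual judgments):

```python
from collections import Counter

# Minimal Cohen's kappa sketch: agreement between two annotators,
# corrected for chance agreement. Labels are illustrative only.
def cohens_kappa(a, b):
    n = len(a)
    p_o = sum(x == y for x, y in zip(a, b)) / n          # observed agreement
    ca, cb = Counter(a), Counter(b)
    labels = set(ca) | set(cb)
    p_e = sum(ca[l] * cb[l] for l in labels) / (n * n)   # chance agreement
    return (p_o - p_e) / (1 - p_e)

ann1 = [2, 2, 1, 0, 2, 1, 0, 2]
ann2 = [2, 2, 1, 0, 1, 1, 0, 2]
print(round(cohens_kappa(ann1, ann2), 3))
```

Values between 0.61 and 0.80 are conventionally read as "substantial agreement", which matches the slide's gloss of 0.62.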

HTP vs. SBP
(Figures: side-by-side lists of the paraphrases HTP and SBP produced for each of the 100 query phrases q1 … q100, with varying numbers of paraphrases per system per phrase. Figures recoverable from the slides include a score of 0.54 and counts of 492 paraphrases and 145 correct paraphrases.)

Timings
System   Time (secs/phrase)
HTP      48
SBP      468

Future Work
- Apply HTP to languages other than English
- Evaluate HTP's impact on applications, e.g., improving the performance of resource-sparse machine translation systems
- Add more features, etc.

Conclusion
- HTP: a paraphrase system based on random walks
- Good paraphrases have smaller hitting times
- General graph; path lengths > 2; incorporates domain knowledge
- HTP outperforms the state of the art