Learning with Latent Alignment Structures: Quasi-synchronous Grammar and Tree-edit CRFs for Question Answering and Textual Entailment (Mengqiu Wang)
1 Quasi-Synchronous Grammars
 Based on a key observation in MT: translated sentences often share some isomorphic syntactic structure, but rarely in its entirety; the strictness of the isomorphism may vary across words or syntactic rules.
 Key idea: unlike stricter, more rigid synchronous grammars (e.g. SCFG), a QG defines a monolingual grammar for the target tree, "inspired" by the source tree.

2 Quasi-Synchronous Grammars
 In other words, we model the generation of the target tree as influenced by the source tree (and their alignment).
 QA can be thought of as extremely free monolingual translation.
 The linkage between question and answer trees in QA is looser than in MT, which gives QG a bigger edge.

3 Model  Works on labeled dependency parse trees  Learn the hidden structure (alignment between Q and A trees) by summing out ALL possible alignments  One particular alignment tells us both the syntactic configurations and the word-to-word semantic correspondences  An example… question answer parse tree question parse tree an alignment

[Figure: labeled dependency parse trees for Q: "Who is the leader of France?" and A: "Bush met with French president Jacques Chirac.", annotated with POS tags, named-entity/question-word tags (person, location, qword), and dependency labels (subj, obj, det, of, with, nmod), each rooted at $.]


Our model makes local Markov assumptions to allow efficient computation via dynamic programming (details in paper): given its parent, a word is independent of all other words (including siblings).
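This Markov assumption is what makes summing over every possible alignment tractable. A minimal sketch of the resulting dynamic program, with hypothetical `p_align` (child's alignment given its parent's alignment) and `p_emit` (answer word given its aligned question word) callbacks standing in for the model's actual distributions:

```python
def alignment_inside(children, words, q_len, p_align, p_emit, node=0):
    """Inside (sum-product) pass over an answer dependency tree.

    Because each answer word depends only on its parent's alignment,
    we can sum over all q_len**n alignments in O(n * q_len**2) time
    instead of enumerating them.  Returns a list whose entry a is the
    total probability of the subtree rooted at `node`, given that
    `node` aligns to question word a.
    """
    child_scores = [alignment_inside(children, words, q_len,
                                     p_align, p_emit, c)
                    for c in children.get(node, [])]
    scores = []
    for a in range(q_len):
        s = p_emit(words[node], a)
        for cb in child_scores:
            # sum out this child's alignment, conditioned on ours
            s *= sum(p_align(ac, a) * cb[ac] for ac in range(q_len))
        scores.append(s)
    return scores
```

Summing the root's scores against a prior over the root's alignment then gives the total probability of the answer tree under all alignments.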


11 6 types of syntactic configurations
 Parent-child


Parent-child configuration

14 6 types of syntactic configurations
 Parent-child
 Same-word


Same-word configuration

17 6 types of syntactic configurations
 Parent-child
 Same-word
 Grandparent-child


Grandparent-child configuration

20 6 types of syntactic configurations
 Parent-child
 Same-word
 Grandparent-child
 Child-parent
 Siblings
 C-command (same as [D. Smith & Eisner '06])
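As a sketch, the first five configurations can be read directly off the question tree's parent pointers. The `q_parent` encoding and label strings here are illustrative, and a full implementation would also test the c-command relation explicitly rather than lumping the remaining cases together:

```python
def configuration(q_parent, qa_child, qa_parent):
    """Classify the syntactic configuration between the question words
    that an answer word (qa_child) and its dependency parent (qa_parent)
    are aligned to.  q_parent maps each question-word index to its
    parent index (None for the root)."""
    if qa_child == qa_parent:
        return "same-word"
    if q_parent.get(qa_child) == qa_parent:
        return "parent-child"
    if q_parent.get(qa_parent) == qa_child:
        return "child-parent"
    grandparent = q_parent.get(q_parent.get(qa_child))
    if grandparent is not None and grandparent == qa_parent:
        return "grandparent-child"
    if q_parent.get(qa_child) is not None and \
       q_parent.get(qa_child) == q_parent.get(qa_parent):
        return "siblings"
    return "c-command-or-other"  # remaining cases, incl. c-command
```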

22 Modeling alignment
 Base model


25 Modeling alignment cont.
 Base model
 Log-linear model: lexical-semantic features from WordNet (identity, hypernym, synonym, entailment, etc.)
 Mixture model
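A minimal sketch of the log-linear component and the mixture combination; the feature names, weights, and interpolation coefficient `lam` are purely illustrative:

```python
import math

def loglinear_probs(candidates, feats, weights):
    """Log-linear alignment model: p(a) is proportional to exp(w . f(a)),
    normalized over the candidate alignments for a word.  feats(a)
    returns a feature dict, e.g. {"identity": 1.0, "synonym": 0.0}."""
    scores = {a: math.exp(sum(weights.get(name, 0.0) * value
                              for name, value in feats(a).items()))
              for a in candidates}
    z = sum(scores.values())
    return {a: s / z for a, s in scores.items()}

def mixture_prob(p_base, p_loglinear, lam):
    """Mixture of the base and log-linear models with coefficient lam."""
    return lam * p_base + (1.0 - lam) * p_loglinear
```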

26 Parameter estimation
 Things to be learned:
   Multinomial distributions in the base model
   Log-linear model feature weights
   Mixture coefficient
 Training involves summing out hidden structures, and is thus non-convex.
 Solved using conditional Expectation-Maximization.
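The training loop has the usual EM shape. This skeleton is generic, with `e_step` and `m_step` as hypothetical stand-ins for the model's posterior computation over alignments and its parameter updates:

```python
def conditional_em(pairs, init_params, e_step, m_step, iterations=20):
    """Alternate between computing expected counts over the hidden
    alignments (E-step) and re-estimating parameters from those
    counts (M-step).  The objective is non-convex, so the result
    depends on init_params."""
    params = init_params
    for _ in range(iterations):
        expected_counts = [e_step(q, a, params) for q, a in pairs]
        params = m_step(expected_counts)
    return params
```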

27 Experiments
 TREC 8-12 data set for training
 TREC 13 questions for development and testing

28 Candidate answer generation
 For each question, we take all documents from the TREC document pool and extract sentences that contain at least one non-stopword keyword from the question.
 For computational reasons (parsing speed, etc.), we only keep answer sentences of at most 40 words.
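The filtering step described above can be sketched as follows; the stopword list and whitespace tokenization are deliberately simplistic placeholders:

```python
STOPWORDS = {"the", "a", "an", "is", "are", "was", "of", "in",
             "to", "who", "what", "when", "where", "how"}

def normalize(token):
    """Lowercase and strip surrounding punctuation."""
    return token.lower().strip(".,;:?!\"'")

def candidate_sentences(question, sentences, max_len=40):
    """Keep sentences that share at least one non-stopword keyword
    with the question and contain at most max_len words."""
    keywords = {normalize(t) for t in question.split()} - STOPWORDS
    kept = []
    for sentence in sentences:
        tokens = sentence.split()
        if len(tokens) <= max_len and \
           keywords & {normalize(t) for t in tokens}:
            kept.append(sentence)
    return kept
```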

29 Dataset statistics
 Manually labeled 100 questions for training: 348 positive Q/A pairs in total
 84 questions for development: 1,415 Q/A pairs in total (3.1+)
 100 questions for testing: 1,703 Q/A pairs in total (3.6+)
 Automatically labeled another 2,193 questions to create a noisy training set, for evaluating model robustness

30 Experiments cont.
 Each question and answer sentence is tokenized, POS-tagged (MXPOST), parsed (MSTParser), and labeled with named-entity tags (IdentiFinder)

31 Baseline systems (replications)
 [Cui et al. SIGIR '05]: the algorithm behind one of the best-performing systems in TREC evaluations. It uses a mutual-information-inspired score computed over dependency trees and a single fixed alignment between them.
 [Punyakanok et al. NLE '04]: measures the similarity between Q and A by computing tree edit distance.
 Both baselines are high-performing, syntax-based, and among the most straightforward to replicate.
 We further enhanced the algorithms by augmenting them with WordNet.

32 Results
[Results table: Mean Average Precision and Mean Reciprocal Rank of Top 1; marked entries are statistically significantly better than the 2nd-best score in each column. Figures shown: 28.2%, 23.9%, 41.2%, 30.3%.]

33 Summing vs. Max
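The distinction named on this slide, in miniature: scoring an answer by summing over all alignments versus keeping only the single best (Viterbi) alignment. The alignment log-scores below are hypothetical:

```python
import math

def sum_logscore(alignment_logscores):
    """Marginal score: log of the summed probability over all alignments,
    computed with the log-sum-exp shift for numerical stability."""
    m = max(alignment_logscores)
    return m + math.log(sum(math.exp(s - m) for s in alignment_logscores))

def max_logscore(alignment_logscores):
    """Viterbi score: the single best alignment's log-probability."""
    return max(alignment_logscores)
```

By construction the summed score is never below the max score; the two coincide only when one alignment carries all the probability mass.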

34 Switching back
 Tree-edit CRFs