Parsing with Compositional Vector Grammars. Socher, Bauer, Manning, Ng, 2013.

Presentation transcript:

Parsing with Compositional Vector Grammars. Socher, Bauer, Manning, Ng, 2013

Problem: How can we parse a sentence and at the same time build a dense vector representation of it? N-grams have obvious problems, the most important being sparsity. Can we resolve syntactic ambiguity with context? "They ate udon with forks" vs. "They ate udon with chicken" (same surface structure, but "with forks" attaches to the verb while "with chicken" attaches to the noun).

Standard Recursive Neural Net (diagram): for the sentence "I like green eggs", pairs of child vectors (e.g. Vector(I) and Vector(like)) are combined by a single shared matrix W_Main into a parent vector (e.g. Vector(I-like)), and each parent node gets a score (and possibly a classifier output). The same W_Main is reused at every merge, e.g. when combining Vector(I-like) with Vector(green).

Standard Recursive Neural Net
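The composition step these two slides illustrate can be written in a few lines. This is a minimal sketch, not the paper's code; the dimensionality and the names `W_main`, `v_score` are illustrative assumptions.

```python
import numpy as np

d = 50                                      # word/phrase vector dimension (assumed)
W_main = 0.01 * np.random.randn(d, 2 * d)   # single composition matrix shared by all nodes
b = np.zeros(d)
v_score = 0.01 * np.random.randn(d)         # scoring vector applied at every node

def compose(left, right):
    """Merge two child vectors into a parent vector of the same size, plus a score."""
    parent = np.tanh(W_main @ np.concatenate([left, right]) + b)
    score = float(v_score @ parent)          # how plausible is this merge?
    return parent, score

# e.g. merge Vector(I) and Vector(like) into Vector(I-like)
vec_I, vec_like = np.random.randn(d), np.random.randn(d)
vec_I_like, s = compose(vec_I, vec_like)
```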

Syntactically Untied RNN (diagram): same sentence "I like green eggs", but the composition matrix now depends on the syntactic categories of the two children: W_{N,V} combines Vector(I) and Vector(like), while W_{Adj,N} combines Vector(green) and Vector(eggs); each node again gets a score and a classifier. The word-level categories (N, V, Adj, N) come from first parsing the lower level with a PCFG.

Syntactically Untied RNN
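For the syntactically untied version, the only change to the sketch above is that the composition matrix is looked up by the children's syntactic categories (taken from the PCFG parse). The dictionary keys below are illustrative, not an exhaustive list.

```python
# Reuses numpy, d, b, and v_score from the previous sketch.
# One composition matrix per ordered pair of child syntactic categories.
W_untied = {
    ('N', 'V'):   0.01 * np.random.randn(d, 2 * d),
    ('Adj', 'N'): 0.01 * np.random.randn(d, 2 * d),
    # ... one entry per (left_category, right_category) pair seen in training
}

def compose_untied(left_vec, left_cat, right_vec, right_cat):
    """Look up the matrix by the children's categories, then compose as before."""
    W_pair = W_untied[(left_cat, right_cat)]
    parent = np.tanh(W_pair @ np.concatenate([left_vec, right_vec]) + b)
    score = float(v_score @ parent)
    return parent, score
```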

Examples: Composition Matrices. Notice that he initializes them with two identity matrices (in the absence of other information, the parent should simply be the average of its children).
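That initialization can be sketched as follows; the exact scaling and noise level are assumptions, but the idea is that a freshly initialized node starts out as the average of its children.

```python
def init_composition_matrix(d, noise=0.001):
    """Start W as two scaled identity blocks, so initially parent ≈ average(left, right)."""
    I = np.eye(d)
    return np.hstack([0.5 * I, 0.5 * I]) + noise * np.random.randn(d, 2 * d)
```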

Learning the Weights (for the logistic classifier over each node's input vector)
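A rough sketch of what such a node-level logistic (softmax) classifier and its loss might look like; the label set size and the name `W_label` are invented for illustration.

```python
n_labels = 25                                 # e.g. number of syntactic categories (illustrative)
W_label = 0.01 * np.random.randn(n_labels, d)

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def label_distribution(node_vec):
    """Softmax (multiclass logistic) over labels, given a node's vector as input."""
    return softmax(W_label @ node_vec)

def label_loss(node_vec, gold_label):
    """Cross-entropy loss whose gradient is used to learn the weights."""
    return -np.log(label_distribution(node_vec)[gold_label])
```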

Tricks

Learning the Tree

Finding the Best Tree (inference): we want the parse tree with the maximum score (the sum of the scores of all its subtrees). Trying every combination is too expensive. Trick: use a fast non-RNN method (a PCFG with the CKY algorithm) to select the best ~200 candidate trees, then beam-search / rerank those candidates with the RNN.
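A rough sketch of this inference step as described on the slide: score each candidate tree as the sum of RNN scores over its internal nodes and keep the best one. `pcfg_top_k_trees`, `leaf_vector`, and the `node` attributes are hypothetical helpers, not the paper's code; `compose` is the earlier sketch.

```python
def tree_score(node):
    """Return (vector, summed RNN score) over all internal nodes of a candidate tree."""
    if node.is_leaf():
        return leaf_vector(node.word), 0.0            # word vector, no merge score
    (lvec, lscore), (rvec, rscore) = (tree_score(c) for c in node.children)
    parent, s = compose(lvec, rvec)                   # RNN composition + node score
    return parent, s + lscore + rscore

def best_tree(sentence, k=200):
    """Rerank the top-k trees from a fast PCFG/CKY parser by their RNN score."""
    candidates = pcfg_top_k_trees(sentence, k)        # hypothetical k-best CKY helper
    return max(candidates, key=lambda t: tree_score(t)[1])
```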

Model Comparisons (WSJ dataset): table of F1 scores for parse labels, comparing Socher's model against other parsers.

Analysis of Errors

Conclusions:

The model in this paper has (probably) been eclipsed by the Recursive Neural Tensor Network (RNTN). Subsequent work showed that the RNTN performed better (in some settings) than the SU-RNN.