Semantic Compositionality through Recursive Matrix-Vector Spaces

Presentation transcript:

Semantic Compositionality through Recursive Matrix-Vector Spaces
Course project for CS365A, by Sonu Agarwal and Viveka Kulharia

Goal
Classifying semantic relationships such as "cause-effect" or "component-whole" between nouns.
Examples:
"The introduction in the book is a summary of what is in the text." → Component-Whole
"The radiation from the atomic bomb explosion is a typical acute radiation." → Cause-Effect

Parse Tree Image created using www.draw.io

Binary Parse Tree Image created using www.draw.io

What's Novel?
We introduce a recursive neural network model (MV-RNN) that learns compositional vector representations of phrases and sentences of arbitrary length or syntactic type.
We assign a vector and a matrix to every node in the parse tree: the vector captures the inherent meaning of the word, while the matrix captures how the word modifies neighboring words.
A representation for a longer phrase is computed in a bottom-up manner by recursively combining the children according to the syntactic structure of the parse tree.
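As a rough sketch (not the authors' code), the bottom-up combination of (vector, matrix) pairs over a binary parse tree could look like the following; the dimensionality, the parameter shapes, and the tanh nonlinearity are illustrative assumptions:

```python
import numpy as np

n = 4  # toy word-vector dimensionality (illustrative choice)
rng = np.random.default_rng(0)

# Global composition parameters (hypothetical shapes): W mixes the two
# modified child vectors; W_M maps the stacked child matrices to a parent matrix.
W = rng.normal(scale=0.1, size=(n, 2 * n))
W_M = rng.normal(scale=0.1, size=(n, 2 * n))

def compose(a, A, b, B):
    """Combine children (a, A) and (b, B) into a parent (p, P).

    Each child carries a vector (its meaning) and a matrix (how it modifies
    neighbors): the parent vector mixes Ba and Ab through a nonlinearity,
    and the parent matrix is a linear map of the stacked child matrices.
    """
    p = np.tanh(W @ np.concatenate([B @ a, A @ b]))  # parent vector, shape (n,)
    P = W_M @ np.vstack([A, B])                      # parent matrix, shape (n, n)
    return p, P

# Bottom-up pass over the binary parse ((very good) movie):
vecs = {w: rng.normal(size=n) for w in ["very", "good", "movie"]}
mats = {w: np.eye(n) + 0.01 * rng.normal(size=(n, n)) for w in vecs}

p1, P1 = compose(vecs["very"], mats["very"], vecs["good"], mats["good"])
p2, P2 = compose(p1, P1, vecs["movie"], mats["movie"])
```

Because every parent is again a (vector, matrix) pair, the same `compose` step applies at every internal node, whatever the sentence length.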

Recursive Matrix-Vector Model Image Source: http://www.socher.org/index.php/Main/SemanticCompositionalityThroughRecursiveMatrix-VectorSpaces

Training
Initialize all the word vectors with pre-trained n-dimensional word vectors.
Initialize every word matrix as X = I + ε, where I is the identity matrix and ε is small Gaussian noise.
Combining two words (a, A) and (b, B): the parent vector is p = g(W[Ba; Ab]) and the parent matrix is P = W_M[A; B].
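A minimal sketch of the initialization step, with made-up dimensions and noise scale; `eps` is an assumed hyperparameter:

```python
import numpy as np

n = 3
rng = np.random.default_rng(42)

# In practice the word vector comes from pre-trained n-dimensional
# embeddings; a random stand-in is used here.
a = rng.normal(size=n)

# Word matrix initialized as X = I + eps * noise: identity plus small
# Gaussian noise, so at the start each word barely modifies its neighbors.
eps = 0.01
X = np.eye(n) + eps * rng.normal(size=(n, n))

# Initially X acts almost like the identity on a neighboring word vector.
modified = X @ a
```

Starting near the identity means early training behaves like a plain vector-composition model, and each word's modifying effect is learned gradually.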

Training
We train the vector representations by adding a softmax classifier on top of each parent node to predict a class distribution over sentiment or relationship classes.
Error function E(s, t): the sum of the cross-entropy errors at all nodes, where s is the sentence and t is its tree.
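The per-node softmax and the summed cross-entropy error can be sketched as follows; the number of classes, vector dimension, and gold labels are toy assumptions:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over class scores."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def node_cross_entropy(W_label, p, target):
    """Cross-entropy error of the softmax classifier at one node vector p."""
    probs = softmax(W_label @ p)
    return -np.log(probs[target])

# Toy setup: 3 relation classes, node vectors of dimension 4 (both made up).
rng = np.random.default_rng(1)
W_label = rng.normal(size=(3, 4))
node_vectors = [rng.normal(size=4) for _ in range(2)]  # e.g. two tree nodes
targets = [0, 2]                                       # gold class per node

# E(s, t): sum of the cross-entropy errors over all nodes of the tree.
E = sum(node_cross_entropy(W_label, p, y) for p, y in zip(node_vectors, targets))
```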

Learning
Model parameters: θ = (W, W_M, W_label, L, L_M), where L and L_M are the sets of word vectors and word matrices.
Objective function: J(θ) = (1/N) Σ E(x, t; θ) + λ‖θ‖², where E is the cross-entropy error and λ is the regularization parameter.
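The regularized objective is a mean error plus a squared-norm penalty; here is a toy computation with purely illustrative parameter shapes, per-example errors, and λ:

```python
import numpy as np

rng = np.random.default_rng(2)

# theta stands in for all model parameters (W, W_M, W_label and the word
# vectors/matrices); the shapes here are purely illustrative.
theta = [rng.normal(size=s) for s in [(4, 8), (4, 8), (3, 4)]]
lam = 1e-4  # regularization parameter lambda (hypothetical value)

# Made-up per-example cross-entropy errors E(x, t; theta).
errors = [1.2, 0.7, 0.9]

# J(theta) = (1/N) * sum of E(x, t; theta) + lambda * ||theta||^2
J = np.mean(errors) + lam * sum(np.sum(p ** 2) for p in theta)
```

The λ‖θ‖² term shrinks all parameters toward zero, which for the word matrices means staying close to their near-identity initialization.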

Classification of Semantic Relationships Image Source: reference [1]

Results
Dataset: SemEval 2010 Task 8
Accuracy (computed from the confusion matrix) = 2094/2717 = 77.07%
F1 score = 82.51%
Code source: http://www.socher.org/index.php/Main/SemanticCompositionalityThroughRecursiveMatrix-VectorSpaces
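The accuracy figure is the trace of the confusion matrix over its total count; a tiny example with made-up counts shows the same computation the slide applies to the 2094-of-2717 SemEval matrix:

```python
import numpy as np

# Accuracy from a confusion matrix: correct predictions lie on the diagonal.
# 3-class example with made-up counts (rows = gold class, cols = predicted).
conf = np.array([[50, 3, 2],
                 [4, 40, 6],
                 [1, 5, 39]])

accuracy = np.trace(conf) / conf.sum()
```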

References
[1] Richard Socher, Brody Huval, Christopher D. Manning, and Andrew Y. Ng. Semantic Compositionality through Recursive Matrix-Vector Spaces. EMNLP 2012 (oral).
[2] J. Mitchell and M. Lapata. Composition in Distributional Models of Semantics. Cognitive Science, 34 (2010): 1388–1429.
[3] Kazuma Hashimoto, Makoto Miwa, Yoshimasa Tsuruoka, and Takashi Chikayama. Simple Customization of Recursive Neural Networks for Semantic Relation Classification. EMNLP 2013.