Word embeddings based mapping

Word embeddings based mapping
Raymond ZHAO Wenlong (updated 15/08/2018)

Word embeddings
- Vector space models represent words as low-dimensional, fixed-size vectors.
- They group semantically similar words and encode rich linguistic patterns (e.g. word2vec (Mikolov et al., 2013) or GloVe (Pennington et al., 2014)).
- To apply a vector model to a sentence or document, one must select an appropriate composition function (a minimal sketch follows).
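To make the composition idea concrete, here is a minimal sketch (not from the presentation) that averages word vectors into a sentence vector. The toy 4-dimensional embeddings are invented for illustration; real vectors would come from pre-trained word2vec or GloVe files.

import numpy as np

# Toy 4-dimensional embeddings, invented for illustration only.
embeddings = {
    "the":   np.array([0.1, 0.3, -0.2, 0.0]),
    "movie": np.array([0.7, -0.1, 0.4, 0.2]),
    "was":   np.array([0.0, 0.2, 0.1, -0.3]),
    "great": np.array([0.9, 0.5, -0.4, 0.6]),
}

def sentence_vector(tokens, emb):
    # One possible composition function g: average the vectors of known words.
    vectors = [emb[t] for t in tokens if t in emb]
    return np.mean(vectors, axis=0)

print(sentence_vector(["the", "movie", "was", "great"], embeddings))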

A typical NN model
- A composition function g plus a classifier on the final representation. A composition function is a mathematical process for combining multiple word vectors into a single vector.
- Unordered functions: treat input texts as bags of word embeddings.
- Syntactic functions: take word order and sentence structure into account (e.g. CNN/RNN; for recursive networks, g depends on a parse tree of the input sequence). They require more training time on huge datasets, e.g. a recursive NN has to compute over a syntactic parse tree.
- A deep unordered model: apply a composition function g to the sequence of word embeddings Vw; the output is a vector z that serves as input to a logistic regression function (see the sketch below).
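As a rough illustration of a deep unordered model (not the presenter's code), the Keras sketch below averages the word embeddings of a text, passes the result through a hidden layer, and feeds it to a softmax classifier. The vocabulary size, dimensions, and class count are placeholder values, and padding handling is omitted for brevity.

from tensorflow.keras import Input, Model, layers

vocab_size, embed_dim, max_len, num_classes = 20000, 100, 200, 5   # placeholder sizes

inputs = Input(shape=(max_len,), dtype="int32")            # padded word-index sequences
x = layers.Embedding(vocab_size, embed_dim)(inputs)        # word embeddings Vw
x = layers.GlobalAveragePooling1D()(x)                     # unordered composition g: average
x = layers.Dense(128, activation="relu")(x)                # the "deep" part of the model
outputs = layers.Dense(num_classes, activation="softmax")(x)  # logistic-regression-style classifier
model = Model(inputs, outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()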

SWEM model: a deep unordered model
- By Duke University (ACL 2018); source code is on GitHub.
- Obtains near state-of-the-art accuracies on sentence- and document-level tasks.

Paper's results: document-level classification
- Datasets: Yahoo! Answers and AG News.
- The SWEM model exhibits stronger performance than both LSTM and CNN compositional architectures.
- It marries the speed of unordered functions with the accuracy of syntactic functions.
- Computationally efficient: fewer parameters.

Paper's results: sentence-level tasks
- SWEM yields inferior accuracies on sentence-level tasks, where inputs are only about 20 words long on average, so word order carries more weight.

Simple word-embedding model (SWEM) variants
- SWEM-aver: takes the whole sequence into account via averaging (uses the information of every word).
- SWEM-max: max pooling over each dimension, extracting the most salient features (the information of key words).
- SWEM-concat: concatenation of the SWEM-aver and SWEM-max representations.
- SWEM-hier: SWEM-aver over each local window, then global max-pooling across windows (similar to n-grams). A sketch of the four variants follows.
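The following numpy sketch (illustrative only) shows the four pooling variants on a toy word-vector matrix; the random matrix stands in for real pre-trained embeddings and the window size is an arbitrary choice.

import numpy as np

E = np.random.randn(20, 50)                # toy data: 20 words, 50-dim embeddings

swem_aver = E.mean(axis=0)                 # SWEM-aver: average over all words
swem_max = E.max(axis=0)                   # SWEM-max: max-pool each dimension (salient features)
swem_concat = np.concatenate([swem_aver, swem_max])   # SWEM-concat: both views side by side

def swem_hier(E, window=5):
    # SWEM-hier: average within each local window, then global max-pooling across windows.
    windows = [E[i:i + window].mean(axis=0) for i in range(len(E) - window + 1)]
    return np.max(windows, axis=0)

print(swem_aver.shape, swem_max.shape, swem_concat.shape, swem_hier(E).shape)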

The experiments: SWEM-aver baseline using Keras
- Data: our Amazon review texts (830k texts, 21k tokens).
- Pre-trained GloVe word embeddings (trained on a 1B-token dataset).
- Current baseline model: multi-class logistic regression on Keras/TensorFlow, with activation='sigmoid' and loss='categorical_crossentropy'.
- Current accuracy on CPU classification: 0.6106. (A sketch of such a baseline follows.)
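A minimal sketch of a SWEM-aver baseline of the kind described above: pre-trained GloVe vectors fill a frozen Embedding layer, word vectors are averaged, and a single dense layer acts as the multi-class logistic regression. The GloVe file name, word index, and sizes are placeholders, not the actual experimental setup; the slide's sigmoid activation is kept, although softmax is the more common choice for multi-class outputs.

import numpy as np
from tensorflow.keras import Input, Model, layers, initializers

embed_dim, max_len, num_classes = 100, 200, 5               # placeholder sizes
word_index = {"laptop": 1, "screen": 2, "fast": 3}          # hypothetical tokenizer output

# Load pre-trained GloVe vectors (assumed file name) into an embedding matrix.
embedding_matrix = np.zeros((len(word_index) + 1, embed_dim))
with open("glove.6B.100d.txt", encoding="utf-8") as f:
    for line in f:
        parts = line.split()
        word, vec = parts[0], np.asarray(parts[1:], dtype="float32")
        if word in word_index:
            embedding_matrix[word_index[word]] = vec

inputs = Input(shape=(max_len,), dtype="int32")
x = layers.Embedding(len(word_index) + 1, embed_dim,
                     embeddings_initializer=initializers.Constant(embedding_matrix),
                     trainable=False)(inputs)                 # frozen pre-trained embeddings
x = layers.GlobalAveragePooling1D()(x)                        # SWEM-aver composition
outputs = layers.Dense(num_classes, activation="sigmoid")(x)  # per the slide's stated configuration
model = Model(inputs, outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])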

The experiments: SWEM-aver algorithm using Keras
- Current experiments on the RAM, Screen Size, Hard Disk, and Graphics Coprocessor configurator attributes.

The experiments: SWEM-max algorithm
- Current accuracy on Hard Disk: 0.3380 (to be improved).
- How to configure the labels is very important; objective: obtain labels from the field experts (TODO).

The experiments: TODO
- Data preprocessing: stemming, removing punctuation and stop words (a sketch follows).
- Try the SWEM-concat algorithm.
- Try to learn task-specific embeddings.
- Try a topic model for short texts.
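For the preprocessing items above, a possible starting point (not the planned pipeline) using NLTK might look like the sketch below; the sample review text is made up, and the punkt and stopwords NLTK data are assumed to be downloaded.

import string
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize

stop_words = set(stopwords.words("english"))
stemmer = PorterStemmer()

def preprocess(text):
    tokens = word_tokenize(text.lower())
    tokens = [t for t in tokens if t not in string.punctuation]   # drop punctuation
    tokens = [t for t in tokens if t not in stop_words]           # drop stop words
    return [stemmer.stem(t) for t in tokens]                      # stem the remaining words

print(preprocess("The graphics coprocessor is surprisingly fast for gaming."))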

Thanks
Thanks to Dr. Wong.