Word embeddings based mapping

Word embeddings based mapping
Raymond ZHAO Wenlong (updated 15/08/2018)

Word embeddings
- Vector space models represent words as low-dimensional, fixed-size vectors.
- They group semantically similar words and encode rich linguistic patterns (e.g. word2vec (Mikolov et al., 2013) or GloVe (Pennington et al., 2014)).
- To apply a vector model to a sentence or document, one must select an appropriate composition function (a minimal sketch follows).
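To make the composition idea concrete, here is a minimal sketch (not from the presentation) that averages word vectors into a sentence vector. The toy 4-dimensional embeddings are invented for illustration; real vectors would come from pre-trained word2vec or GloVe files.

import numpy as np

# Toy 4-dimensional embeddings, invented for illustration only.
embeddings = {
    "the":   np.array([0.1, 0.3, -0.2, 0.0]),
    "movie": np.array([0.7, -0.1, 0.4, 0.2]),
    "was":   np.array([0.0, 0.2, 0.1, -0.3]),
    "great": np.array([0.9, 0.5, -0.4, 0.6]),
}

def sentence_vector(tokens, emb):
    # One possible composition function g: average the vectors of known words.
    vectors = [emb[t] for t in tokens if t in emb]
    return np.mean(vectors, axis=0)

print(sentence_vector(["the", "movie", "was", "great"], embeddings))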

A typical NN model
- A composition function g plus a classifier on the final representation. A composition function is a mathematical process for combining multiple word vectors into a single vector.
- Unordered functions: treat input texts as bags of word embeddings.
- Syntactic functions: take word order and sentence structure into account (e.g. CNN/RNN; for recursive networks, g depends on a parse tree of the input sequence). They require more training time on huge datasets, e.g. a recursive NN has to compute over a syntactic parse tree.
- A deep unordered model: apply a composition function g to the sequence of word embeddings Vw; the output is a vector z that serves as input to a logistic regression function (see the sketch below).
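As a rough illustration of a deep unordered model (not the presenter's code), the Keras sketch below averages the word embeddings of a text, passes the result through a hidden layer, and feeds it to a softmax classifier. The vocabulary size, dimensions, and class count are placeholder values, and padding handling is omitted for brevity.

from tensorflow.keras import Input, Model, layers

vocab_size, embed_dim, max_len, num_classes = 20000, 100, 200, 5   # placeholder sizes

inputs = Input(shape=(max_len,), dtype="int32")            # padded word-index sequences
x = layers.Embedding(vocab_size, embed_dim)(inputs)        # word embeddings Vw
x = layers.GlobalAveragePooling1D()(x)                     # unordered composition g: average
x = layers.Dense(128, activation="relu")(x)                # the "deep" part of the model
outputs = layers.Dense(num_classes, activation="softmax")(x)  # logistic-regression-style classifier
model = Model(inputs, outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.summary()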

SWEM model: a deep unordered model
- By Duke University (ACL 2018); source code is on GitHub.
- Obtains near state-of-the-art accuracies on sentence- and document-level tasks.

Paper's results: document-level classification
- Datasets: Yahoo! Answers and AG News.
- The SWEM model exhibits stronger performance than both LSTM and CNN compositional architectures.
- It marries the speed of unordered functions with the accuracy of syntactic functions.
- Computationally efficient: fewer parameters.

Paper's results: sentence-level tasks
- SWEM yields inferior accuracies on sentence-level tasks, where inputs are only about 20 words long on average, so word order carries more weight.

Simple word-embedding model (SWEM) variants
- SWEM-aver: takes the whole sequence into account via averaging (uses the information of every word).
- SWEM-max: max pooling over each dimension, extracting the most salient features (the information of key words).
- SWEM-concat: concatenation of the SWEM-aver and SWEM-max representations.
- SWEM-hier: SWEM-aver over each local window, then global max-pooling across windows (similar to n-grams). A sketch of the four variants follows.
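The following numpy sketch (illustrative only) shows the four pooling variants on a toy word-vector matrix; the random matrix stands in for real pre-trained embeddings and the window size is an arbitrary choice.

import numpy as np

E = np.random.randn(20, 50)                # toy data: 20 words, 50-dim embeddings

swem_aver = E.mean(axis=0)                 # SWEM-aver: average over all words
swem_max = E.max(axis=0)                   # SWEM-max: max-pool each dimension (salient features)
swem_concat = np.concatenate([swem_aver, swem_max])   # SWEM-concat: both views side by side

def swem_hier(E, window=5):
    # SWEM-hier: average within each local window, then global max-pooling across windows.
    windows = [E[i:i + window].mean(axis=0) for i in range(len(E) - window + 1)]
    return np.max(windows, axis=0)

print(swem_aver.shape, swem_max.shape, swem_concat.shape, swem_hier(E).shape)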

The experiments: SWEM-aver baseline using Keras
- Data: our Amazon review texts (830k texts, 21k tokens).
- Pre-trained GloVe word embeddings (trained on a 1B-token dataset).
- Current baseline model: multi-class logistic regression on Keras/TensorFlow, with activation='sigmoid' and loss='categorical_crossentropy'.
- Current accuracy on CPU classification: 0.6106. (A sketch of such a baseline follows.)
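A minimal sketch of a SWEM-aver baseline of the kind described above: pre-trained GloVe vectors fill a frozen Embedding layer, word vectors are averaged, and a single dense layer acts as the multi-class logistic regression. The GloVe file name, word index, and sizes are placeholders, not the actual experimental setup; the slide's sigmoid activation is kept, although softmax is the more common choice for multi-class outputs.

import numpy as np
from tensorflow.keras import Input, Model, layers, initializers

embed_dim, max_len, num_classes = 100, 200, 5               # placeholder sizes
word_index = {"laptop": 1, "screen": 2, "fast": 3}          # hypothetical tokenizer output

# Load pre-trained GloVe vectors (assumed file name) into an embedding matrix.
embedding_matrix = np.zeros((len(word_index) + 1, embed_dim))
with open("glove.6B.100d.txt", encoding="utf-8") as f:
    for line in f:
        parts = line.split()
        word, vec = parts[0], np.asarray(parts[1:], dtype="float32")
        if word in word_index:
            embedding_matrix[word_index[word]] = vec

inputs = Input(shape=(max_len,), dtype="int32")
x = layers.Embedding(len(word_index) + 1, embed_dim,
                     embeddings_initializer=initializers.Constant(embedding_matrix),
                     trainable=False)(inputs)                 # frozen pre-trained embeddings
x = layers.GlobalAveragePooling1D()(x)                        # SWEM-aver composition
outputs = layers.Dense(num_classes, activation="sigmoid")(x)  # per the slide's stated configuration
model = Model(inputs, outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])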

The experiments: SWEM-aver algorithm using Keras
- Current experiments on the RAM, Screen Size, Hard Disk, and Graphics Coprocessor configurator attributes.

The experiments: SWEM-max algorithm
- Current accuracy on Hard Disk: 0.3380 (to be improved).
- How to configure the labels is very important; objective: obtain labels from the field experts (TODO).

The experiments: TODO
- Data preprocessing: stemming, removing punctuation and stop words (a sketch follows).
- Try the SWEM-concat algorithm.
- Try to learn task-specific embeddings.
- Try a topic model for short texts.
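For the preprocessing items above, a possible starting point (not the planned pipeline) using NLTK might look like the sketch below; the sample review text is made up, and the punkt and stopwords NLTK data are assumed to be downloaded.

import string
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize

stop_words = set(stopwords.words("english"))
stemmer = PorterStemmer()

def preprocess(text):
    tokens = word_tokenize(text.lower())
    tokens = [t for t in tokens if t not in string.punctuation]   # drop punctuation
    tokens = [t for t in tokens if t not in stop_words]           # drop stop words
    return [stemmer.stem(t) for t in tokens]                      # stem the remaining words

print(preprocess("The graphics coprocessor is surprisingly fast for gaming."))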

Thanks
Thanks to Dr. Wong.