Tutorial: word2vec
Yang-de Chen yongde0108@gmail.com

Download & Compile
word2vec: https://code.google.com/p/word2vec/
Install subversion (svn): sudo apt-get install subversion
Download word2vec: svn checkout http://word2vec.googlecode.com/svn/trunk/
Compile: make

CBOW and Skip-gram
CBOW stands for "continuous bag-of-words"; it predicts the current word from its surrounding context, while skip-gram predicts the surrounding context from the current word.
Both are shallow networks: a single linear projection layer, with no non-linear hidden layer.
Reference: Efficient Estimation of Word Representations in Vector Space by Tomas Mikolov, et al.
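As a rough sketch of the difference (plain Python, illustrative only, not the actual word2vec code), here is how training pairs are formed from the example sentence of the following slides, with window = 1:

# Illustrative sketch: training pairs built by CBOW vs. skip-gram.
sentence = ["謝謝", "學長", "祝", "學長", "研究", "順利"]
window = 1

cbow_pairs = []       # (context words, target word)
skipgram_pairs = []   # (input word, one context word)
for i, target in enumerate(sentence):
    context = sentence[max(0, i - window):i] + sentence[i + 1:i + 1 + window]
    cbow_pairs.append((context, target))      # CBOW: predict the word from its context
    for c in context:
        skipgram_pairs.append((target, c))    # skip-gram: predict each context word from the word

print(cbow_pairs[1])        # (['謝謝', '祝'], '學長')
print(skipgram_pairs[:3])   # [('謝謝', '學長'), ('學長', '謝謝'), ('學長', '祝')]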

Represent words as vectors
Example sentence: 謝謝 學長 祝 學長 研究 順利 (roughly: "Thank you, senior; wishing you smooth research")
Vocabulary: [謝謝, 學長, 祝, 研究, 順利]
One-hot vector of 學長: [0 1 0 0 0]
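In Python, the same construction might look like this (a sketch; the vocabulary keeps words in order of first occurrence, as on the slide):

# Build the vocabulary and the one-hot vector of a word.
sentence = ["謝謝", "學長", "祝", "學長", "研究", "順利"]
vocab = list(dict.fromkeys(sentence))   # [謝謝, 學長, 祝, 研究, 順利]

def one_hot(word, vocab):
    vec = [0] * len(vocab)
    vec[vocab.index(word)] = 1
    return vec

print(one_hot("學長", vocab))   # [0, 1, 0, 0, 0]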

Example of CBOW (window = 1)
謝謝 學長 祝 學長 研究 順利
Input (the window-1 context of the first 學長, i.e. 謝謝 and 祝): [1 0 1 0 0]
Target: [0 1 0 0 0]
Projection matrix × input vector = vector(謝謝) + vector(祝). With a toy 3×3 projection matrix whose columns are word vectors:

$$\begin{pmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 9 \end{pmatrix} \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} = 1\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} + 0\begin{pmatrix} 4 \\ 5 \\ 6 \end{pmatrix} + 1\begin{pmatrix} 7 \\ 8 \\ 9 \end{pmatrix} = \begin{pmatrix} 8 \\ 10 \\ 12 \end{pmatrix}$$
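The same projection step in numpy (a sketch of the toy numbers above; note that the original CBOW implementation averages the context vectors rather than just summing them):

import numpy as np

# Multiplying the projection matrix by the multi-hot context vector simply
# sums the columns belonging to the context words.
P = np.array([[1, 4, 7],
              [2, 5, 8],
              [3, 6, 9]])
context = np.array([1, 0, 1])               # words 1 and 3 are in the context

hidden = P @ context                        # [ 8 10 12]
assert (hidden == P[:, 0] + P[:, 2]).all()  # = vector(word 1) + vector(word 3)
print(hidden)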

Training
word2vec
-train <training-data>
-output <filename>
-window <window-size>
-cbow <0 = skip-gram, 1 = CBOW>
-size <vector-size>
-binary <0 = text, 1 = binary>
-iter <iteration-number>
Example:
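For instance (the file names and parameter values here are illustrative, not the ones from the original slide):
./word2vec -train corpus.txt -output vectors.txt -window 5 -cbow 1 -size 100 -binary 0 -iter 5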

Play with word vectors
distance <output-vector>: find words related to a query word
word-analogy <output-vector>: analogy task, e.g. man → king, woman → ?
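Both tools rank the vocabulary by cosine similarity to a query vector. A rough Python sketch of the idea (names are illustrative, not the actual C implementation):

import numpy as np

# Rank vocabulary words by cosine similarity to a query vector.
def most_similar(query_vec, vocab, vectors, exclude=(), topn=5):
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sims = unit @ (query_vec / np.linalg.norm(query_vec))
    ranked = np.argsort(-sims)
    return [(vocab[i], float(sims[i])) for i in ranked if vocab[i] not in exclude][:topn]

# distance:     most_similar(vectors[vocab.index("dog")], vocab, vectors)
# word-analogy: most_similar(vectors[vocab.index("king")] - vectors[vocab.index("man")]
#                            + vectors[vocab.index("woman")],
#                            vocab, vectors, exclude=("king", "man", "woman"))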

Data: https://www.dropbox.com/s/tnp0wevr3u59ew8/data.tar.gz?dl=0

Results

Other Results

Analogy

Analogy

Advanced Stuff – Phrase Vectors
Sometimes you want to treat "New Zealand" as one word. If two words frequently occur together, we join them with an underscore and treat them as a single token, e.g. New_Zealand.
How do we decide? word2phrase scores each pair of adjacent words; if the score exceeds a threshold, the underscore is added.
word2phrase -train <word-doc> -output <phrase-doc> -threshold 100
Reference: Distributed Representations of Words and Phrases and their Compositionality by Tomas Mikolov, et al.
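A rough Python sketch of the scoring idea, based on the formula in the referenced paper, score(a, b) = (count(a b) − δ) / (count(a) · count(b)); the token data, δ and the threshold here are illustrative, and the actual word2phrase C code differs in details such as minimum counts and scaling:

from collections import Counter

# Score adjacent word pairs; high-scoring pairs get joined with an underscore.
def phrase_scores(tokens, delta=1):
    unigram = Counter(tokens)
    bigram = Counter(zip(tokens, tokens[1:]))
    return {(a, b): (count - delta) / (unigram[a] * unigram[b])
            for (a, b), count in bigram.items()}

tokens = "i love new zealand and new zealand loves me".split()
scores = phrase_scores(tokens, delta=1)
print([f"{a}_{b}" for (a, b), s in scores.items() if s > 0.2])   # ['new_zealand']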

Advanced Stuff – Negative Sampling
Objective:
$$\sum_{(w_t,\,c_t)} \sum_{c \in c_t} \log \sigma\!\left(\mathrm{vec}(w_t)^{\top}\mathrm{vec}(c)\right) \;-\; \sum_{(w_t,\,c_t')} \sum_{c' \in c_t'} \log \sigma\!\left(\mathrm{vec}(w_t)^{\top}\mathrm{vec}(c')\right)$$
where $w_t$ is a word, $c_t$ its observed context, and $c_t'$ a set of randomly sampled (negative) contexts.
Negative contexts are drawn from $P_n(w) = \dfrac{\mathrm{Unigram}(w)^{0.75}}{Z}$.
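A minimal numpy sketch of one negative-sampling step for a single (word, context) pair, using the formulation from the Mikolov et al. paper in which negative samples appear inside σ with a minus sign; all names, sizes and the random data are illustrative assumptions:

import numpy as np

# Maximize log sigma(w·c) for the observed context and log sigma(-w·c') for k
# noise contexts drawn from P_n(w) = Unigram(w)^0.75 / Z; the loss below is the
# negative of that objective.
rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

vocab_size, dim, k = 1000, 100, 5
W = rng.normal(scale=0.1, size=(vocab_size, dim))   # word vectors
C = rng.normal(scale=0.1, size=(vocab_size, dim))   # context vectors

unigram_counts = rng.integers(1, 100, size=vocab_size)
p_n = unigram_counts ** 0.75
p_n = p_n / p_n.sum()                               # noise distribution P_n(w)

w, c = 3, 7                                         # an observed (word, context) pair
negatives = rng.choice(vocab_size, size=k, p=p_n)   # k sampled noise contexts

loss = -np.log(sigmoid(W[w] @ C[c])) - np.log(sigmoid(-C[negatives] @ W[w])).sum()
print(loss)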