Short Text Similarity with Word Embeddings Date: 2016/03/28 Author: Tom Kenter, Maarten de Rijke Source: CIKM’15 Advisor: Jia-Ling Koh Speaker: Chih-Hsuan Tzang

Outline Introduction Method Experiment Conclusion

Introduction Motivation › Determining semantic similarity between texts is important in many information retrieval tasks, such as search, query suggestion, automatic summarization, and image finding. › Many approaches have been suggested, based on lexical matching, handcrafted patterns, syntactic parse trees, external sources of structured semantic knowledge, and distributional semantics.

Introduction Motivation › Lexical features, like string matching, do not capture semantic similarity beyond a trivial level. › Handcrafted patterns and external sources of structured semantic knowledge cannot be assumed to be available in all circumstances and for all domains. › Approaches that depend on parse trees are restricted to syntactically well-formed texts, typically a single sentence in length.

Introduction Purpose › We aim for a generic model that requires no prior knowledge of natural language (such as parse trees) and no external resources of structured semantic information. › We go from word-level to short-text-level semantics by combining insights from methods based on external sources of semantic knowledge with word embeddings.

Introduction Purpose › Recent developments in distributional semantics, in particular neural-network-based approaches, require only a large amount of unlabelled text data. This data is used to create a so-called semantic space. › Terms are represented in this semantic space as vectors, called word embeddings.
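As a concrete illustration (not from the slides): a minimal Python sketch using gensim, assuming a pretrained word2vec model file is available; the file path is a placeholder.

from gensim.models import KeyedVectors

# Load pretrained embeddings; the path is an assumption, not from the paper.
kv = KeyedVectors.load_word2vec_format("word2vec.bin", binary=True)

# A term is a vector in the semantic space (its word embedding) ...
vec = kv["summarization"]

# ... and similarity between terms is measured geometrically, e.g. by cosine.
print(kv.similarity("search", "retrieval"))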

Outline Introduction Method Experiment Conclusion

Method Input: sentence pairs. From word-level to short-text-level semantics: 1. Saliency-weighted semantic network 2. Unweighted semantic network. Text-level features: 1. Distance between vector means 2. Bins of dimensions

Method Saliency-weighted semantic network › We want a way of taking into account how the terms of one short text are distributed in the semantic space compared to the terms of the other text. › The semantic text similarity (sts) is:
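The formula itself was an image on the slide and is missing from the transcript; reconstructed from the paper's BM25-inspired definition (k_1 and b are the usual BM25 hyperparameters), it reads:

sts(s_l, s_s) = \sum_{w \in s_l} \mathrm{IDF}(w) \cdot \frac{sem(w, s_s) \cdot (k_1 + 1)}{sem(w, s_s) + k_1 \cdot \left(1 - b + b \cdot \frac{|s_s|}{avgsl}\right)}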

Method Saliency-weighted semantic network › s_l: the longer text of the pair › s_s: the shorter text of the pair › avgsl: the average sentence length in the training corpus

Method Saliency-weighted semantic network › The semantic similarity of term w with respect to short text s is represented by sem(w, s). › The function f_sem returns the semantic similarity between two terms.
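The sem formula is likewise missing from the transcript; in the paper it is the maximum term-to-term similarity:

sem(w, s) = \max_{w' \in s} f_{sem}(w, w')

A minimal Python sketch of both functions, assuming f_sem is cosine similarity between embeddings, that IDF values are precomputed, and standard BM25 defaults k1=1.2, b=0.75; the variable names are mine, not the authors'.

import numpy as np

def f_sem(w1, w2, emb):
    # Cosine similarity between the embeddings of two terms.
    v1, v2 = emb[w1], emb[w2]
    return float(np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2)))

def sem(w, s, emb):
    # Semantic similarity of term w with respect to short text s:
    # its best match among the terms of s.
    return max(f_sem(w, w2, emb) for w2 in s)

def sts(s_l, s_s, emb, idf, avgsl, k1=1.2, b=0.75):
    # BM25-style saliency-weighted semantic text similarity.
    score = 0.0
    for w in s_l:
        sim = sem(w, s_s, emb)
        score += idf.get(w, 0.0) * (sim * (k1 + 1)) / (sim + k1 * (1 - b + b * len(s_s) / avgsl))
    return score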

Method Unweighted semantic network › For a short text pair (s_1, s_2), we compute the cosine similarities in the semantic space between all terms in s_1 and all terms in s_2. › This gives us a matrix of similarities between the terms of s_1 and s_2. From this matrix we compute two sets of features. › First, we take all similarities and bin them.

Method Unweighted semantic network › Second, the maximum similarity for every word is computed, and bins are made of these maximum values. › This way, small distances between words end up in the same bin, while outliers end up in a separate bin.
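A sketch of the similarity matrix and both binned feature sets; the bin edges below are illustrative, since the actual edges are not given on the slides.

import numpy as np

def sim_matrix(s1, s2, emb):
    # Cosine similarities between all terms of s1 and all terms of s2.
    m1 = np.array([emb[w] / np.linalg.norm(emb[w]) for w in s1])
    m2 = np.array([emb[w] / np.linalg.norm(emb[w]) for w in s2])
    return m1 @ m2.T  # shape (len(s1), len(s2))

def unweighted_features(s1, s2, emb, edges=(-1.0, 0.25, 0.5, 0.75, 1.0)):
    sims = sim_matrix(s1, s2, emb)
    all_bins = np.histogram(sims.ravel(), bins=edges)[0]      # all similarities, binned
    max_bins = np.histogram(sims.max(axis=1), bins=edges)[0]  # per-word maxima, binned
    return np.concatenate([all_bins, max_bins])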

Method Text-level feature › Distance between vector means: a standard way of combining word embeddings to capture the meaning of a longer piece of text is to take the mean of the individual term vectors. This aggregation over terms gives us one vector per sentence. We calculate both the cosine similarity and the Euclidean distance between the mean vectors for every sentence pair in the test set.
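A sketch of this text-level feature, reusing the emb lookup from the sketches above:

import numpy as np

def mean_vector(s, emb):
    # One vector per sentence: the mean of its term embeddings.
    return np.mean([emb[w] for w in s], axis=0)

def mean_distance_features(s1, s2, emb):
    m1, m2 = mean_vector(s1, emb), mean_vector(s2, emb)
    cos = float(np.dot(m1, m2) / (np.linalg.norm(m1) * np.linalg.norm(m2)))
    euc = float(np.linalg.norm(m1 - m2))
    return cos, euc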

Method Text-level feature › Bins of dimensions: the cosine similarity between two vectors can be interpreted as an aggregation over the per-dimension differences, so it does not capture all information about the similarities or differences between the two vectors. Therefore, we also bin the number of dimensions in which the mean vector of s_1 and the mean vector of s_2 match within certain limits.
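A sketch under the assumption that "match within certain limits" means the absolute per-dimension difference falls below a tolerance; the limits below are illustrative.

import numpy as np

def dimension_bin_features(s1, s2, emb, limits=(0.01, 0.05, 0.1, 0.5)):
    m1 = np.mean([emb[w] for w in s1], axis=0)
    m2 = np.mean([emb[w] for w in s2], axis=0)
    diff = np.abs(m1 - m2)
    # For each tolerance, count how many dimensions of the two mean vectors match.
    return np.array([int((diff < lim).sum()) for lim in limits])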

Outline Introduction Method Experiment Conclusion

Experiment (the results slides consisted of figures and tables that were images and are not recoverable from the transcript)

Outline Introduction Method Experiment Conclusion

Conclusion › We compute features from the word-alignment method and from the means of word embeddings, and train a final classifier that predicts a semantic similarity score. › The method makes no use of external sources of structured semantic knowledge, nor of linguistic tools such as parsers. › Instead it uses a word-alignment method and a saliency-weighted semantic graph to go from word-level to text-level semantics.

Conclusion › Distributional semantics has reached a level where it can be employed by itself, in a generic approach, to produce features that yield state-of-the-art performance on the short text similarity task, even when no manually tuned features are added to optimise for a specific test set or domain. › As our method does not depend on NLP tools, it can be applied to domains and languages for which such tools are scarce.