A Word at a Time: Computing Word Relatedness using Temporal Semantic Analysis. Kira Radinsky (Technion), Eugene Agichtein (Emory), Evgeniy Gabrilovich (Yahoo! Research), Shaul Markovitch (Technion)

Similar presentations
A Word at a Time: Computing Word Relatedness using Temporal Semantic Analysis. Kira Radinsky, Eugene Agichtein, Evgeniy Gabrilovich, Shaul Markovitch.

Ciro Cattuto, Dominik Benz, Andreas Hotho, Gerd Stumme Presented by Smitashree Choudhury.
Automatically Acquiring a Semantic Network of Related Concepts. Date: 2011/11/14 Source: Sean Szumlanski et al. (CIKM’10) Advisor: Jia-ling Koh Speaker:
Evgeniy Gabrilovich and Shaul Markovitch. American Association for Artificial Intelligence 2006. Prepared by Qi Li.
Kira Radinsky, Sagie Davidovich, Shaul Markovitch. Computer Science Department, Technion – Israel Institute of Technology.
Patch to the Future: Unsupervised Visual Prediction
Andisheh Keikha (Ryerson University), Ebrahim Bagheri (Ryerson University). May 7th
Scott Wen-tau Yih (Microsoft Research) Joint work with Vahed Qazvinian (University of Michigan)
Generating Automatic Semantic Annotations for Research Datasets. Ayush Singhal and Jaideep Srivastava, CS Dept., University of Minnesota, MN, USA.
Latent Semantic Analysis
Predicting Text Quality for Scientific Articles Annie Louis University of Pennsylvania Advisor: Ani Nenkova.
Search and Retrieval: More on Term Weighting and Document Ranking Prof. Marti Hearst SIMS 202, Lecture 22.
Automatic Image Annotation and Retrieval using Cross-Media Relevance Models. J. Jeon, V. Lavrenko and R. Manmatha, Computer Science Department, University.
Semantic text features from small world graphs Jure Leskovec, IJS + CMU John Shawe-Taylor, Southampton.
Semantic distance in WordNet: An experimental, application-oriented evaluation of five measures Written by Alexander Budanitsky Graeme Hirst Retold by.
Scalable Text Mining with Sparse Generative Models
A Pattern Matching Method for Finding Noun and Proper Noun Translations from Noisy Parallel Corpora Benjamin Arai Computer Science and Engineering Department.
Ontology Learning and Population from Text: Algorithms, Evaluation and Applications Chapters Presented by Sole.
ERC StG: Multilingual Joint Word Sense Disambiguation (MultiJEDI). Roberto Navigli. A Graph-based Algorithm for Inducing Lexical Taxonomies from Scratch.
A Random Graph Walk based Approach to Computing Semantic Relatedness Using Knowledge from Wikipedia Presenter: Ziqi Zhang OAK Research Group, Department.
Evaluating the Contribution of EuroWordNet and Word Sense Disambiguation to Cross-Language Information Retrieval Paul Clough 1 and Mark Stevenson 2 Department.
COMP423: Intelligent Agent Text Representation. Menu – Bag of words – Phrase – Semantics – Bag of concepts – Semantic distance between two words.
Extracting Key Terms From Noisy and Multi-theme Documents Maria Grineva, Maxim Grinev and Dmitry Lizorkin Institute for System Programming of RAS.
Bridge Semantic Gap: A Large Scale Concept Ontology for Multimedia (LSCOM) Guo-Jun Qi Beckman Institute University of Illinois at Urbana-Champaign.
Exploiting Wikipedia as External Knowledge for Document Clustering Sakyasingha Dasgupta, Pradeep Ghosh Data Mining and Exploration-Presentation School.
Suggesting Friends using the Implicit Social Graph Maayan Roth et al. (Google, Inc., Israel R&D Center) KDD’10 Hyewon Lim 1 Oct 2014.
A hybrid method for Mining Concepts from text CSCE 566 semester project.
Combining Lexical Semantic Resources with Question & Answer Archives for Translation-Based Answer Finding. Delphine Bernhard and Iryna Gurevych. Ubiquitous.
Newsjunkie: Providing Personalized Newsfeeds via Analysis of Information Novelty. Gabrilovich et al., WWW 2004.
Beyond Co-occurrence: Discovering and Visualizing Tag Relationships from Geo-spatial and Temporal Similarities Date : 2012/8/6 Resource : WSDM’12 Advisor.
Multi-Prototype Vector Space Models of Word Meaning.
Special Topics in Text Mining Manuel Montes y Gómez University of Alabama at Birmingham, Spring 2011.
Xiaoying Gao Computer Science Victoria University of Wellington Intelligent Agents COMP 423.
Paul Alexandru Chirita, Stefania Costache, Siegfried Handschuh, Wolfgang Nejdl. L3S Research Center; National University of Ireland. Proceedings of the.
Text Representation & Text Classification for Intelligent Information Retrieval. Ning Yu, School of Library and Information Science, Indiana University.
Eric H. Huang, Richard Socher, Christopher D. Manning, Andrew Y. Ng. Computer Science Department, Stanford University, Stanford, CA 94305, USA. Improving Word.
Annotating Words using WordNet Semantic Glosses Julian Szymański Department of Computer Systems Architecture, Faculty of Electronics, Telecommunications.
A Machine Learning Approach to Sentence Ordering for Multidocument Summarization and Its Evaluation D. Bollegala, N. Okazaki and M. Ishizuka The University.
Latent Semantic Analysis Hongning Wang Recap: vector space model Represent both doc and query by concept vectors – Each concept defines one dimension.
Semantic Similarity Methods in WordNet and Their Application to Information Retrieval on the Web (ACM WIDM'05). Giannis Varelas, Epimenidis Voutsakis.
Katrin Erk Vector space models of word meaning. Geometric interpretation of lists of feature/value pairs In cognitive science: representation of a concept.
Intelligent Database Systems Lab. Presenter: Yan-Shou Sie. Authors: Mohamed Ali Hadj Taieb, Mohamed Ben Aouicha, Abdelmajid Ben Hamadou. KBS Computing.
The More the Better? Assessing the Influence of Wikipedia’s Growth on Semantic Relatedness Measures Torsten Zesch and Iryna Gurevych Ubiquitous Knowledge.
What Helps Where – And Why? Semantic Relatedness for Knowledge Transfer Marcus Rohrbach 1,2 Michael Stark 1,2 György Szarvas 1 Iryna Gurevych 1 Bernt Schiele.
Enhancing Cluster Labeling Using Wikipedia David Carmel, Haggai Roitman, Naama Zwerdling IBM Research Lab (SIGIR’09) Date: 11/09/2009 Speaker: Cho, Chin.
Wikipedia as Sense Inventory to Improve Diversity in Web Search Results. Celina Santamaria, Julio Gonzalo, Javier Artiles. nlp.uned.es, UNED, c/Juan del Rosal,
Information Retrieval with Time Series Query Hyun Duk Kim (now at Twitter), Danila Nikitin (now at Google), ChengXiang Zhai University of Illinois at Urbana-Champaign.
k-Shape: Efficient and Accurate Clustering of Time Series
Instance-based mapping between thesauri and folksonomies Christian Wartena Rogier Brussee Telematica Instituut.
Evgeniy Gabrilovich and Shaul Markovitch
A Web Search Engine-Based Approach to Measure Semantic Similarity between Words. Presenter: Guan-Yu Chen. IEEE Trans. on Knowledge & Data Engineering,
Extracting Key Terms From Noisy and Multi-theme Documents. Maria Grineva, Maxim Grinev and Dmitry Lizorkin. Proceedings of the 18th International.
A Knowledge-Based Search Engine Powered by Wikipedia David Milne, Ian H. Witten, David M. Nichols (CIKM 2007)
Image Classification over Visual Tree Jianping Fan Dept of Computer Science UNC-Charlotte, NC
INFORMATION RETRIEVAL PROJECT Creation of clusters of concepts that represent a domain corpus.
Multi-level Bootstrapping for Extracting Parallel Sentence from a Quasi-Comparable Corpus Pascale Fung and Percy Cheung Human Language Technology Center,
Comparing Word Relatedness Measures Based on Google n-grams Aminul ISLAM, Evangelos MILIOS, Vlado KEŠELJ Faculty of Computer Science Dalhousie University,
Finding document topics for improving topic segmentation Source: ACL2007 Authors: Olivier Ferret (18 route du Panorama, BP6) Reporter:Yong-Xiang Chen.
Semantic Similarity Methods in WordNet and Their Application to Information Retrieval on the Web. Giannis Varelas, Epimenidis.
Intelligent Database Systems Lab, 國立雲林科技大學 National Yunlin University of Science and Technology. Enhancing Text Clustering by Leveraging Wikipedia Semantics.
Corpus Exploitation from Wikipedia for Ontology Construction Gaoying Cui, Qin Lu, Wenjie Li, Yirong Chen The Department of Computing The Hong Kong Polytechnic.
Cross-modal Hashing Through Ranking Subspace Learning
Short Text Similarity with Word Embedding Date: 2016/03/28 Author: Tom Kenter, Maarten de Rijke Source: CIKM’15 Advisor: Jia-Ling Koh Speaker: Chih-Hsuan.
COMP423: Intelligent Agent Text Representation. Menu – Bag of words – Phrase – Semantics Semantic distance between two words.
Collaborative Deep Learning for Recommender Systems
An Integrated Approach for Relation Extraction from Wikipedia Texts Yulan Yan Yutaka Matsuo Mitsuru Ishizuka The University of Tokyo WWW 2009.
On Stability, Clarity, and Co-occurrence of Self-Tagging Aixin Sun and Anwitaman Datta Nanyang Technological University Singapore.
TDM in the Life Sciences: Application to Drug Repositioning.
Ying Dai, Faculty of Software and Information Science,
Presentation transcript:

A Word at a Time: Computing Word Relatedness using Temporal Semantic Analysis. Kira Radinsky (Technion), Eugene Agichtein (Emory), Evgeniy Gabrilovich (Yahoo! Research), Shaul Markovitch (Technion). WWW 2011. Presented by Tom.

Motivation: Semantic relatedness of texts. Given two texts, quantify their semantic relatedness. Used in many NLP tasks:
 Information retrieval
 Word-sense disambiguation
 Text clustering
 Error correction

Background: Ontologies and concepts. An ontology is a collection of concepts, for example:
1. Wikipedia as an ontology
 Every Wikipedia article represents a concept
 A word (or longer text fragment) can be represented as a vector of related Wikipedia concepts (using ESA)
2. Flickr as an ontology
 Every Flickr tag represents a concept
 A word can be represented as a vector of co-occurring Flickr tags
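To make the concept-vector idea concrete, here is a minimal sketch assuming a toy corpus and plain tf-idf weighting; the articles and weights are illustrative, not the actual ESA pipeline:

```python
from collections import defaultdict
import math

def build_concept_vectors(articles):
    """Build tf-idf weighted concept vectors: word -> {concept: weight}.

    `articles` maps a concept name (e.g., a Wikipedia title) to its text.
    """
    df = defaultdict(int)   # in how many concept articles each word appears
    tf = {}                 # per-concept word counts
    for concept, text in articles.items():
        counts = defaultdict(int)
        for word in text.lower().split():
            counts[word] += 1
        tf[concept] = counts
        for word in counts:
            df[word] += 1

    n = len(articles)
    vectors = defaultdict(dict)  # word -> {concept: tf-idf weight}
    for concept, counts in tf.items():
        for word, c in counts.items():
            vectors[word][concept] = c * math.log(n / df[word])
    return vectors

# Hypothetical toy "ontology" of three concept articles:
articles = {
    "War":   "army soldier battle peace treaty army",
    "Peace": "treaty peace negotiation soldier",
    "Oil":   "crude oil price stock market barrel",
}
vectors = build_concept_vectors(articles)
print(vectors["soldier"])   # weights over the concepts "War" and "Peace"
```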

Background: Current state of the art (concept-based representations).
 Path-based measures using Wikipedia categories: WikiRelate! (Strube, 2006)
 Co-occurrence-based measures: Latent Semantic Analysis (Deerwester et al., 1990)
 WordNet-based measures: multiple measures formulated in the literature (see Budanitsky & Hirst, 2001, for a comprehensive review)
 Vector space models: Explicit Semantic Analysis (Gabrilovich & Markovitch, 2007); in ESA, a fragment of text is represented as a weighted vector of Wikipedia concepts.
All these approaches are based on a static corpus. Can the temporal dynamics observed in a corpus be used to enhance text relatedness models?

Our solution: Intuition. [Figure: temporal co-appearance of “war” and “peace” in NYT archives.]

Our solution: Intuition. [Figure: temporal co-appearance of “crude oil” and “stock” in NYT archives.]

Our solution: TSA. Overview of temporal semantic analysis:
1. A novel temporal representation of text: represent each word as a vector of concepts, then extend this static representation with the temporal dynamics of each concept.
2. A novel temporal text-similarity measure: a method for computing the semantic relatedness of two words from their temporal representations.

TSA representation: static vector space. Words are represented as concept vectors, using a concept repository of choice (e.g., Wikipedia or Flickr image tags). [Diagram: “army” and “soldier” each represented as a vector over concepts such as “war” and “peace”.]

TSA representation: temporal vector space. Extract the temporal dynamics of each concept. [Diagram: each concept in the vectors of “army” and “soldier” is paired with its frequency over time in the NYT archive.]
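A sketch of extracting a concept's temporal dynamics, assuming a list of (date, text) pairs as the dated corpus; a real system would use the NYT archive, tokenize properly, and normalize by daily article volume:

```python
from collections import Counter
from datetime import date, timedelta

def concept_time_series(dated_articles, concept, start, end):
    """Daily frequency of `concept` in a dated corpus over [start, end]."""
    counts = Counter(d for d, text in dated_articles
                     if concept in text.lower())
    days = (end - start).days + 1
    return [counts[start + timedelta(days=i)] for i in range(days)]

# Hypothetical dated mini-corpus:
dated_articles = [
    (date(2001, 1, 1), "Peace talks resume as war looms"),
    (date(2001, 1, 2), "Crude oil prices lift stock markets"),
    (date(2001, 1, 2), "War coverage dominates the front page"),
]
print(concept_time_series(dated_articles, "war",
                          date(2001, 1, 1), date(2001, 1, 3)))  # [1, 1, 0]
```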

TSA representation: temporal vector space. The temporal representations of two words can differ, but related words tend to have similar temporal representations. [Diagram: the time series of the concepts supporting “army” and “soldier”, such as “war” and “peace”, follow similar patterns.]

Our solution: TSA. Overview, revisited: having extended the static concept-vector representation with temporal dynamics (part 1), we now turn to part 2, a method for computing semantic relatedness between two words using the temporal representation.

TSA: Computing similarity. Static semantic similarity (as in ESA): compare the two words' static concept vectors directly.
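For concreteness, a short sketch of the static comparison, assuming ESA-style sparse concept vectors stored as dicts (ESA compares concept vectors by cosine similarity; the vectors below are illustrative):

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse concept vectors (dicts)."""
    dot = sum(w * v[c] for c, w in u.items() if c in v)
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Illustrative concept vectors for two words:
print(cosine({"War": 2.0, "Peace": 1.0}, {"War": 1.5, "Army": 0.5}))
```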

TSA: Computing similarity. Temporal semantic similarity (TSA): measure the distance between the time series of the two words' concepts. [Diagram: NYT frequency curves of the concepts on both sides are compared pairwise.]

TSA: Computing similarity. Temporal distances (method 1): temporal-weighted dynamic time warping (DTW). Let the best alignment path between time series $A$ (of length $n$) and $B$ (of length $m$) be $P = (p_1, \dots, p_k)$, where each step $p_s$ aligns a point $i_t$ of $A$ with a point $j_t$ of $B$. The time-weighted distance is

$$D(A, B) = \min_{P} \sum_{s=1}^{k} w(t)\, d(p_s),$$

where $d(p_s)$ is the distance between the aligned points $i_t$ and $j_t$, and $w(t) > 0$ is a weighting coefficient with decay over time.
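A sketch of time-weighted DTW under one plausible reading of the slide's weighting; the exact scheme for $w(t)$ in the paper may differ, and `alpha` and the absolute-difference point cost are assumptions:

```python
def weighted_dtw(a, b, alpha=0.1):
    """Time-weighted DTW distance between two time series.

    The weight w = 1 + alpha * |i - j| grows with the alignment's time
    shift, so matches far from the diagonal cost more (an assumption).
    """
    n, m = len(a), len(b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            w = 1.0 + alpha * abs(i - j)          # weighting coefficient
            cost = w * abs(a[i - 1] - b[j - 1])   # weighted point distance
            D[i][j] = cost + min(D[i - 1][j],     # advance in a
                                 D[i][j - 1],     # advance in b
                                 D[i - 1][j - 1]) # advance in both
    return D[n][m]

print(weighted_dtw([0, 1, 3, 2, 0], [0, 0, 1, 3, 2]))  # small: similar shapes
```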

TSA: Computing similarity. Temporal distances (method 2): temporal-weighted cross-correlation. With $w(\delta) > 0$ again a weighting coefficient that decays with the size of the time shift $\delta$,

$$D(A, B) = \max_{\delta}\; w(\delta)\, \frac{\sum_t a_t\, b_{t+\delta}}{\lVert A \rVert\, \lVert B \rVert}.$$

An innate characteristic of this measure is that it identifies time series of similar volume while allowing for time shifts.
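A sketch of shift-weighted, normalized cross-correlation in the same hedged spirit; `max_shift` and the geometric `decay` are illustrative choices, not the paper's exact coefficients:

```python
import math

def weighted_cross_correlation(a, b, max_shift=5, decay=0.9):
    """Maximum normalized cross-correlation over time shifts, where
    larger shifts are down-weighted by decay**|shift| (an assumption)."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if not na or not nb:
        return 0.0
    best = 0.0
    for s in range(-max_shift, max_shift + 1):
        dot = sum(a[t] * b[t + s] for t in range(len(a))
                  if 0 <= t + s < len(b))
        best = max(best, (decay ** abs(s)) * dot / (na * nb))
    return best

# The second series is roughly the first shifted by one step:
print(weighted_cross_correlation([0, 1, 3, 2, 0], [1, 3, 2, 0, 0]))
```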

TSA: Computing similarity. Reminder: the temporal distance is computed between different concepts, since the sets of support concepts on the two sides are different. The similarity is the maximum sum of pairwise concept relatedness over all ordered subsets of size $n$ of $C(t_2)$.
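Written as a formula, this is a reconstruction from the slide's wording, with $(c_1, \dots, c_n)$ the support concepts of the first term, $C(t_2)$ the support concepts of the second, and $D$ the pairwise temporal relatedness of two concept time series:

```latex
\operatorname{sim}(t_1, t_2) \;=\; \max_{(c'_1, \dots, c'_n) \subseteq C(t_2)} \; \sum_{i=1}^{n} D\left(c_i, c'_i\right)
```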

TSA: Computing similarity. Because maximizing over all ordered subsets is expensive, a greedy temporal distance function is used to approximate it.
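A sketch of the greedy idea: match each support-concept time series of the first word to the closest still-unmatched series of the second, then average the matched distances. This illustrates the approximation, not the paper's exact procedure; `dist` is any time-series distance (e.g., `weighted_dtw` above):

```python
def greedy_temporal_distance(series1, series2, dist):
    """Greedy approximation of the best matching between two words'
    support-concept time series."""
    unmatched = list(range(len(series2)))
    total, matched = 0.0, 0
    for s1 in series1:
        if not unmatched:
            break
        j = min(unmatched, key=lambda k: dist(s1, series2[k]))
        total += dist(s1, series2[j])   # pay the cost of the closest match
        unmatched.remove(j)             # each series is matched at most once
        matched += 1
    return total / matched if matched else float("inf")

# Example with the weighted_dtw sketch above:
# d = greedy_temporal_distance([[0, 1, 3, 2]],
#                              [[0, 0, 1, 3], [5, 5, 5, 5]], weighted_dtw)
```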

Evaluation: Word-similarity benchmarks. In our experiments we used two datasets:
1. The WS-353 dataset, standard in the field:  353 pairs of words (manually selected)  each pair judged by 13 or 16 human annotators
2. The MTurk dataset, a new dataset in which pairs of words are selected automatically:  287 pairs of words  each pair judged by 23 human annotators
Evaluation metric: correlation with human judgments is the most commonly used metric.
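For concreteness, a small sketch of the metric, assuming Spearman rank correlation (the usual choice for these benchmarks; ties are ignored here, and the scores are made up):

```python
def spearman(xs, ys):
    """Spearman rank correlation between system scores and human
    judgments (ties not handled, which is fine for a sketch)."""
    def ranks(vals):
        order = sorted(range(len(vals)), key=lambda i: vals[i])
        r = [0.0] * len(vals)
        for rank, i in enumerate(order):
            r[i] = float(rank)
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

human  = [7.5, 3.2, 9.0, 1.1]   # mean human judgment per word pair
system = [0.8, 0.3, 0.9, 0.2]   # hypothetical relatedness scores
print(spearman(human, system))  # 1.0: identical rankings
```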

Results. Main result: TSA outperforms ESA. On both datasets our algorithm outperforms the state of the art.

TSA vs. state of the art (WS-353 dataset):
Algorithm        Correlation with humans
ESA-Wikipedia    0.75
ESA-ODP          0.65
TSA              0.80

TSA vs. state of the art (MTurk dataset):
Algorithm        Correlation with humans
ESA-Wikipedia    0.59
TSA

Results: Analysis. TSA outperforms ESA mainly on low-frequency words. Grouping word pairs by frequency (WS-353 dataset):

Type of bucket   ESA correlation with humans   TSA correlation with humans
Rare
Medium
High

Results: Qualitative analysis. Strength of TSA: synonyms (“boy” & “lad”). Human ranking: 16; TSA ranking: 62; ESA ranking: 155 (word pairs are ordered by similarity on the WS-353 dataset). Synonyms have similar patterns of occurrence over time, as writers in the news corpus tend to use them interchangeably.

Results: Qualitative analysis. Strength of TSA: compound terms (“luxury” & “car”). Human ranking: 164; TSA ranking: 118; ESA ranking: 12 (word pairs are ordered by similarity on the WS-353 dataset). TSA captures co-occurrences of words within a single article, as we construct time series aggregated over all articles on a given date.

Results: Qualitative analysis. Strength of TSA: implicit relations (“closet” & “clothes”). Human ranking: 57; TSA ranking: 56; ESA ranking: 173 (word pairs are ordered by similarity on the WS-353 dataset). Additional implicit relations: summer-drought, canyon-landscape, etc.

Results: Qualitative analysis. Limitations of TSA:
 Complex implicit relations (“drink” & “car”): human ranking: 303; TSA ranking: 150; ESA ranking: 313.
 News corpus bias, i.e., a coverage problem (“physics” & “proton”): human ranking: 56; TSA ranking: 322; ESA ranking: 55.

Summary.
1. Main contributions of Temporal Semantic Analysis:  a semantic representation of natural-language terms built from a temporal corpus (the NYT archive)  semantic relatedness algorithms that exploit the temporal data.
2. An automatic algorithm for constructing semantic relatedness datasets.
3. Empirical evaluation confirms that TSA outperforms the current state of the art.
4. Many other temporal datasets are available: the Nature and Science archives, Google Books, and more.
Temporal information holds a lot of promise for NLP tasks.

References + supplemental materials.
 Word relatedness datasets:
1. WS-353: www.cs.technion.ac.il/~gabr/resources/data/wordsim353
2. MTurk:
 For any questions, please contact Kira Radinsky.
Thank you!