A Word at a Time: Computing Word Relatedness using Temporal Semantic Analysis. Kira Radinsky, Eugene Agichtein, Evgeniy Gabrilovich, Shaul Markovitch.


1 A Word at a Time: Computing Word Relatedness using Temporal Semantic Analysis. Kira Radinsky, Eugene Agichtein, Evgeniy Gabrilovich, Shaul Markovitch

2 Introduction A rich source of information can be revealed by studying the patterns of word occurrence over time. Example: peace and war. Corpus: New York Times over 130 years. Each word is represented by the time series of its occurrences in NYT articles. Hypothesis: correlation between the time series of two words indicates a semantic relation. Proposed method: Temporal Semantic Analysis (TSA).

3 Introduction

4

5 1. TSA

6 Temporal Semantic Analysis 3 main steps: 1. Represent words as concept vectors 2. Extract the temporal dynamics of each concept 3. Extend the static representation with temporal dynamics

7 1. Words as concept vectors

8 2. Temporal dynamics c: a concept represented by a sequence of words w_c1, …, w_ck. d: a document. ε: proximity relaxation parameter (ε = 20 in the experiments). c appears in d if its words appear in d with a distance of at most ε words between each pair w_ci, w_cj. Example: Great Fire of London.
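The ε-window test on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation; the tokenization and the brute-force search over occurrence positions are simplifying assumptions.

```python
from itertools import product

EPSILON = 20  # proximity relaxation parameter from the experiments

def concept_appears(concept_words, doc_tokens, epsilon=EPSILON):
    """Return True if every concept word occurs in the document and
    some choice of occurrence positions is pairwise within `epsilon`."""
    positions = []
    for w in concept_words:
        occ = [i for i, t in enumerate(doc_tokens) if t == w]
        if not occ:
            return False
        positions.append(occ)
    # Brute force over occurrence combinations: fine for a sketch,
    # too slow for a real corpus.
    for combo in product(*positions):
        if max(combo) - min(combo) <= epsilon:
            return True
    return False

doc = "the great fire of london destroyed much of the city".split()
print(concept_appears(["great", "fire", "london"], doc))  # True
```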

9 2. Temporal dynamics t_1, …, t_n: a sequence of consecutive discrete time points (days). H = D_1, …, D_n: the history, represented by a set of document collections, where D_i is the collection of documents associated with time t_i. The dynamics of a concept c is the time series of its frequency of appearance in H.
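The dynamics extraction above can be sketched as a pass over the history. A minimal sketch under stated assumptions: documents are token lists, and `concept_appears` is simplified to plain containment (the full version would enforce the ε-word window from the previous slide).

```python
def concept_appears(concept_words, doc_tokens):
    # Simplified containment test; the full TSA definition additionally
    # requires the concept words to lie within an ε-word window.
    return all(w in doc_tokens for w in concept_words)

def concept_dynamics(concept_words, history):
    """history: list of document collections D_1..D_n, one per time
    point t_i. Returns the time series of appearance frequencies."""
    series = []
    for docs in history:
        hits = sum(1 for d in docs if concept_appears(concept_words, d))
        series.append(hits / len(docs) if docs else 0.0)
    return series

history = [
    [["war", "and", "peace"], ["war", "news"]],  # D_1
    [["peace", "talks"]],                        # D_2
]
print(concept_dynamics(["war"], history))  # [1.0, 0.0]
```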

10 3. Extend static representation

11 2. Using TSA for computing Semantic Relatedness

12 Using TSA for computing Semantic Relatedness Compare words by a weighted distance between the time series of their concept vectors. Combine this temporal measure with a static semantic similarity measure.

13 Algorithm t_1, t_2: words. C(t_1) = {c_1, …, c_n} and C(t_2) = {c_1, …, c_m}: the sets of concepts of t_1 and t_2. Q(c_1, c_2): a function that determines the relatedness of two concepts c_1 and c_2 using their dynamics (time series).
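The ingredients on this slide suggest the following shape for the computation. This is a sketch only: the best-match aggregation and the symmetrization by averaging are assumptions, not necessarily the exact weighting the paper uses, and `Q` can be instantiated with either of the time-series measures on the next slides.

```python
def tsa_relatedness(concepts1, concepts2, dynamics, Q):
    """concepts1/2: concept sets C(t_1), C(t_2); dynamics: maps a
    concept to its time series; Q: similarity of two time series."""
    def directed(cs_a, cs_b):
        # Each concept of one word matched to its closest concept
        # of the other word, then averaged.
        scores = [max(Q(dynamics[a], dynamics[b]) for b in cs_b)
                  for a in cs_a]
        return sum(scores) / len(scores)
    # Symmetrize by averaging both directions.
    return 0.5 * (directed(concepts1, concepts2)
                  + directed(concepts2, concepts1))
```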

14 Algorithm

15 Cross Correlation Pearson's product-moment coefficient: a statistical method for measuring the similarity of two random variables. Example: computer and radio.
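Pearson's coefficient for two equal-length time series can be computed directly from its definition; a minimal dependency-free sketch:

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length series,
    in [-1, 1]; returns 0.0 when either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

print(pearson([1, 2, 3], [2, 4, 6]))  # 1.0 (perfectly correlated)
```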

16 Dynamic Time Warping Measures the similarity of two time series that may differ in time scale but are similar in shape. Used in speech recognition. Defines a local cost matrix, combined with a temporal weighting function.
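The local-cost-matrix idea can be sketched with the classic DTW recurrence. Absolute difference as the local cost is an assumption for illustration, and the temporal weighting function the slide mentions is omitted here.

```python
def dtw(x, y):
    """Dynamic time warping distance between two numeric series:
    cost of the cheapest monotone alignment of x onto y."""
    inf = float("inf")
    n, m = len(x), len(y)
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])  # local cost matrix entry
            D[i][j] = cost + min(D[i - 1][j],      # insertion
                                 D[i][j - 1],      # deletion
                                 D[i - 1][j - 1])  # match
    return D[n][m]

print(dtw([0, 1, 2], [0, 0, 1, 2]))  # 0.0: same shape, shifted in time
```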

17 3. Experiments

18 Experiments: Setup New York Times archive (1863–2004). Each day: an average of 50 article abstracts. 1.42 GB of text. 565,540 distinct words. A new algorithm to automatically benchmark word-relatedness tasks. The same vector representation for each method tested. Comparison to human judgments (WS-353 and Amazon MTurk).

19 TSA vs. ESA

20 TSA vs. Temporal Word Similarity

21 Word Frequency Effects

22 Size of Temporal Concept Vector

23 Conclusion Two innovations: o Temporal Semantic Analysis o A new method for measuring the semantic relatedness of terms. Many advantages (robust, tunable, can be used to study language evolution over time). Significant improvements in computing word relatedness.

