A Word at a Time: Computing Word Relatedness using Temporal Semantic Analysis. Kira Radinsky, Eugene Agichtein, Evgeniy Gabrilovich, Shaul Markovitch.


Introduction
A rich source of information can be revealed by studying the patterns of word occurrence over time
Example: peace and war
Corpus: New York Times over 130 years
Word → time series of its occurrence in NYT articles
Hypothesis: correlation between two words' time series → semantic relation
Proposed method: Temporal Semantic Analysis (TSA)

Introduction

1. TSA

Temporal Semantic Analysis
3 main steps:
1. Represent words as concept vectors
2. Extract temporal dynamics for each concept
3. Extend the static representation with temporal dynamics

1. Words as concept vectors

2. Temporal dynamics
$c$ : a concept represented by a sequence of words $w^c_1, \dots, w^c_k$
$d$ : a document
$\varepsilon$ : proximity relaxation parameter ($\varepsilon = 20$ in the experiments)
$c$ appears in $d$ if its words appear in $d$ with a distance of at most $\varepsilon$ words between each pair $w^c_i, w^c_j$
Example: Great Fire of London
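The appearance test above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: documents are taken to be pre-tokenized lists of words, matching is case-insensitive, and the function name `concept_appears` is hypothetical rather than taken from the paper. Requiring every pair of chosen occurrences to be at most ε apart is equivalent to requiring that all concept words fit in a window spanning at most ε positions, which is what the sliding window checks.

```python
from collections import Counter


def concept_appears(concept_words, doc_tokens, eps=20):
    """Return True if every concept word occurs in the document so that any
    two of the chosen occurrences are at most `eps` token positions apart.

    Assumption: case-insensitive exact token matching; repeated concept
    words are treated as a single required word.
    """
    needed = {w.lower() for w in concept_words}
    if not needed:
        return False
    # Positions of document tokens that belong to the concept, in order.
    hits = [(i, tok.lower()) for i, tok in enumerate(doc_tokens)
            if tok.lower() in needed]
    window = Counter()
    left = 0
    for right_pos, right_word in hits:
        window[right_word] += 1
        # Drop hits that are more than eps positions behind the current one.
        while right_pos - hits[left][0] > eps:
            old_word = hits[left][1]
            window[old_word] -= 1
            if window[old_word] == 0:
                del window[old_word]
            left += 1
        # All concept words are inside a window of width <= eps.
        if len(window) == len(needed):
            return True
    return False
```

For example, `concept_appears(["Great", "Fire", "of", "London"], tokens)` accepts a document in which the four words occur close together even when other words are interleaved.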

2. Temporal dynamics
$t_1, \dots, t_n$ : a sequence of consecutive discrete time points (days)
$H = D_1, \dots, D_n$ : the history, represented by a set of document collections, where $D_i$ is the collection of documents associated with time $t_i$
The dynamics of a concept $c$ is the time series of its frequency of appearance in $H$
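A minimal sketch of how such a time series could be built follows. The slide does not say how frequency is normalized, so this example assumes it is the fraction of documents in each D_i that contain the concept. The function name `concept_dynamics` and the `contains` parameter are hypothetical, and the default containment test ignores word proximity for brevity.

```python
def concept_dynamics(history, concept_words, contains=None):
    """Return one frequency value per time point for a concept.

    history: list of document collections D_1..D_n, each a list of
             tokenized documents (lists of words).
    contains: predicate (concept_words, doc_tokens) -> bool. The default is
              a simplification that only checks that all words are present.
    Assumption: frequency = share of documents at t_i containing the concept.
    """
    if contains is None:
        def contains(words, doc):
            return {w.lower() for w in words} <= {t.lower() for t in doc}

    series = []
    for docs in history:
        if not docs:
            # No documents for this time point: record zero frequency.
            series.append(0.0)
            continue
        count = sum(1 for doc in docs if contains(concept_words, doc))
        series.append(count / len(docs))
    return series
```

A proximity-aware predicate with the ε parameter can be passed as `contains` in place of the simplified default.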

3. Extend static representation

2. Using TSA for computing Semantic Relatedness

Using TSA for computing Semantic Relatedness
Compare words by a weighted distance between the time series of their concept vectors
Combine it with the static semantic similarity measure

Algorithm
$t_1, t_2$ : words
$C(t_1) = \{c_1, \dots, c_n\}$ and $C(t_2) = \{c_1, \dots, c_m\}$ : the sets of concepts of $t_1$ and $t_2$
$Q(c_1, c_2)$ : a function that determines the relatedness between two concepts $c_1$ and $c_2$ using their dynamics (time series)
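One way these pieces could fit together is sketched below. The aggregation rule is an assumption made for illustration, not the procedure from the slides: each concept of the first word is matched with its best-scoring concept of the second word under Q, and the scores are averaged. The names `word_relatedness` and `dynamics` are hypothetical.

```python
def word_relatedness(concepts_1, concepts_2, dynamics, Q):
    """Aggregate concept-level scores into a word-level relatedness score.

    concepts_1, concepts_2: concept identifiers for C(t1) and C(t2).
    dynamics: dict mapping a concept identifier to its time series.
    Q: function (series_a, series_b) -> float, higher means more related.
    Assumption: best-match averaging; the original aggregation may differ.
    """
    if not concepts_1 or not concepts_2:
        return 0.0
    total = 0.0
    for c1 in concepts_1:
        # Best match for c1 among the concepts of the other word.
        total += max(Q(dynamics[c1], dynamics[c2]) for c2 in concepts_2)
    return total / len(concepts_1)
```

Any time-series similarity, such as a correlation measure or a negated warping distance, can be supplied as `Q`.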

Algorithm

Cross Correlation
Pearson's product-moment coefficient: a statistical method for measuring the similarity of two random variables
Example: computer and radio
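As an illustration, Pearson's coefficient for two equal-length time series can be computed directly from its definition. The lag search in `cross_correlation` is an assumption added to show how a delayed relationship could be handled; the `max_lag` parameter and both function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson's product-moment coefficient of two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A constant series has no defined correlation; report 0 by convention.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def cross_correlation(x, y, max_lag=0):
    """Best Pearson coefficient over shifts of up to `max_lag` steps.

    Assumption: x and y have equal length; shifting trims both series
    to their overlapping part.
    """
    best = -1.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[lag:], y[:len(y) - lag]
        else:
            a, b = x[:len(x) + lag], y[-lag:]
        if len(a) >= 2:
            best = max(best, pearson(a, b))
    return best
```

With `max_lag=0` the second function reduces to the plain coefficient; a larger value lets two series that peak a few days apart still score highly.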

Dynamic Time Warping
Measures the similarity between two time series that may differ in time scale but are similar in shape
Used in speech recognition
Defines a local cost matrix
Temporal Weighting Function
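The standard dynamic-programming form of dynamic time warping can be sketched as follows. This is the textbook recurrence, assuming the absolute difference as the local cost; it does not include the temporal weighting function mentioned on the slide, and the name `dtw_distance` is hypothetical.

```python
def dtw_distance(x, y):
    """Classic dynamic time warping distance between two numeric series.

    Local cost: absolute difference between aligned points (an assumption).
    Lower values mean the series are more similar in shape.
    """
    n, m = len(x), len(y)
    if n == 0 or m == 0:
        raise ValueError("series must be non-empty")
    inf = float("inf")
    # acc[i][j] = minimal accumulated cost of aligning x[:i] with y[:j].
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            acc[i][j] = cost + min(acc[i - 1][j],      # x advances alone
                                   acc[i][j - 1],      # y advances alone
                                   acc[i - 1][j - 1])  # both advance
    return acc[n][m]
```

Because a point in one series may align with several points in the other, `[1, 2, 3]` and `[1, 2, 2, 3]` have distance zero even though their lengths differ.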

3. Experiments

Experiments: Setup
New York Times archive (1863 – 2004)
An average of 50 article abstracts per day
1.42 Gb of text
565,540 distinct words
A new algorithm to automatically benchmark word relatedness tasks
Same vector representation for each method tested
Comparison to human judgment (WS-353 and Amazon MTurk)

TSA vs. ESA

TSA vs. Temporal Word Similarity

Word Frequency Effects

Size of Temporal Concept Vector

Conclusion
Two innovations:
o Temporal Semantic Analysis
o A new method for measuring the semantic relatedness of terms
Many advantages (robust, tunable, can be used to study language evolution over time)
Significant improvements in computing word relatedness
