Published by Jaden Farrant. Modified over 2 years ago.

1
A Word at a Time: Computing Word Relatedness using Temporal Semantic Analysis
Kira Radinsky, Eugene Agichtein, Evgeniy Gabrilovich, Shaul Markovitch

2
Introduction
Studying the patterns of word occurrence over time reveals a rich source of information
Example: peace and war
Corpus: New York Times archive spanning over 130 years
Each word is represented as a time series of its occurrences in NYT articles
Hypothesis: correlation between two words' time series indicates a semantic relation
Proposed method: Temporal Semantic Analysis (TSA)

3
Introduction

4

5
1. TSA

6
Temporal Semantic Analysis
3 main steps:
1. Represent words as concept vectors
2. Extract temporal dynamics for each concept
3. Extend the static representation with temporal dynamics

7
1. Words as concept vectors

8
2. Temporal dynamics
c: a concept represented by a sequence of words wc_1, …, wc_k
d: a document
ε: proximity relaxation parameter (ε = 20 in the experiments)
c appears in d if its words appear in d with a distance of at most ε words between each pair wc_i, wc_j
Example: Great Fire of London
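
The proximity test above can be sketched as a sliding window: if every concept word falls inside some window of ε + 1 consecutive tokens, then every pair of them is at most ε words apart. This is a minimal sketch, not the authors' implementation; the names `tokenize` and `concept_appears` and the lowercase alphanumeric tokenization are assumptions made for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (assumed tokenization)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def concept_appears(concept_words, doc_tokens, eps=20):
    """Return True if all concept words occur in some window of eps + 1 tokens.

    Any two tokens in such a window are at most eps positions apart,
    which matches the pairwise distance condition on the slide.
    """
    needed = {w.lower() for w in concept_words}
    for start in range(len(doc_tokens)):
        window = doc_tokens[start:start + eps + 1]
        if needed.issubset(window):
            return True
    return False


# Hypothetical document text, used only to exercise the function.
doc = tokenize("The great fire that destroyed much of London in 1666")
concept = ["Great", "Fire", "of", "London"]
found_relaxed = concept_appears(concept, doc, eps=20)  # wide window
found_strict = concept_appears(concept, doc, eps=3)    # narrow window
```

With ε = 20 the whole example sentence fits in one window, so the concept is found; with ε = 3 "great" and "london" are six tokens apart, so it is not.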

9
2. Temporal dynamics
t_1, …, t_n: a sequence of consecutive discrete time points (days)
H = D_1, …, D_n: the history, represented by a set of document collections, where D_i is the collection of documents associated with time t_i
The dynamics of a concept c is the time series of its frequency of appearance in H
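
The definition above can be illustrated with a short sketch that turns a history H into one frequency value per time point. The function name `concept_dynamics`, the `appears` predicate, and the choice to normalize by the number of documents per day are assumptions; the slide only says "frequency of appearance".

```python
def concept_dynamics(history, appears):
    """Compute the time series of a concept's frequency over a history.

    history: list of document collections D_1..D_n, one per time point.
    appears: predicate returning True if the concept appears in a document.
    Frequency is taken here as the fraction of that day's documents that
    contain the concept (an assumed normalization).
    """
    series = []
    for docs in history:
        if not docs:
            series.append(0.0)  # no documents on this day
            continue
        hits = sum(1 for d in docs if appears(d))
        series.append(hits / len(docs))
    return series


# Toy history with three days; the documents are invented for illustration.
history = [
    ["war begins", "peace talks stall"],
    ["war ends"],
    [],
]
war_series = concept_dynamics(history, lambda d: "war" in d.split())
```

Passing the appearance test as a predicate keeps the sketch independent of how a concept match is decided, for example with the ε-proximity rule.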

10
3. Extend static representation

11
2. Using TSA for computing Semantic Relatedness

12
Using TSA for computing Semantic Relatedness
Compare words by a weighted distance between the time series of their concept vectors
Combine it with the static semantic similarity measure

13
Algorithm
t_1, t_2: words
C(t_1) = {c_1, …, c_n} and C(t_2) = {c_1, …, c_m}: sets of concepts of t_1 and t_2
Q(c_1, c_2): function that determines the relatedness between two concepts c_1 and c_2 using their dynamics (time series)
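
The slide defines the inputs but the transcript does not preserve how the Q scores are aggregated. The sketch below shows one plausible aggregation, assumed here for illustration only: greedily pair each concept of t_1 with its best remaining match in C(t_2) and average the scores. The function name `relatedness` and the normalization by the larger set size are assumptions, not the confirmed algorithm.

```python
def relatedness(concepts_1, concepts_2, q):
    """Aggregate pairwise concept relatedness into a word-level score.

    concepts_1, concepts_2: the concept sets C(t_1) and C(t_2).
    q: function Q(c_1, c_2) scoring two concepts from their time series.
    Assumed scheme: greedy one-to-one matching, averaged over the larger set.
    """
    if not concepts_1 or not concepts_2:
        return 0.0
    remaining = list(concepts_2)
    total = 0.0
    for c1 in concepts_1:
        if not remaining:
            break
        # Pick the still-unmatched concept of t_2 most related to c1.
        best = max(remaining, key=lambda c2: q(c1, c2))
        total += q(c1, best)
        remaining.remove(best)
    return total / max(len(concepts_1), len(concepts_2))


# Toy Q: identical concepts score 1, everything else 0 (illustration only).
toy_q = lambda a, b: 1.0 if a == b else 0.0
same = relatedness(["x", "y"], ["y", "x"], toy_q)
different = relatedness(["x"], ["z"], toy_q)
```

In practice `q` would be a time-series measure such as cross correlation or dynamic time warping applied to the two concepts' dynamics.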

14
Algorithm

15
Cross Correlation
Pearson's product-moment coefficient: a statistical method for measuring the similarity of two random variables
Example: computer and radio
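
A minimal sketch of this measure on two equal-length time series follows. `pearson` is the standard product-moment coefficient; `cross_correlation` additionally tries small shifts between the series and keeps the best value. The lag handling, the `max_lag` parameter, and returning 0 for a constant series are assumptions for illustration, not details taken from the slides.

```python
import math


def pearson(x, y):
    """Pearson's product-moment correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant series has no defined correlation; assume 0
    return cov / (sx * sy)


def cross_correlation(x, y, max_lag=2):
    """Best Pearson correlation over shifts of up to max_lag steps (assumed scheme)."""
    best = pearson(x, y)
    for lag in range(1, max_lag + 1):
        if len(x) - lag < 2:
            break
        best = max(best,
                   pearson(x[lag:], y[:-lag]),   # y leads x by `lag` steps
                   pearson(x[:-lag], y[lag:]))   # x leads y by `lag` steps
    return best


# Two invented series with the same spike, one day apart.
a = [0, 0, 1, 0, 0, 0]
b = [0, 0, 0, 1, 0, 0]
unshifted = pearson(a, b)
shifted = cross_correlation(a, b, max_lag=1)
```

Allowing a lag lets two series that peak a day apart still count as strongly correlated, which plain Pearson would miss.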

16
Dynamic Time Warping
Measures similarity between two time series that may differ in time scale but are similar in shape
Used in speech recognition
It defines a local cost matrix
Temporal Weighting Function
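
The classic dynamic-programming form of DTW can be sketched as follows. The local cost is taken here as the absolute difference between two points, which is an assumption; the temporal weighting function named on the slide is not modeled, since the transcript does not give its form.

```python
def dtw_distance(x, y):
    """Dynamic time warping distance between two numeric sequences.

    D[i][j] holds the cheapest cumulative cost of aligning x[:i] with y[:j].
    The local cost |x_i - y_j| is an assumed choice.
    """
    n, m = len(x), len(y)
    inf = float("inf")
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]


# The second series is the first one stretched in time; values are invented.
stretched = dtw_distance([1, 2, 3], [1, 1, 2, 2, 3, 3])
offset = dtw_distance([0, 0], [1, 1])
```

Because the warping path may repeat points, a series and its time-stretched copy get distance 0, while a constant offset still produces a positive cost.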

17
3. Experiments

18
Experiments: Setup
New York Times archive (1863–2004)
Each day: an average of 50 article abstracts
1.42 Gb of text
distinct words
A new algorithm to automatically benchmark word relatedness tasks
Same vector representation for each method tested
Comparison to human judgment (WS-353 and Amazon MTurk)

19
TSA vs. ESA

20
TSA vs. Temporal Word Similarity

21
Word Frequency Effects

22
Size of Temporal Concept Vector

23
Conclusion
Two innovations:
o Temporal Semantic Analysis
o A new method for measuring semantic relatedness of terms
Many advantages: robust, tunable, and usable for studying language evolution over time
Significant improvements in computing word relatedness
