Mention-anomaly-based Event Detection and Tracking in Twitter
Adrien Guille & Cécile Favre, ERIC Lab, University of Lyon 2, France
IEEE/ACM ASONAM 2014

Presentation transcript:

Mention-anomaly-based Event Detection and Tracking in Twitter
Adrien Guille & Cécile Favre, ERIC Lab, University of Lyon 2, France
IEEE/ACM ASONAM 2014, Beijing, China, August 20, 2014

What is Twitter & why study it?
• Twitter: micro-blogging service
  • 140-character messages
  • Ever-growing number of Twitter users
• Pro: timely source of information
• Con: information overload
• How can we use Twitter for automated event detection and tracking?
August 20, 2014 A. Guille & C. Favre: Mention-Anomaly-Based Event Detection in Twitter

Related Work
• Idea: spot bursty patterns
• Term-weighting-based approaches
  • Peaky Topics [Shamma11], Trending Score [Benhardus13]
  • Possible ambiguity, lack of context
• Topic-modeling-based approaches
  • On-line LDA [Lau12], ET-LDA [Yuheng12]
  • Lack of scalability
• Clustering-based approaches
  • EDCoW [Weng11], TwEvent [Li12], ET [Parikh13]
  • Noisy event descriptions

Issues & Proposal
• Shortcomings of existing methods
  • Event duration is a fixed parameter
  • Only the textual content of tweets is considered
• We propose a novel approach and method that
  • Dynamically estimates the duration of each event
  • Exploits the social aspect of tweet streams through mentions

Proposed Method

Problem Formulation
• Input
  • Corpus C containing N tweets partitioned into n time-slices
  • Vocabularies V and V@ (V@: the words used in tweets that contain at least one mention)
• Output
  • The k most impactful events
• Event: a bursty topic and a value Mag translating its magnitude of impact
• Bursty topic: a time interval I, a main term t, and a set S of weighted related terms
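The partitioning of the corpus into time-slices can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `partition`, the `(timestamp, text)` tweet representation, and the explicit `start` argument are assumptions made for the example. The 30-minute default width matches the MABED setting reported in the experimental setup.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

Tweet = Tuple[datetime, str]  # hypothetical representation: (timestamp, text)


def partition(tweets: List[Tweet], start: datetime,
              width: timedelta = timedelta(minutes=30)) -> List[List[Tweet]]:
    """Group tweets into consecutive time-slices of equal width.

    Slice i covers [start + i*width, start + (i+1)*width).
    Tweets published before `start` are ignored.
    """
    kept = [tw for tw in tweets if tw[0] >= start]
    if not kept:
        return []
    # timedelta // timedelta yields an integer slice index
    last_index = max((ts - start) // width for ts, _ in kept)
    slices: List[List[Tweet]] = [[] for _ in range(last_index + 1)]
    for ts, text in kept:
        slices[(ts - start) // width].append((ts, text))
    return slices
```

The per-slice tweet counts, `len(slices[i])`, are the quantities a detection step would then work from.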

Overview of the proposed method
• Two-phase flow
  • 1: Analyse the mention frequency of each word in V@ to detect events (Mag, I, t, Ø)
  • 2: Select related words and generate the final list of the k most impactful events while controlling redundancy
• MABED: Mention-Anomaly-Based Event Detection

Proposed Method: Phase 1

Detecting Events with Mention Anomaly
• Computing the anomaly at a point i for word t
  • Requires computing the expected volume of tweets containing at least one mention and t, at i
  • Normal distribution
  • Expectation
  • Anomaly
• Measuring the magnitude of impact
  • Integrating anomaly
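The slide's formulas are not preserved in this transcript, so the following is only a sketch of how the quantities it names could be computed. It assumes that the expected volume in slice i is the slice's tweet count times the corpus-wide probability of a tweet containing both t and a mention (the mean of a normal approximation to a binomial), that the anomaly is the observed volume minus this expectation, and that the magnitude is the sum of the anomaly over an interval. The function and parameter names are illustrative.

```python
from typing import List


def mention_anomaly(slice_totals: List[int],
                    mention_counts: List[int]) -> List[float]:
    """Per-slice anomaly for one word t (illustrative assumption, see lead-in).

    slice_totals[i]   : number of tweets in time-slice i
    mention_counts[i] : tweets in slice i containing t and at least one mention
    """
    total = sum(slice_totals)
    if total == 0:
        return [0.0] * len(slice_totals)
    # Corpus-wide probability that a tweet contains t and a mention
    p = sum(mention_counts) / total
    # anomaly = observed volume - expected volume (slice size * p)
    return [obs - n_i * p for obs, n_i in zip(mention_counts, slice_totals)]


def magnitude(anomaly: List[float], start: int, end: int) -> float:
    """Discrete integral of the anomaly over slices start..end (inclusive)."""
    return sum(anomaly[start:end + 1])
```

With four slices of 100 tweets each and mention counts of 5, 5, 40 and 0, the expectation is 12.5 per slice, so only the third slice has a positive anomaly.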

Detecting Events with Mention Anomaly
• For each word t in V@
  • Solve a "Maximum Contiguous Subsequence Sum" type of problem
• Eventually, each event is described by
  • A main word t
  • A period of time I
  • The magnitude of its impact Mag
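A "Maximum Contiguous Subsequence Sum" problem can be solved in a single linear pass (often called Kadane's algorithm). The sketch below assumes the input is the per-slice anomaly series of one word and returns the interval with the largest summed anomaly. The function name and the inclusive-index convention are choices made for this example.

```python
from typing import List, Tuple


def max_anomaly_interval(anomaly: List[float]) -> Tuple[float, int, int]:
    """Return (Mag, start, end): the contiguous run of slices whose anomaly
    sum is largest, with inclusive slice indices."""
    if not anomaly:
        raise ValueError("anomaly series must not be empty")
    best_sum = cur_sum = anomaly[0]
    best_start = best_end = cur_start = 0
    for i in range(1, len(anomaly)):
        if cur_sum <= 0:
            # A non-positive running sum cannot help: restart the run here
            cur_sum = anomaly[i]
            cur_start = i
        else:
            cur_sum += anomaly[i]
        if cur_sum > best_sum:
            best_sum, best_start, best_end = cur_sum, cur_start, i
    return best_sum, best_start, best_end
```

Because the interval is found from the data rather than fixed in advance, the duration of each event is estimated per word.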

Detecting Events with Mention Anomaly
• Example

Proposed Method: Phase 2

Selecting Words Describing Events
• Identifying candidate words
  • Set of p words that co-occur the most with t during I
• Selecting the most relevant words
  • Measure the similarity between the frequency of each candidate word and that of the main word [Erdem12]
  • Apply a threshold θ
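The two steps above can be sketched as follows. The slides use the correlation coefficient of [Erdem12] as the similarity measure; this example substitutes plain Pearson correlation as a stand-in, so the weights it produces are not those of the original method. The data layout (tweets as token sets, per-slice frequency series per word) and all names are assumptions for illustration.

```python
from collections import Counter
from math import sqrt
from typing import Dict, Iterable, List, Set


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return cov / sqrt(sxx * syy)


def select_related_words(main_word: str,
                         tweets: Iterable[Set[str]],
                         series: Dict[str, List[float]],
                         p: int = 10,
                         theta: float = 0.7) -> Dict[str, float]:
    """tweets: token sets of the tweets published during I.
    series: word -> per-slice frequency over I."""
    # Step 1: the p words that co-occur most often with the main word
    cooc: Counter = Counter()
    for tokens in tweets:
        if main_word in tokens:
            cooc.update(w for w in tokens if w != main_word)
    candidates = [w for w, _ in cooc.most_common(p)]
    # Step 2: keep candidates whose frequency tracks the main word's
    selected = {}
    for w in candidates:
        weight = pearson(series[main_word], series[w])
        if weight >= theta:
            selected[w] = weight
    return selected
```

The defaults p=10 and θ=0.7 are the values listed in the experimental setup. Whether the threshold comparison is strict is not stated on the slide.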

Selecting Words Describing Events
• Example

Generating the List of Top k Events
• Event graph & redundancy graph
• Detecting duplicated events
  • Connectivity of main terms in the event graph
  • Overlap between intervals, threshold σ
• Merging duplicated events
  • Identifying connected components in the redundancy graph
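One way to read this slide is sketched below. Two events are linked in the redundancy graph when each event's main term appears among the other's related words and their intervals overlap by more than σ. Linked events are then grouped by connected component using a small union-find. The mutual-connectivity test and the overlap formula (shared slices divided by the shorter interval's length) are assumptions for this example, as are the dictionary keys.

```python
from typing import Dict, List, Tuple


def overlap(a: Tuple[int, int], b: Tuple[int, int]) -> float:
    """Shared slices / length of the shorter interval (inclusive bounds)."""
    shared = min(a[1], b[1]) - max(a[0], b[0]) + 1
    if shared <= 0:
        return 0.0
    return shared / min(a[1] - a[0] + 1, b[1] - b[0] + 1)


def merge_duplicates(events: List[Dict], sigma: float = 0.5) -> List[List[str]]:
    """events: dicts with 'main' (str), 'interval' ((start, end)) and
    'related' (set of words). Returns groups of main terms to merge."""
    parent = list(range(len(events)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(events)):
        for j in range(i + 1, len(events)):
            a, b = events[i], events[j]
            connected = a["main"] in b["related"] and b["main"] in a["related"]
            if connected and overlap(a["interval"], b["interval"]) > sigma:
                parent[find(i)] = find(j)  # edge in the redundancy graph

    groups: Dict[int, List[str]] = {}
    for i, ev in enumerate(events):
        groups.setdefault(find(i), []).append(ev["main"])
    return sorted(sorted(g) for g in groups.values())
```

Each returned group with more than one main term corresponds to a set of duplicated events that would be merged into one.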

Generating the List of Top k Events
• Example

Evaluation

Experimental Setup
• Corpora
  • C(en): 1,437,126 tweets published in November 2009
  • C(fr): 2,086,136 tweets published in March 2012
• Baselines for comparison
  • Trending Score (TS) [Benhardus13] and ET [Parikh13]
  • α-MABED
• Parameter setting
  • (α-)MABED: 30-min time-slices, p = 10, θ = 0.7, σ = 0.5
  • Trending Score, ET: 1-day time-slices

Evaluation Metrics
• Manual annotation
  • Two human annotators judging the significance of the top 40 events detected by each method (κ = 0.72)
• Precision
  • Significant events / All detected events
• Recall
  • Distinct significant events / All detected events
• DERate [Li12]
  • Duplicated events / Significant events
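The three ratios can be computed from the annotations as in the sketch below. It assumes each detected event is annotated with a significance flag and, when significant, an identifier of the real-world event it describes, and that the number of duplicated events is the number of significant events minus the number of distinct ones. The annotation format and names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# (is_significant, id of the real-world event, or None when not significant)
Annotation = Tuple[bool, Optional[str]]


def evaluate(annotations: List[Annotation]) -> Dict[str, float]:
    """Precision, recall and DERate as defined on the slide."""
    detected = len(annotations)
    significant_ids = [eid for sig, eid in annotations if sig]
    significant = len(significant_ids)
    distinct = len(set(significant_ids))
    duplicated = significant - distinct  # assumption, see lead-in
    return {
        "precision": significant / detected if detected else 0.0,
        "recall": distinct / detected if detected else 0.0,
        "derate": duplicated / significant if significant else 0.0,
    }
```

For example, four detected events of which three are significant but two describe the same real-world event give a precision of 0.75, a recall of 0.5 and a DERate of 1/3.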

Quantitative Evaluation
• Performance of the five methods on the two corpora

Quantitative Evaluation
• Impact of σ on MABED

Qualitative Evaluation
• Improved readability
• Excerpt of the list of events detected in C(en) by MABED

Qualitative Evaluation
• Improved temporal precision & reduced redundancy
• Importance of dynamically estimating event duration
  • Politics-related events tend to be discussed longer [Romero11]

Implementation
• Included in the open-source social media data mining tool SONDY [Guille13]

Time-oriented Interface

Impact-oriented Interface

Topic-oriented Interface

Conclusion & Future Work
• Proposed a novel approach and method for detecting events in Twitter
• Verified hypothesis
  • Considering mentions helps detect significant events
• Experimental results on two different datasets demonstrate the accuracy and robustness of the proposed method
• Future work
  • More features to model discussions between users

References
[Shamma11] D. A. Shamma, L. Kennedy, and E. F. Churchill, “Peaks and persistence: modeling the shape of microblog conversations,” in CSCW, 2011
[Benhardus13] J. Benhardus and J. Kalita, “Streaming trend detection in twitter,” IJWBC, vol. 9, no. 1, 2013
[Lau12] J. H. Lau, N. Collier, and T. Baldwin, “On-line trend analysis with topic models: #twitter trends detection topic model online,” in COLING, 2012
[Yuheng12] H. Yuheng, J. Ajita, D. S. Dorée, and W. Fei, “What were the tweets about? topical associations between public events and twitter feeds,” in ICWSM, 2012
[Weng11] J. Weng and B.-S. Lee, “Event detection in twitter,” in ICWSM, 2011
[Li12] C. Li, A. Sun, and A. Datta, “Twevent: Segment-based event detection from tweets,” in CIKM, 2012
[Parikh13] R. Parikh and K. Karlapalem, “Et: events from tweets,” in companion WWW, 2013
[Erdem12] O. Erdem, E. Ceyhan, and Y. Varli, “A new correlation coefficient for bivariate time-series data,” in MAF, 2012
[Guille13] A. Guille, C. Favre, H. Hacid, and D. Zighed, “Sondy: An open source platform for social dynamics mining and analysis,” in SIGMOD, 2013