Bursts in social networks
Information burst: unusual activity compared to the average case
Types:
  External source case: users act independently
    Example: an earthquake, a soccer game
  Influence case: the activity of a node's neighbors influences the node's own activity
    Example: protests in Egypt
External source case
Naïve approach: assign a state to each node separately (in isolation)
Every node of the graph gets a burst state: bursty (B) or non-bursty (N)
[Figure: four example nodes with scores (f_B, f_N) = (0.3, 0.4), (0.8, 0.1), (0.9, 0.1), (0.1, 0.6), each labeled B or N]
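The naïve approach can be sketched in a few lines. This is a minimal illustration, assuming f_B and f_N are per-node scores for how well a node's activity fits the bursty and non-bursty state; the node names `u1` to `u4` and the function `naive_assign` are hypothetical, and the numbers are the ones shown in the figure.

```python
# Per-node scores from the slide's figure; node names are made up for illustration.
likelihoods = {
    "u1": {"B": 0.3, "N": 0.4},
    "u2": {"B": 0.8, "N": 0.1},
    "u3": {"B": 0.9, "N": 0.1},
    "u4": {"B": 0.1, "N": 0.6},
}


def naive_assign(likelihoods):
    """Label each node with whichever state scores higher, ignoring all edges."""
    return {node: max(scores, key=scores.get) for node, scores in likelihoods.items()}


states = naive_assign(likelihoods)
print(states)
```

Because each node is decided on its own scores, two neighbors can end up in different states even when one of them is nearly undecided.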
Problem with the naïve approach
Goal: group bursty users
The naïve approach does not consider the links
  Nodes in the same neighborhood tend to behave similarly
  Ignoring the links leads to fragmentation
[Figure: graph with nodes labeled B or N]
Intrinsic burst model
A similar problem exists for the case of time-series data
Generalize
[Figure: the example nodes with scores (f_B, f_N) = (0.3, 0.4), (0.8, 0.1), (0.9, 0.1), (0.1, 0.6), each labeled B or N]
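The Optimal Algorithm
[Figure: graph G with two added terminals, s and t; one side is labeled bursty, the other non-bursty]
Edge weights shown in the figure:
  W = log(1/f_N(u))
  W = log(1/f_B(u))
  W = log(λ/(1 − λ))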
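The weights above fit a minimum s-t cut formulation, which can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes s stands for the bursty side and t for the non-bursty side, that the log(λ/(1 − λ)) weight sits on the edges between neighboring users, and that λ > 0.5 so the weight is positive. The max-flow routine is a plain Edmonds-Karp chosen for brevity; `min_cut_states`, the node names, the path-shaped example graph, and λ = 0.7 are all hypothetical.

```python
import math
from collections import defaultdict, deque


def min_cut_states(f, edges, lam):
    """Label nodes 'B' or 'N' via a minimum s-t cut (assumed construction)."""
    S, T = "_s", "_t"
    cap = defaultdict(lambda: defaultdict(float))
    for u, (f_b, f_n) in f.items():
        cap[S][u] += math.log(1 / f_n)  # cut if u ends up on the non-bursty side
        cap[u][T] += math.log(1 / f_b)  # cut if u ends up on the bursty side
    penalty = math.log(lam / (1 - lam))  # assumes lam > 0.5
    for u, v in edges:
        cap[u][v] += penalty  # paid when neighbors get different states
        cap[v][u] += penalty

    def reachable_from_source():
        """BFS over edges with remaining capacity; returns the parent map."""
        parent = {S: None}
        queue = deque([S])
        while queue:
            x = queue.popleft()
            for y, c in cap[x].items():
                if c > 1e-12 and y not in parent:
                    parent[y] = x
                    queue.append(y)
        return parent

    while True:
        parent = reachable_from_source()
        if T not in parent:
            break  # no augmenting path left: the flow is maximal
        path, y = [], T
        while parent[y] is not None:
            path.append((parent[y], y))
            y = parent[y]
        bottleneck = min(cap[a][b] for a, b in path)
        for a, b in path:
            cap[a][b] -= bottleneck
            cap[b][a] += bottleneck

    # Nodes still reachable from s form the source (bursty) side of the cut.
    return {u: ("B" if u in parent else "N") for u in f}


# (f_B, f_N) pairs from the slides, placed on a hypothetical path a-b-c-d.
f = {"a": (0.3, 0.4), "b": (0.8, 0.1), "c": (0.9, 0.1), "d": (0.1, 0.6)}
edges = [("a", "b"), ("b", "c"), ("c", "d")]
states = min_cut_states(f, edges, lam=0.7)
print(states)
```

With these numbers node `a`, which is nearly undecided on its own, is pulled into the bursty group by its strongly bursty neighbor, while `d` stays non-bursty.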
DIBA: Dynamic programming Intrinsic Burst detection Algorithm
Simplify the problem
  Remove some edges to create a tree
  Predict the edges with the highest impact on the final state assignment
DIBA is an optimal, linear-time solution for this simplified problem.
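Once the graph has been reduced to a tree, an optimal assignment can be computed bottom-up with dynamic programming. The sketch below illustrates that idea only; it does not reproduce DIBA itself, and in particular it does not cover how edges are chosen for removal. It assumes a node in state X costs log(1/f_X(u)) and that each tree edge whose endpoints differ costs log(λ/(1 − λ)) with λ > 0.5. The name `tree_dp_states`, the example tree, and λ = 0.7 are hypothetical.

```python
import math


def tree_dp_states(f, children, root, lam):
    """Return (states, total_cost) for a rooted tree under the assumed cost."""
    penalty = math.log(lam / (1 - lam))  # assumes lam > 0.5
    # Pre-order traversal: every parent appears before its children.
    order, stack = [], [root]
    while stack:
        u = stack.pop()
        order.append(u)
        stack.extend(children.get(u, []))

    cost = {}    # cost[u][s]: best cost of u's subtree when u is in state s
    choice = {}  # choice[(child, parent_state)]: best state for that child
    for u in reversed(order):  # children are processed before their parent
        f_b, f_n = f[u]
        cost[u] = {"B": math.log(1 / f_b), "N": math.log(1 / f_n)}
        for c in children.get(u, []):
            for s in ("B", "N"):
                best = min(
                    ("B", "N"),
                    key=lambda t: cost[c][t] + (penalty if t != s else 0.0),
                )
                choice[(c, s)] = best
                cost[u][s] += cost[c][best] + (penalty if best != s else 0.0)

    # Pick the root's state, then read the children's states top-down.
    states = {root: min(("B", "N"), key=lambda s: cost[root][s])}
    for u in order:
        for c in children.get(u, []):
            states[c] = choice[(c, states[u])]
    return states, cost[root][states[root]]


# (f_B, f_N) pairs from the slides, arranged as a hypothetical chain rooted at "a".
f = {"a": (0.3, 0.4), "b": (0.8, 0.1), "c": (0.9, 0.1), "d": (0.1, 0.6)}
children = {"a": ["b"], "b": ["c"], "c": ["d"]}
states, total = tree_dp_states(f, children, root="a", lam=0.7)
print(states, round(total, 3))
```

Each node is visited a constant number of times, so the running time is linear in the size of the tree.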
Experiments
Dataset: 30 days of the Twitter firehose
  Each day contains 30 million distinct users (connected by 1.5 billion edges) generating 300-400 million tweets
Task: identify bursty subgraphs for hot topics
Run time results
Both algorithms are fast
  DIBA: 2.5 minutes
  SODA: 2 hours
Please see the paper for figures on the sensitivity of DIBA and SODA (run time and quality of results) to different parameters.
Sample qualitative results
Grand Prix
  Fans of car racing across different countries talking about the Canadian Formula 1 Grand Prix
  Groups of volleyball-related Twitter handles tweeting about the FIVB Volleyball World Grand Prix
  Groups of Twitter handles tweeting about the women's Thailand Open Grand Prix Gold badminton tournament
Future work
Exploring applications of our techniques in the web search domain, e.g. using the subgraphs detected for "grand prix" to address problems such as
  Diversified search
  Query expansion
Dynamics of the bursty subgraphs