Manuel Gomez Rodriguez Bernhard Schölkopf I NFLUENCE M AXIMIZATION IN C ONTINUOUS T IME D IFFUSION N ETWORKS 29.06.12, ICML ‘12.

Slides:

Advertisements

Similar presentations

1 Dynamics of Real-world Networks Jure Leskovec Machine Learning Department Carnegie Mellon University

Advertisements

Data and Computer Communications

Greedy best-first search Use the heuristic function to rank the nodes Search strategy –Expand node with lowest h-value Greedily trying to find the least-cost.

A Hierarchical Multiple Target Tracking Algorithm for Sensor Networks Songhwai Oh and Shankar Sastry EECS, Berkeley Nest Retreat, Jan

LEARNING INFLUENCE PROBABILITIES IN SOCIAL NETWORKS Amit Goyal Francesco Bonchi Laks V. S. Lakshmanan University of British Columbia Yahoo! Research University.

Multicast in Wireless Mesh Network Xuan (William) Zhang Xun Shi.

Fast Algorithms For Hierarchical Range Histogram Constructions

Minimizing Seed Set for Viral Marketing Cheng Long & Raymond Chi-Wing Wong Presented by: Cheng Long 20-August-2011.

Cost-effective Outbreak Detection in Networks Jure Leskovec, Andreas Krause, Carlos Guestrin, Christos Faloutsos, Jeanne VanBriesen, Natalie Glance.

Modeling Malware Spreading Dynamics Michele Garetto (Politecnico di Torino – Italy) Weibo Gong (University of Massachusetts – Amherst – MA) Don Towsley.

Iterative Optimization and Simplification of Hierarchical Clusterings Doug Fisher Department of Computer Science, Vanderbilt University Journal of Artificial.

Guest lecture II: Amos Fiat’s Social Networks class Edith Cohen TAU, December 2014.

Least Cost Rumor Blocking in Social networks Lidan Fan Computer Science Department the University of Texas at Dallas.

Rumor Routing in Sensor Networks David Braginsky and Deborah Estrin Presented By Tu Tran 1.

Structural Inference of Hierarchies in Networks BY Yu Shuzhi 27, Mar 2014.

Planning under Uncertainty

GS 540 week 6. HMM basics Given a sequence, and state parameters: – Each possible path through the states has a certain probability of emitting the sequence.

Efficient Informative Sensing using Multiple Robots

INFERRING NETWORKS OF DIFFUSION AND INFLUENCE Presented by Alicia Frame Paper by Manuel Gomez-Rodriguez, Jure Leskovec, and Andreas Kraus.

Dynamic lot sizing and tool management in automated manufacturing systems M. Selim Aktürk, Siraceddin Önen presented by Zümbül Bulut.

Presented by Ozgur D. Sahin. Outline Introduction Neighborhood Functions ANF Algorithm Modifications Experimental Results Data Mining using ANF Conclusions.

Online Data Gathering for Maximizing Network Lifetime in Sensor Networks IEEE transactions on Mobile Computing Weifa Liang, YuZhen Liu.

The community-search problem and how to plan a successful cocktail party Mauro SozioAris Gionis Max Planck Institute, Germany Yahoo! Research, Barcelona.

Copyright N. Friedman, M. Ninio. I. Pe’er, and T. Pupko. 2001RECOMB, April 2001 Structural EM for Phylogentic Inference Nir Friedman Computer Science &

1 Algorithms for Bandwidth Efficient Multicast Routing in Multi-channel Multi-radio Wireless Mesh Networks Hoang Lan Nguyen and Uyen Trang Nguyen Presenter:

Diffusion in Social and Information Networks Part II W ORLD W IDE W EB 2015, F LORENCE MPI for Software SystemsGeorgia Institute of Technology Le Song.

Computer vision: models, learning and inference

1 1 MPI for Intelligent Systems 2 Stanford University Manuel Gomez Rodriguez 1,2 David Balduzzi 1 Bernhard Schölkopf 1 UNCOVERING THE TEMPORAL DYNAMICS.

Efficient Gathering of Correlated Data in Sensor Networks

Network Aware Resource Allocation in Distributed Clouds.

1 On Querying Historical Evolving Graph Sequences Chenghui Ren $, Eric Lo *, Ben Kao $, Xinjie Zhu $, Reynold Cheng $ $ The University of Hong Kong $ {chren,

Models and Algorithms for Complex Networks Power laws and generative processes.

Rate-based Data Propagation in Sensor Networks Gurdip Singh and Sandeep Pujar Computing and Information Sciences Sanjoy Das Electrical and Computer Engineering.

A Polynomial Time Approximation Scheme For Timing Constrained Minimum Cost Layer Assignment Shiyan Hu*, Zhuo Li**, Charles J. Alpert** *Dept of Electrical.

1 1 Stanford University 2 MPI for Biological Cybernetics 3 California Institute of Technology Inferring Networks of Diffusion and Influence Manuel Gomez.

1 1 Stanford University 2 MPI for Biological Cybernetics 3 California Institute of Technology Inferring Networks of Diffusion and Influence Manuel Gomez.

Influence Maximization in Dynamic Social Networks Honglei Zhuang, Yihan Sun, Jie Tang, Jialin Zhang, Xiaoming Sun.

Directed-Graph Epidemiological Models of Computer Viruses Presented by: (Kelvin) Weiguo Jin “… (we) adapt the techniques of mathematical epidemiology to.

Boltzmann Machine (BM) (§6.4) Hopfield model + hidden nodes + simulated annealing BM Architecture –a set of visible nodes: nodes can be accessed from outside.

Optimal resource assignment to maximize multistate network reliability for a computer network Yi-Kuei Lin, Cheng-Ta Yeh Advisor : Professor Frank Y. S.

Jing (Selena) He and Hisham M. Haddad Department of Computer Science, Kennesaw State University Shouling Ji, Xiaojing Liao, and Raheem Beyah School of.

Towards Efficient Large-Scale VPN Monitoring and Diagnosis under Operational Constraints Yao Zhao, Zhaosheng Zhu, Yan Chen, Northwestern University Dan.

Manuel Gomez Rodriguez Structure and Dynamics of Information Pathways in On-line Media W ORKSHOP M ENORCA, MPI FOR I NTELLIGENT S YSTEMS.

1 Branch and Bound Searching Strategies Updated: 12/27/2010.

Online Social Networks and Media

I NFORMATION C ASCADE Priyanka Garg. OUTLINE Information Propagation Virus Propagation Model How to model infection? Inferring Latent Social Networks.

Comparison of Tarry’s Algorithm and Awerbuch’s Algorithm CS 6/73201 Advanced Operating System Presentation by: Sanjitkumar Patel.

A Latent Social Approach to YouTube Popularity Prediction Amandianeze Nwana Prof. Salman Avestimehr Prof. Tsuhan Chen.

1 Finding Spread Blockers in Dynamic Networks (SNAKDD08)Habiba, Yintao Yu, Tanya Y., Berger-Wolf, Jared Saia Speaker: Hsu, Yu-wen Advisor: Dr. Koh, Jia-Ling.

1 1 MPI for Intelligent Systems 2 Stanford University Manuel Gomez Rodriguez 1,2 Bernhard Schölkopf 1 S UBMODULAR I NFERENCE OF D IFFUSION NETWORKS FROM.

DEPARTMENT/SEMESTER ME VII Sem COURSE NAME Operation Research Manav Rachna College of Engg.

F EATURE -E NHANCED P ROBABILISTIC M ODELS FOR D IFFUSION N ETWORK I NFERENCE Stefano Ermon ECML-PKDD September 26, 2012 Joint work with Liaoruo Wang and.

An Energy-Efficient Approach for Real-Time Tracking of Moving Objects in Multi-Level Sensor Networks Vincent S. Tseng, Eric H. C. Lu, & Kawuu W. Lin Institute.

March 7, Using Pattern Recognition Techniques to Derive a Formal Analysis of Why Heuristic Functions Work B. John Oommen A Joint Work with Luis.

Rate-Based Query Optimization for Streaming Information Sources Stratis D. Viglas Jeffrey F. Naughton.

1 1 Stanford University 2 MPI for Biological Cybernetics 3 California Institute of Technology Inferring Networks of Diffusion and Influence Manuel Gomez.

Biao Wang 1, Ge Chen 1, Luoyi Fu 1, Li Song 1, Xinbing Wang 1, Xue Liu 2 1 Shanghai Jiao Tong University 2 McGill University

Privacy Vulnerability of Published Anonymous Mobility Traces Chris Y. T. Ma, David K. Y. Yau, Nung Kwan Yip (Purdue University) Nageswara S. V. Rao (Oak.

Machine learning for Dynamic Social Network Analysis

Inferring Networks of Diffusion and Influence

Independent Cascade Model and Linear Threshold Model

Machine learning for Dynamic Social Network Analysis

Manuel Gomez Rodriguez

Course: Autonomous Machine Learning

Independent Cascade Model and Linear Threshold Model

Mixture of Mutually Exciting Processes for Viral Diffusion

Markov Random Fields Presented by: Vladan Radosavljevic.

Cost-effective Outbreak Detection in Networks

Independent Cascade Model and Linear Threshold Model

Human-centered Machine Learning

Presentation transcript:

Manuel Gomez Rodriguez Bernhard Schölkopf I NFLUENCE M AXIMIZATION IN C ONTINUOUS T IME D IFFUSION N ETWORKS , ICML ‘12

2 Propagation Social Networks Recommendation Networks Epidemiology Human Travels Information Networks P ROPAGATION TAKES PLACE ON W E CAN EXTRACT PROPAGATION TRACES FROM

3 Influence Maximization T 0 Our aim: Find the optimal source nodes that maximize the average number of infected nodes by T: T 0

4 Continuous time vs sequential model Propagation has been modeled as sequential rounds in discrete time steps i i j j Probability of transmission TRADITIONALLY… HOWEVER, REAL TIME MATTERS… i i j j Likelihood of transmission Propagation is modeled as a continuous process with different rates IN OUR WORK… #greece retweet s

5 Influence Maximization: Outline 1.Describe the evolution of a diffusion process mathematically 2.Analytically compute the influence in continuous time 3.Efficiently find the source nodes that maximizes influence 4.Validate I NFLU M AX on synthetic and real diffusion data

Set of source nodes A: Nodes in which a diffusion process starts 6 Sources and sink node Sink node n Source node n Sink node n: Node under study. We aim to evaluate its probability of infection before T given A: P(t n < T | A)

Domination: disabled nodes  Given a sink node n and a set of infected nodes I, we define the disabled nodes S n (I) dominated by I: Sink node n Infected node Disabled node n 6 7 A node u is disabled if any path from u to n visits at least one infected node

8 Self domination: disabled sets  We define the set of self-dominant disabled sets : Sink node n Infected We can find them all efficiently! Nodes in a self-dominant disable set only block themselves relative to the sink node n

9 Monitoring a diffusion process  Given a sink node n and a diffusion process starting in a source set A: We show that a diffusion process can be described by the state space of self-dominant disabled sets This means that… The probability of infecting the sink node n before T is the probability of reaching a specific disabled set before T

10 Diffusion process: continuous time  Given a diffusion process, how fast do we traverse from one self-dominant disable set to another? It depends on how quickly information propagates between each pair of nodes Disabled set I Disabled set II Disabled set III Disabled set IV i i j j Likelihood of transmission

11 Diffusion process as a CTMC Theorem. Given a source set A, a sink node n and independent exponential likelihoods, the process is a CTMC with state space This means that… The probability of infection of the sink node n before T is a phase type distribution Exp matrix depends on and T

12 Computing influence We know how to compute the probability of infection of any sink node n before T INFLUENC E We sum up over all nodes in the network

13 Maximizing the influence  Until now, we show how to compute influence. However, our aim is to find the optimal set of sources nodes A that maximizes influence:  Unfortunately, Theorem. The continuous time influence maximization problem defined by Eq. (1) is NP-hard. (1)

14 Submodular maximization  There is hope! The influence function satisfies a natural diminishing property: Theorem. The influence function is a submodular function in the set of nodes A. We can efficiently obtain a suboptimal solution with a 63% provable guarantee using a greedy algorithm

15 Experimental Setup  We validate our method on:  What is the optimal source set that maximizes influence?  How does the optimal source set change with T?  How fast is the algorithm? Synthetic data 1. Generate network structure 2. Assign transmission rate to each edge in the network 3. Run I NFLU M AX Real data 1. MemeTracker data (172m news articles 08/2009– ) 2. We infer diffusion networks from hyperlink or memes cascades 3. Run I NFLU M AX

16 Influence vs. number of sources 1024-node Hierarchical Kronecker 1024-node Forest Fire 512-node Random Kronecker  Performance does not depend on the network structure:  Synthetic Networks: Forest Fire, Kronecker, etc.  I NFLU M AX typically outputs source sets that results in a 20% higher influence than competitive methods!

17 Influence vs. time horizon T = 0.1 T = 1 The source set outputted by I NFLU M AX can change dramatically with the time horizon T For which time horizon does I NFLU M AX gives the greatest competitive advantage?

18 Influence vs. time horizon  In comparison with other methods, I NFLU M AX performs best for relatively small time horizon node Hierarchical Kronecker network

19 Real data: Influence vs. # of sources 1000-node real network (inferred from hyperlink cascades) 1000-node real network (inferred from MemeTracker cascades)  I NFLU M AX outputs a source set that results in a 20-25% higher influence than competitive methods!

20 Conclusions  We model diffusion and propagation processes in continuous time:  We make minimal assumptions about the physical, biological or cognitive mechanisms responsible for diffusion.  The model uses only the temporal traces left by diffusion.  Including continuous temporal dynamics allows us to evaluate influence analytically using CTMCs.  Once we compute the CTMC, it is straight forward to evaluate how changes in transmission rates impact influence  Natural follow-up: use event history analysis/hazard analysis to generalize our model

21 Thanks! C ODE & M ORE :