Yu Wang1, Gao Cong2, Guojie Song1, Kunqing Xie1


Community-based Greedy Algorithm for Mining Top-K Influential Nodes in Mobile Social Networks
Yu Wang1, Gao Cong2, Guojie Song1, Kunqing Xie1
1 Peking University, China
2 Nanyang Technological University, Singapore

Problem and Background
Problem: Given a mobile social network, we aim to mine a set S of top-K influential nodes such that the influence degree R(S) is maximized under the extended Independent Cascade information diffusion model.
A mobile social network plays an essential role in spreading information and influence in the form of "word-of-mouth".
The problem is NP-hard, and it is computationally expensive to run the greedy algorithm on a large network.
In our experiments, the previous greedy algorithms take days to finish on a network with 723k nodes.

Basic Idea of the Algorithm
Construct the network from CDR (call detail record) data.
Community detection: based on the diffusion model on the mobile social network (MSN).
Dynamic programming algorithm and greedy algorithm on selected communities.

Step 1: Extracting the Mobile Social Network
Extract a mobile social network from CDR data and model it as a directed weighted graph:
A phone user corresponds to a node.
A directed edge u → v from node u to node v is established if there exists communication from u to v.
The corresponding communication time is the weight of the edge.
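The graph construction above can be sketched in a few lines of Python. This is only an illustration: the record layout (caller, callee, seconds), the function name build_call_graph, and the choice to skip self-calls are assumptions, not details taken from the slides.

```python
from collections import defaultdict

def build_call_graph(records):
    """Aggregate call records into a directed weighted graph.

    records: iterable of (caller, callee, seconds) tuples (assumed layout).
    Returns {u: {v: total_seconds}}, i.e. edge u -> v weighted by the
    total communication time from u to v.
    """
    graph = defaultdict(dict)
    for caller, callee, seconds in records:
        if caller == callee:
            continue  # assumption: a self-call does not create an edge
        graph[caller][callee] = graph[caller].get(callee, 0) + seconds
    return {u: dict(nbrs) for u, nbrs in graph.items()}

# Toy records: two calls a -> b, one call b -> a, one call a -> c.
records = [("a", "b", 60), ("a", "b", 30), ("b", "a", 10), ("a", "c", 5)]
graph = build_call_graph(records)
```

Note that the edges a → b and b → a are kept separately, since the graph is directed.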

Extended Independent Cascade Model
Nodes have two states: active and inactive.
Diffusion speed λ: when an active node vi contacts an inactive node vj, the inactive node becomes active with probability (rate) λij.
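A minimal simulation sketch of a cascade of this kind, assuming the per-edge probabilities λij are already given (the slides do not say how they are derived from communication time) and assuming the standard Independent Cascade rule that each newly active node gets a single chance to activate each inactive neighbor. The names simulate_ic and estimate_influence are hypothetical.

```python
import random

def simulate_ic(graph, seeds, rng):
    """Run one cascade and return the set of active nodes at the end.

    graph: {u: {v: prob}}, where prob plays the role of lambda_ij (assumed given).
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, prob in graph.get(u, {}).items():
                # u tries once to activate each still-inactive neighbor v
                if v not in active and rng.random() < prob:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active

def estimate_influence(graph, seeds, runs=1000, seed=0):
    """Monte Carlo estimate of the expected number of active nodes."""
    rng = random.Random(seed)
    return sum(len(simulate_ic(graph, seeds, rng)) for _ in range(runs)) / runs

# Toy graph: a reaches b with probability 0.5, b always reaches c.
toy = {"a": {"b": 0.5}, "b": {"c": 1.0}}
spread = estimate_influence(toy, ["a"])
```

Averaging over many runs is what makes greedy selection expensive on large networks: every candidate seed set needs its own batch of simulations.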

Extended Independent Cascade Model

Step 2: Influential Model Based Community Detection Algorithm
Community Partition
Each node is assigned a unique community label from 1 to N.
For each node, compute the set of its influenced neighbors using the Independent Cascade diffusion model.
Iteratively propagate the labels through the network for a finite number of iterations: for each node v, the label of the community that the majority of its influenced neighbors belong to becomes the label of v.
Notation: t denotes the t-th iteration; sv denotes the number of neighbors that are influenced by v; ui (i ∈ [1, sv]) represents an influenced neighbor of node v; ui.Ct−1 represents the community label of ui at iteration t−1; maxCMT computes the majority label among the ui.Ct−1.
Community Combination
Communities are combined so that the difference between a node's influence degree in its community and its influence degree in the network is smaller than a threshold.
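The community partition step can be sketched as a synchronous label propagation. This is a hedged illustration, not the authors' code: the influenced-neighbor sets are assumed to be precomputed, ties in the majority vote are broken by the smallest label, and max_iters is a hypothetical cap on the number of iterations.

```python
from collections import Counter

def propagate_labels(influenced, max_iters=20):
    """influenced: {v: list of neighbors influenced by v} (assumed precomputed).

    Returns {v: community label}. Labels at iteration t are computed only
    from labels at iteration t-1, as in the notation above.
    """
    nodes = sorted(influenced)
    labels = {v: i + 1 for i, v in enumerate(nodes)}  # unique labels 1..N
    for _ in range(max_iters):
        new_labels = {}
        for v in nodes:
            counts = Counter(labels[u] for u in influenced[v] if u in labels)
            if not counts:
                new_labels[v] = labels[v]  # no influenced neighbors: keep label
                continue
            top = max(counts.values())
            # majority label; ties go to the smallest label (assumption)
            new_labels[v] = min(l for l, c in counts.items() if c == top)
        if new_labels == labels:
            break  # labels are stable
        labels = new_labels
    return labels

# Two disjoint triangles should end up as two communities.
influenced = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"],
    "d": ["e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
labels = propagate_labels(influenced)
```

Synchronous updates can oscillate on some structures (for example, two nodes that influence only each other), which is one reason to bound the number of iterations.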

Step 3: Community-Based Greedy Algorithm
Choose the community in which to find the top-1 influential node.
Communities and marginal gains: C1 with ΔR1 = 0.2, C2 with ΔR2 = 0.3, C3 with ΔR3 = 0.1.
R[1,1] = max{R[0,1], R[3,0] + ΔR1} = 0.2, s[1,1] = C1
R[2,1] = max{R[1,1], R[3,0] + ΔR2} = 0.3, s[2,1] = C2
R[3,1] = max{R[2,1], R[3,0] + ΔR3} = 0.3, s[3,1] = C2
So we mine the top-1 node in C2.

Community-Based Greedy Algorithm
Choose the community in which to find the second influential node.
Communities and marginal gains: C1 with ΔR1 = 0.2, C2 with ΔR2 = 0.06, C3 with ΔR3 = 0.1. Note that ΔR2 is now 0.06, not 0.3.
R[1,2] = max{R[0,2], R[3,1] + ΔR1} = 0.5, s[1,2] = C1
R[2,2] = max{R[1,2], R[3,1] + ΔR2} = 0.5, s[2,2] = C1
R[3,2] = max{R[2,2], R[3,1] + ΔR3} = 0.5, s[3,2] = C1
So we mine the second node in C1.
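The two worked examples follow the recurrence R[m,k] = max{R[m−1,k], R[M,k−1] + ΔRm}, where M is the number of communities. A small sketch reproduces the numbers, assuming R[0,k] = 0 and R[M,0] = 0 (the slides do not state these boundary values) and taking the marginal gains ΔRm as given inputs. The function name choose_community is hypothetical.

```python
def choose_community(prev_total, gains):
    """One round k of the recurrence R[m,k] = max(R[m-1,k], R[M,k-1] + dR_m).

    prev_total: R[M,k-1], the influence degree after k-1 selected nodes.
    gains: [dR_1, ..., dR_M], marginal gain of the best node in each community.
    Returns (R[M,k], s[M,k]) with s as a 1-based community index.
    """
    best, chosen = 0.0, None  # assumption: R[0,k] = 0
    for m, gain in enumerate(gains, start=1):
        candidate = prev_total + gain
        if candidate > best:  # on ties, keep the earlier community
            best, chosen = candidate, m
    return best, chosen

# Round 1: gains 0.2, 0.3, 0.1 for C1, C2, C3.
total1, community1 = choose_community(0.0, [0.2, 0.3, 0.1])
# Round 2: the gain in C2 has dropped to 0.06 after its best node was taken.
total2, community2 = choose_community(total1, [0.2, 0.06, 0.1])
```

Here community1 is 2 (C2) and community2 is 1 (C1), matching s[3,1] = C2 and s[3,2] = C1 in the examples.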

CGA: Community-Based Greedy Algorithm
Θ and Δd are two constants.

Experiments: Data Sets
Extract a mobile social network from three months of CDR (call detail record) data of a city from China Mobile.
Node number: 723,201
Average degree: 13.4

Community distribution
Largest community size: 95,690

Experiments
Top-K nodes mining methods: MixedGreedy Algorithm, NewGreedy Algorithm, DegreeDiscount, Random Method, CGA, SPCGA
Parameter study: K, diffusion speed λ, data size

Results: Influence degree and time vs. K

Results: Influence degree and time vs. diffusion speed λ
The efficiency of MixedGreedy drops quickly (almost exponentially), while CGA fares much better.

Results: Influence degree and time vs. network size
The influence degree is relatively stable.

Summary
Handles large-scale networks (power-law degree distribution).
Improves the efficiency of existing algorithms by an order of magnitude, while the loss in approximation precision is small.
Can be combined with any existing algorithm to find influential nodes w.r.t. communities.

Related Work on Top-K Algorithms
Typical Greedy Algorithm (Kempe et al. KDD2003)
CELF Greedy Algorithm (Leskovec et al. KDD2007)
An improved greedy algorithm (Kimura et al. AAAI2007)
NewGreedy Algorithm, MixedGreedy, DegreeDiscount Algorithm (Chen et al. KDD2009)
MIA algorithm (Chen et al. KDD2010)
None of them considers the community property.

Thank You !

Experiments: Influence degree and time with different θ

Influential Model Based Community Detection Algorithm: Community Combination
Two influence degrees are compared for a node u: its influence degree outside the community Cm, and Rm({u}), its influence degree in its community Cm.
We expect the difference between the node's influence degree in its community and its influence degree in the whole network to be small.
To obtain a set of top-K influential nodes with a good influence degree, we define the combination entropy to measure the connections between two communities, and we combine two communities if the combination entropy between them is larger than a threshold.
L[Cm] contains the influenced neighbors of a node v that make the diffusion degree of v with regard to Cm differ from the diffusion degree of v with regard to the whole network.
We set a threshold θ. If the combination entropy CoEntropy (CElm) of community Cm to community Cl is bigger than θ, then Cm and Cl are combined.

Problem Statement Influence Degree Given a mobile social network G = (V, E, W), we aim to mine a set of top-K influential nodes S on the network such that R(S) is maximized using the extended Independent Cascade information diffusion model.

Related Work on Top-K Algorithms
Typical Greedy Algorithm (Kempe et al. KDD2003): (1-1/e)-approximation, O(KNRM)
CELF Greedy Algorithm (Leskovec et al. KDD2007): (1-1/e)-approximation, 700 times faster
NewGreedy Algorithm (Chen et al. KDD2009): (1-1/e)-approximation, O(KRM)
DegreeDiscount Algorithm (Chen et al. KDD2009): no precision guarantee, O(KlogN+M)
All of them use the IC information diffusion model.
None of them considers the community property.
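For orientation, the typical greedy scheme listed above can be sketched as plain hill climbing: in each of K rounds, add the node with the largest marginal gain in influence. This is a generic sketch under stated assumptions, not code from any of the cited papers. The callable spread is hypothetical and stands in for an estimate of R(S); in the toy usage it is a deterministic coverage function rather than a diffusion simulation.

```python
def greedy_top_k(nodes, k, spread):
    """Pick k seeds by repeatedly adding the node with the largest marginal gain.

    spread: callable taking a list of seeds and returning their influence
    (hypothetical stand-in for an estimate of R(S)).
    """
    seeds = []
    for _ in range(k):
        base = spread(seeds)
        best_node, best_gain = None, float("-inf")
        for v in nodes:
            if v in seeds:
                continue
            gain = spread(seeds + [v]) - base  # marginal gain of adding v
            if gain > best_gain:
                best_node, best_gain = v, gain
        if best_node is None:
            break  # fewer than k candidate nodes
        seeds.append(best_node)
    return seeds

# Toy spread: number of distinct nodes reached by the seed set.
reach = {"a": {"a", "b", "c"}, "b": {"b", "c"}, "d": {"d", "e"}}

def coverage(seeds):
    covered = set()
    for s in seeds:
        covered |= reach[s]
    return len(covered)

top2 = greedy_top_k(["a", "b", "d"], 2, coverage)
```

The nested loops make the cost visible: K rounds, each evaluating the spread of up to N candidate nodes. Restricting each round to selected communities is how a community-based variant reduces this work.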

Outline
Research Background
Related Work
Preliminaries and Problem Statement
Top-K Nodes Mining Algorithm
Experiments