Greedy & Heuristic algorithms in Influence Maximization

Slides:



Advertisements
Similar presentations
Principal Component Analysis Based on L1-Norm Maximization Nojun Kwak IEEE Transactions on Pattern Analysis and Machine Intelligence, 2008.
Advertisements

~1~ Infocom’04 Mar. 10th On Finding Disjoint Paths in Single and Dual Link Cost Networks Chunming Qiao* LANDER, CSE Department SUNY at Buffalo *Collaborators:
Minimizing Seed Set for Viral Marketing Cheng Long & Raymond Chi-Wing Wong Presented by: Cheng Long 20-August-2011.
Lecture 24 Coping with NPC and Unsolvable problems. When a problem is unsolvable, that's generally very bad news: it means there is no general algorithm.
Spread of Influence through a Social Network Adapted from :
Maximizing the Spread of Influence through a Social Network
Cost-effective Outbreak Detection in Networks Jure Leskovec, Andreas Krause, Carlos Guestrin, Christos Faloutsos, Jeanne VanBriesen, Natalie Glance.
Maximizing the Spread of Influence through a Social Network
Suqi Cheng Research Center of Web Data Sciences & Engineering
In Search of Influential Event Organizers in Online Social Networks
Maximizing the Spread of Influence through a Social Network By David Kempe, Jon Kleinberg, Eva Tardos Report by Joe Abrams.
Based on “Cascading Behavior in Networks: Algorithmic and Economic Issues” in Algorithmic Game Theory (Jon Kleinberg, 2007) and Ch.16 and 19 of Networks,
Date:2011/06/08 吳昕澧 BOA: The Bayesian Optimization Algorithm.
Finding Effectors in Social Networks T. Lappas (UCR), E. Terzi (BU), D. Gunopoulos (UoA), H. Mannila (Aalto U.) Presented by: Eric Gavaletz 04/26/2011.
INFERRING NETWORKS OF DIFFUSION AND INFLUENCE Presented by Alicia Frame Paper by Manuel Gomez-Rodriguez, Jure Leskovec, and Andreas Kraus.
Influence Maximization
Maximal Independent Set Distributed Algorithms for Multi-Agent Networks Instructor: K. Sinan YILDIRIM.
CPSC 689: Discrete Algorithms for Mobile and Wireless Systems Spring 2009 Prof. Jennifer Welch.
Simpath: An Efficient Algorithm for Influence Maximization under Linear Threshold Model Amit Goyal Wei Lu Laks V. S. Lakshmanan University of British Columbia.
Maximizing Product Adoption in Social Networks
Models of Influence in Online Social Networks
1 Introduction to Approximation Algorithms. 2 NP-completeness Do your best then.
1 Introduction to Approximation Algorithms. 2 NP-completeness Do your best then.
Influence Maximization in Dynamic Social Networks Honglei Zhuang, Yihan Sun, Jie Tang, Jialin Zhang, Xiaoming Sun.
Thang N. Dinh, Dung T. Nguyen, My T. Thai Dept. of Computer & Information Science & Engineering University of Florida, Gainesville, FL Hypertext-2012,
DATA MINING LECTURE 13 Pagerank, Absorbing Random Walks Coverage Problems.
December 7-10, 2013, Dallas, Texas
Maximizing the Spread of Influence through a Social Network David Kempe, Jon Kleinberg, Eva Tardos Cornell University KDD 2003.
Graph-based Text Classification: Learn from Your Neighbors Ralitsa Angelova , Gerhard Weikum : Max Planck Institute for Informatics Stuhlsatzenhausweg.
Maximizing the Spread of Influence through a Social Network Authors: David Kempe, Jon Kleinberg, É va Tardos KDD 2003.
Online Social Networks and Media
Lecture 3-1 Independent Cascade Weili Wu Ding-Zhu Du University of Texas at Dallas.
CS 484 Load Balancing. Goal: All processors working all the time Efficiency of 1 Distribute the load (work) to meet the goal Two types of load balancing.
LOCALIZED MINIMUM - ENERGY BROADCASTING IN AD - HOC NETWORKS Paper By : Julien Cartigny, David Simplot, And Ivan Stojmenovic Instructor : Dr Yingshu Li.
1 Latency-Bounded Minimum Influential Node Selection in Social Networks Incheol Shin
IMRank: Influence Maximization via Finding Self-Consistent Ranking
Algorithms For Solving History Sensitive Cascade in Diffusion Networks Research Proposal Georgi Smilyanov, Maksim Tsikhanovich Advisor Dr Yu Zhang Trinity.
Speaker : Yu-Hui Chen Authors : Dinuka A. Soysa, Denis Guangyin Chen, Oscar C. Au, and Amine Bermak From : 2013 IEEE Symposium on Computational Intelligence.
Biao Wang 1, Ge Chen 1, Luoyi Fu 1, Li Song 1, Xinbing Wang 1, Xue Liu 2 1 Shanghai Jiao Tong University 2 McGill University
Yu Wang1, Gao Cong2, Guojie Song1, Kunqing Xie1
More NP-Complete and NP-hard Problems
Cohesive Subgraph Computation over Large Graphs
Seed Selection.
Wenyu Zhang From Social Network Group
Nanyang Technological University
Finding Dense and Connected Subgraphs in Dual Networks
Independent Cascade Model and Linear Threshold Model
Approximation algorithms
Maximal Independent Set
Lectures on Network Flows
Friend Recommendation with a Target User in Social Networking Services
Independent Cascade Model and Linear Threshold Model
Maximizing the Spread of Influence through a Social Network
The Importance of Communities for Learning to Influence
Effective Social Network Quarantine with Minimal Isolation Costs
Maximal Independent Set
3.5 Minimum Cuts in Undirected Graphs
On the effect of randomness on planted 3-coloring models
Algorithms for Budget-Constrained Survivable Topology Design
Cost-effective Outbreak Detection in Networks
Richard Anderson Autumn 2016 Lecture 7
Bharathi-Kempe-Salek Conjecture
Lecture 14 Shortest Path (cont’d) Minimum Spanning Tree
CSE 373: Data Structures and Algorithms
Alan Kuhnle*, Victoria G. Crawford, and My T. Thai
Viral Marketing over Social Networks
Discovering Influential Nodes From Social Trust Network
Independent Cascade Model and Linear Threshold Model
Lecture 13 Shortest Path (cont’d) Minimum Spanning Tree
Lecture 2-6 Complexity for Computing Influence Spread
Presentation transcript:

Greedy & Heuristic algorithms in Influence Maximization Jingtao Zhu May 13rd,2016

“Efficient Influence Maximization in Social Networks” Written by Chen Wei, Yajun Wang, and Siyu Yang. Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM, 2009.

Scenery Market a cool online application through social network; limited budget- only select a small number of initial users; The company wishes that through the word-of-mouth effect a large population in the social network would adopt the application. The problem is whom to select as the initial users so that they eventually influence the largest number of people in the network, i.e., the problem of finding influential individuals in a social network.

Problem Description Find a small subset of nodes in a social network that could maximize the spread of influence. Kempe et al. prove that the optimization problem is NP-hard, and present a greedy approximation algorithm applicable to all three models. They also show through experiments that their greedy algorithm significantly outperforms the classic degree and centrality-based heuristics in influence spread. However, their algorithm has a serious drawback, which is its efficiency.

Two proposed solutions 1. Improved the original greedy algorithm, further reduce its running time. 2.Proposed new degree discount heuristics.

Solution I 1. Original greedy algorithm General idea:In each round i, the algorithm adds one vertex into the selected set S such that this vertex together with current set S maximizes the influence spread (Line 10). Equivalently, this means that the vertex selected in round i is the one that maximizes the incremental influence spread in this round. To do so, for each vertex v ∈V/ S, the influence spread of S ∪ {v} is estimated with R repeated simulations of RanCas(S ∪ {v}) (Lines 3–9). Each calculation of RanCas(S) takes O(m) time, and thus takes O(knRm) time to complete.

How to reduce the running time? CELF optimization Based on submodularity of the influence maximization objective. Submodularity property:when adding a vertex v into a seed set S, the incremental influence spread as the result of adding v is larger if S is smaller. In each round the incremental influence spread of a large number of nodes do not need to be re-evaluated because their values in the previous round are already less than that of some other node evaluated in the current round. 700 times faster

Improved Greedy :Independent Cascade Model

2.Improved Greedy: Independent cascade model For Independent Cascade:Each node may be active or inactive;Time proceeds at discrete time-steps. At time t, every node v that became active in time t-1 actives a non-active neighbor w with probability Puv. If it fails, it does not try again.The same as the simple SIR model. 1.construct a graph G’ 2.Obtain G’ by removing all edges not for propagation from G with Pr(1-p) 3.Use DFS/BFS to find out the set of vertices reachable from S in G’ Same influence spread with 15-34% shorter run time

Weighted cascade model The probability of u activating v is usually not the same as the probability of v activating u. Because of this, we build a directed graph G’ = (V, E’), in which each edge is replaced by two directed edges u~v and v~u. We still use dv to denote the degree of v in the original graph. The same idea from the IC model, in each round of the greedy algorithm when selecting a new vertex to be added into the existing seed set S, we generate R random directed graphs G0 = RanWC(Gˆ). For each vertex v and each graph G0, we want to compute |RG0 (S [ {v})|, and then average among all G0 to obtain the influence spread of S {v} and select v that maximizes this value.

How to solve the running time for WC In the IC model, it takes O(m) time total since G′ is an undirected graph. In the WC model, G′is a directed graph, making the algorithm non-trivial. A straightforward implementation using BFS from all vertices take O(mn) time , which is not as good as O(mn) for sparse graphs such as social network graphs. Solution: adapt the randomized algorithm of Cohen for estimating the number of all reachable vertices from every vertex.

Cohen’s algorithm

3.Mixed-Greedy WC algorithm First round uses NewGreedyWC algorithm ; the remaining rounds use the CELF optimization.

Solution II: Degree Discount Heuristics let v be a neighbor of vertex u. If u has been selected as a seed, then when considering selecting v as a new seed based on its degree, we should not count the edge vu towards its degree. Thus we discount v’s degree by one due to the presence of u in the seed set, and we do the same discount on v’s degree for every neighbor of v that is already in the seed set. This is a basic degree discount heuristic applicable to all cascade models,

For a vertex v with tv neighbors already selected as seeds, we should discount v’s degree by For example, for a node v with dv = 200, tv = 1, and p = 0.01 (parameters similar to our experimental graphs), we should discount v’s degree to about 196.

Assumptions and Limits 1.Not consider other factors, such as indirect influence effects and selected seeds affecting the neighbors of v. 2.Suppose the difference of those effects between the case tv = 0 and tv > 0 is negligible for small p.

Algorithms evaluation Large academic collaboration graphs from online archival database arXiv.org. Co-author network: author-node edge: two authors collaborated The first network is from the "High Energy Physics - Theory" section with papers form 1991 to 2003, which contains n = 15, 233 nodes and m=58,891edges. The second network is from the full paperlist of the "Physics" section, denoted as NetPHY, which contains n = 37,154nodes and m =231,584edges.

Results SingleDiscount heuristic, although just a simple adjustment to the Degree heuristic, reduced approximately half of the gap between Greedy and Degree. NewGreedyIC and MixedGreedyIC essentially matches CELF- Greedy on both graphs. DegreeDiscountIC heuristic performs extremely well.

Summary Greedy running time is shortened by mixed greedy alg. Digree discount Heuristic >> classic heuristic Heuristic > greedy

The current influence maximization problem is simplified, without considering other features in the social networks, such as indirect influence effects and selected seeds affecting the neighbors of vertices.

Thanks!