On Flow Authority Discovery in Social Networks Arijit Khan, Xifeng Yan Computer Science University of California, Santa Barbara {arijitkhan,

Presentation transcript:

On Flow Authority Discovery in Social Networks
Charu C. Aggarwal, IBM T.J. Watson Research Center, Hawthorne, New York
Arijit Khan, Xifeng Yan, Computer Science, University of California, Santa Barbara {arijitkhan,

Motivation
• Online marketing via "word-of-mouth" recommendations.
• Find a small subset of influential individuals in a social network, such that they can influence the largest number of people in the network.

Motivation (continued)
• Fast and widespread information cascade: for example, through Facebook and Twitter, news of the 2011 Egyptian protests quickly reached protesters worldwide.
[Figure: Influence Propagation in Social Network]

Roadmap
• Problem Formulation
• Related Work
• Algorithm
  - Ranked Replace
  - Bayes Traceback
• Restricted Source and Targets
• Experimental Results
• Conclusion

Problem Formulation
• Directed graph $G(V, E, P)$.
• $P : E \to [0, 1]$ gives the probability of information cascade through a directed edge.
• Let $p_{ij}$ be the probability of information cascade along the directed edge $e_{ij}$. Then $P = [p_{ij}]$.
• If $r_i$ is the probability that a given node i contains a piece of information, then node i eventually transmits it to an adjacent node j with probability $r_i \, p_{ij}$.
[Figure: Influence Cascade Model. Edge (i, j) is labeled with $p_{ij}$ and $1 - p_{ij}$; node i is labeled with $r_i$ and $1 - r_i$.]

Problem Definition
• Let $\pi(i)$ be the steady-state probability that node i assimilates the information.
• S is the initial set of seed nodes, where the information was first exposed.
• Problem definition: given the budget constraint k, determine the set S of k nodes that maximizes the total aggregate flow $\sum_{i \in V} \pi(i)$.
[Figure: Influence Cascade Model]
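
The slide does not preserve the update rule for the steady-state probabilities, so the following is only a sketch under stated assumptions. It assumes that incoming transmissions are independent, so that a non-seed node i satisfies $\pi(i) = 1 - \prod_{l} (1 - \pi(l) \, p_{li})$, that seed nodes are fixed at probability 1, and that the aggregate flow is the sum of all $\pi(i)$, seeds included. The function names and the edge-dictionary representation are hypothetical.

```python
from collections import defaultdict

def steady_state_probs(edges, seeds, num_iters=50):
    """Fixed-point iteration for the assumed rule
    pi(i) = 1 - prod_l (1 - pi(l) * p_li), with seeds pinned at 1.

    edges: dict mapping (l, i) -> p_li (hypothetical representation)
    seeds: collection of seed nodes
    """
    seeds = set(seeds)
    nodes = {u for edge in edges for u in edge} | seeds
    incoming = defaultdict(list)
    for (l, i), p in edges.items():
        incoming[i].append((l, p))

    pi = {v: (1.0 if v in seeds else 0.0) for v in nodes}
    for _ in range(num_iters):
        new_pi = {}
        for v in nodes:
            if v in seeds:
                new_pi[v] = 1.0
                continue
            miss = 1.0  # probability that no in-neighbor transmits
            for l, p in incoming[v]:
                miss *= 1.0 - pi[l] * p
            new_pi[v] = 1.0 - miss
        pi = new_pi
    return pi

def aggregate_flow(edges, seeds, num_iters=50):
    """Total aggregate flow: sum of the steady-state probabilities."""
    return sum(steady_state_probs(edges, seeds, num_iters).values())

# Toy chain a -> b -> c, each edge with probability 0.5.
chain = {("a", "b"): 0.5, ("b", "c"): 0.5}
print(aggregate_flow(chain, {"a"}))  # 1.0 + 0.5 + 0.25 = 1.75
```

A fixed number of sweeps is used for simplicity; a convergence tolerance would be the natural refinement.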

Related Work
• Kempe, Kleinberg, Tardos. KDD '03:
  - Linear Threshold Model: a node becomes active at time t if more than a certain fraction of its neighbors were active at time t-1.
  - Independent Cascade Model: each newly active node i gets a single chance to activate its inactive neighbor j, and succeeds with probability $p_{ij}$.
  - Greedily select the best possible seed node given the already selected seed nodes.
• Chen, Wang, Yang. KDD '09: Degree Discount Independent Cascade Model.
• Wang, Cong, Song, Xie. KDD '10: Community Based Greedy Algorithm for Influential Nodes Detection.
• Lappas, Terzi, Gunopulos, Mannila. KDD '10: k-effectors that maximize influence on a given set of nodes and minimize influence outside the set.

Ranked Replace Algorithm
• An iterative, heuristic technique.
• Initialization:
  - Calculate the steady-state flow (SSF) of each node u in V, defined as the aggregate flow generated by node u individually; that is, SSF(u) is the total aggregate flow when S = {u}.
  - Sort all nodes in V in descending order of their steady-state flow.
• Preliminary seed selection:
  - Select the k nodes with the highest SSF values as the preliminary seed nodes in S.

Ranked Replace Algorithm (continued)
• Iterative improvement of the seed nodes:
  - Replace a node in S with a node in (V-S) if the swap increases the total aggregate flow.
  - The seed nodes in S are considered for replacement in increasing order of their SSF values.
  - The nodes from (V-S) are selected in decreasing order of their SSF values.
  - If r successive replacement attempts do not increase the aggregate flow, terminate and return S.
[Figure: the sets S and V-S, ordered by SSF]
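
The Ranked Replace steps can be sketched as follows. This is an illustrative reading of the slides, not the authors' implementation: the slides do not say exactly how seeds and candidates are paired, so the sketch restarts the scan after every successful swap. The `flow` argument is a hypothetical callable that returns the total aggregate flow of a seed set, and the coverage-based toy flow below is a stand-in chosen only to make the example runnable.

```python
def ranked_replace(nodes, flow, k, r):
    """Heuristic seed selection by ranked replacement.

    nodes: iterable of node ids
    flow:  hypothetical callable, frozenset of seeds -> aggregate flow
    k:     number of seed nodes to return
    r:     number of successive failed swaps before terminating
    """
    nodes = list(nodes)
    # Initialization: steady-state flow of each node on its own.
    ssf = {u: flow(frozenset([u])) for u in nodes}
    ranked = sorted(nodes, key=ssf.get, reverse=True)

    # Preliminary seed selection: the k nodes with the highest SSF.
    seeds = set(ranked[:k])
    best = flow(frozenset(seeds))
    failures = 0

    while True:
        swapped = False
        for s in sorted(seeds, key=ssf.get):          # weakest seed first
            for c in (u for u in ranked if u not in seeds):  # strongest outsider first
                trial = frozenset((seeds - {s}) | {c})
                value = flow(trial)
                if value > best:
                    seeds, best, failures, swapped = set(trial), value, 0, True
                    break
                failures += 1
                if failures >= r:
                    return seeds, best
            if swapped:
                break
        if not swapped:  # no swap improves the flow any further
            return seeds, best

# Toy flow: number of distinct items "covered" by the seed set.
cover = {"a": {1, 2, 3, 4}, "b": {1, 2, 3}, "c": {5, 6}, "d": {7}}
toy_flow = lambda seed_set: len(set().union(*(cover[u] for u in seed_set)))
print(ranked_replace(cover, toy_flow, k=2, r=10))
```

In the toy example the two nodes with the highest individual flow ("a" and "b") overlap heavily, so a single replacement of "b" by "c" raises the aggregate flow from 4 to 6.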

Problem with Ranked Replace
• Each iteration of Ranked Replace is computationally expensive: O(t·|E|), where t is the number of iterations required to reach the steady-state probabilities.
• The number of iterations required for Ranked Replace to converge can be very large: O(|V|).
• Slow!

Bayes Traceback Algorithm
• A piece of information is viewed as a packet.
• The packet at a node j is inherited from one of its incoming nodes i with probability proportional to $p_{ij}$, following a random walk.
• There is a single information packet, which is (stochastically) present at only one node at a time.
• Expose the information packet to one of the k seed nodes.
• The packet then visits the nodes of the network following a random walk, so it can visit a node multiple times.
[Figure: Bayes Traceback Model, with seed set S]

Bayes Traceback Model (continued)
• Transient state: each node in the graph has equal probability of having the packet.
• An even spread of information may not be possible in the steady state; the goal is instead to create an evenly spread probability distribution as an intermediate transient state after a small number of random-walk iterations.
• Identify k seed nodes so that this intermediate transient state is reached as quickly as possible.
• Intuitively, these k nodes correspond to the seed nodes that result in the maximum aggregate flow in the network.

Bayes Traceback Algorithm
• Starting from the transient state at t = 0, trace back the previous states using Bayes' rule.
• $Q_{-t}(i)$ = probability that node i has the information packet at time -t, that is, t steps before the transient state.
• At each iteration, delete a fraction of the nodes with low probabilities of having the information packet. Iterate until only k nodes remain.
• Example: $Q_{-t}(B) = 0.5$, $Q_{-t}(C) = 0.3$
• $Q_{-(t+1)}(A)$ = 0.5*0.3/( ) + 0.3*1.0/( ) =
[Figure: Bayes Traceback Method on nodes A, B, C]
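
A minimal sketch of the traceback loop, under assumptions the slides do not confirm. It assumes that the packet at node j came from in-neighbor i with probability $p_{ij} / \sum_{l} p_{lj}$, which gives $Q_{-(t+1)}(i) = \sum_{j} Q_{-t}(j) \, p_{ij} / \sum_{l} p_{lj}$; that a node with no remaining in-neighbors keeps its own probability mass; and that probabilities are renormalized after each pruning step. All function names are hypothetical.

```python
def traceback_step(q, edges):
    """One backward step: push each node's probability to its in-neighbors.

    q:     dict node -> probability of holding the packet now
    edges: dict (i, j) -> p_ij; edges touching pruned nodes are ignored
    """
    in_weight = {}
    for (i, j), p in edges.items():
        if i in q and j in q and p > 0:
            in_weight[j] = in_weight.get(j, 0.0) + p

    prev = {v: 0.0 for v in q}
    for (i, j), p in edges.items():
        if i in q and j in q and p > 0:
            prev[i] += q[j] * p / in_weight[j]
    # Assumption: a node without in-neighbors must already have held the packet.
    for v in q:
        if v not in in_weight:
            prev[v] += q[v]
    return prev

def bayes_traceback(nodes, edges, k, f):
    """Trace back from the uniform transient state, pruning a fraction f
    of the lowest-probability nodes per iteration until k nodes remain."""
    nodes = list(nodes)
    q = {v: 1.0 / len(nodes) for v in nodes}
    while len(q) > k:
        q = traceback_step(q, edges)
        drop = min(max(1, int(f * len(q))), len(q) - k)
        keep = sorted(q, key=q.get, reverse=True)[: len(q) - drop]
        total = sum(q[v] for v in keep)
        q = {v: (q[v] / total if total > 0 else 1.0 / len(keep)) for v in keep}
    return set(q)

# Toy star graph: hub h points to three leaves.
star = {("h", "a"): 0.5, ("h", "b"): 0.5, ("h", "c"): 0.5}
print(bayes_traceback(["h", "a", "b", "c"], star, k=1, f=0.5))  # {'h'}
```

On the star graph every leaf can only have received the packet from the hub, so one backward step moves all probability mass to "h" and the leaves are pruned.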

Running Time of Bayes Traceback
• Each iteration of Bayes Traceback has complexity O(|E|).
• If a fraction f of the remaining nodes is deleted in each iteration, the number of iterations required by Bayes Traceback is log(n/k) / log(1/(1-f)), where n = |V|.
• Fast!
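
The iteration count follows from requiring $n (1 - f)^t \le k$, which rearranges to $t \ge \log(n/k) / \log(1/(1-f))$. The small helper below evaluates that bound, rounded up to a whole iteration; the function name and the sample values of n, k and f are illustrative only.

```python
import math

def traceback_iterations(n, k, f):
    """Smallest whole t with n * (1 - f)**t <= k, per the slide's formula."""
    return math.ceil(math.log(n / k) / math.log(1.0 / (1.0 - f)))

# Halving 1000 nodes per iteration down to 10 seeds: log2(100) ~ 6.64.
print(traceback_iterations(1000, 10, 0.5))  # 7
```

Because the count grows only logarithmically in n/k, the total cost stays close to a small multiple of O(|E|) even for large graphs.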

Restricted Source and Targets
• Restricted targets: maximize the flow in a given set of target nodes, although the entire graph structure can be used.
• Restricted source: the initial k seed nodes can be selected only from a given set of candidate nodes.
• Solutions to both problems are straightforward for the Ranked Replace algorithm.
• For the restricted source problem in the Bayes Traceback method, delete nodes until only k nodes from the given set of candidate nodes are left.

Restricted Source and Targets (continued)
• For the restricted target problem in the Bayes Traceback method, the target nodes are treated as sink nodes: flow is not propagated from a target node to a non-target node, but it is propagated from non-target nodes to target nodes.
• Example: $Q_{-t}(B) = 0.5$, $Q_{-t}(C) = 0.3$
• $Q_{-(t+1)}(A)$ = 0.5*0.3/( ) + 0.3*1.0/( ) = 0.1
[Figure: Bayes Traceback with Restricted Target, on nodes A, B, C]
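
One way to read the sink-node rule is as a filter on the edge set applied before running the traceback: every edge that leaves the target set is dropped, while edges into the target set and edges inside it are kept. This is an assumption about how the rule maps onto the graph, and the helper name and edge-dictionary format are hypothetical.

```python
def restrict_to_targets(edges, targets):
    """Treat target nodes as sinks by dropping target -> non-target edges.

    edges:   dict (i, j) -> p_ij
    targets: collection of target nodes
    """
    targets = set(targets)
    return {
        (i, j): p
        for (i, j), p in edges.items()
        if not (i in targets and j not in targets)
    }

# B and C are targets; the edge B -> A leaves the target set and is removed.
example = {("A", "B"): 0.3, ("B", "A"): 0.2, ("A", "C"): 1.0, ("B", "C"): 0.4}
print(sorted(restrict_to_targets(example, {"B", "C"})))
```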

Experimental Results
• Data sets:
  Data set    # of Nodes    # of Edges
  Last.FM     818,800       3,340,954
  DBLP        684,911       7,764,604
  Twitter     1,194,092     6,450,193
• Top-5 flow authorities in DBLP (table columns: Ranked Replace, Bayes Traceback, Peer Influence, Degree Discount IC), listed row by row as extracted:
  1. Wen Gao; Luigi Fortuna; Wei Li
  2. Francky Catthor; Philip S Yu; Dipanwita R. C.; Wei Wang
  3. Philip S Yu; M T Kandemir; Timothy Sullivan; Li Zhang
  4. M T Kandemir; Francky Catthor; Wei Li; Ian T Foster
  5. A L S Vincentelli; S C Lin; Wei Zhang

Effectiveness Results
[Figure: Effectiveness Results (DBLP); k = number of flow authority nodes]

Efficiency Results
[Figure: Efficiency Results (DBLP); k = number of flow authority nodes]

Conclusion
• Novel algorithms for determining optimal flow authorities in social networks.
• The algorithms empirically outperform existing algorithms for optimal flow authority detection in graphs.
• They can be easily extended to the restricted source and target set problems.
• Open question: how should the algorithms be modified in the presence of negative information flows?
