Learning Influence Probabilities in Social Networks
Amit Goyal (1), Francesco Bonchi (2), Laks V. S. Lakshmanan (1)
(1) U. of British Columbia, (2) Yahoo! Research

Word of Mouth and Viral Marketing
- We are more influenced by our friends than by strangers.
- 68% of consumers consult friends and family before purchasing home electronics (Burke 2003).

Viral Marketing
- Also known as Target Advertising.
- Initiate a chain reaction through the word-of-mouth effect.
- Low investment, maximum gain.

Viral Marketing as an Optimization Problem
- Given: a network with influence probabilities.
- Problem: select the top-k users such that, by targeting them, the spread of influence is maximized.
- Domingos et al 2001, Richardson et al 2002, Kempe et al.
- How do we calculate the true influence probabilities?

Some Questions
- Where do those influence probabilities come from? Available real-world datasets don't come with probabilities!
- Can we learn those probabilities from available data?
- Previous Viral Marketing studies ignore the effect of time. How can we take time into account?
- Do probabilities change over time? Can we predict the time at which a user is most likely to perform an action?
- Which users/actions are more prone to influence?

Input Data
- We focus on actions.
- Input:
  - Social graph: e.g. P and Q become friends at time 4.
  - Action log: e.g. user P performs action a1 at time unit 5.

User  Action  Time
P     a1      5
Q     a1      10
R     a1      15
Q     a2      12
R     a2      14
R     a3      6
P     a3      14
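As a rough illustration of the two inputs, the sketch below stores the action log as (user, action, time) tuples and the social graph as a map from an undirected tie to its creation time. The names `ActionTuple`, `ACTION_LOG`, `FRIENDS_SINCE` and `group_by_action` are hypothetical. The actions for the rows (Q, 10) and (P, 14) are taken to be a1 and a3 respectively, which is an assumption based on the ordering of the slide's table.

```python
from collections import defaultdict
from typing import NamedTuple


class ActionTuple(NamedTuple):
    user: str
    action: str
    time: int


# Action log from the slide (actions of rows (Q, 10) and (P, 14) are assumed).
ACTION_LOG = [
    ActionTuple("P", "a1", 5),
    ActionTuple("Q", "a1", 10),
    ActionTuple("R", "a1", 15),
    ActionTuple("Q", "a2", 12),
    ActionTuple("R", "a2", 14),
    ActionTuple("R", "a3", 6),
    ActionTuple("P", "a3", 14),
]

# Undirected social graph: tie -> time at which the tie was created.
# Only the P-Q tie (time 4) is stated on the slide.
FRIENDS_SINCE = {frozenset({"P", "Q"}): 4}


def group_by_action(log):
    """Return {action: [(user, time), ...]} with each list in chronological order."""
    by_action = defaultdict(list)
    for row in sorted(log, key=lambda r: r.time):
        by_action[row.action].append((row.user, row.time))
    return dict(by_action)
```

Grouping by action and ordering by time is convenient because influence is only ever considered between users who performed the same action, from the earlier performer to the later one.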

Our Contributions (1/2)
- Propose several probabilistic influence models between users, consistent with existing propagation models.
- Develop efficient algorithms to learn the parameters of the models.
- Predict whether a user will perform an action or not.
- Predict the time at which she will perform it.

Our Contributions (2/2)
- Introduce metrics of user and action influenceability. High values => genuine influence.
- Validate our models on Flickr.

Overview
[Figure: the social graph and the action log (as on the Input Data slide) are the input from which influence models over users P, Q and R are learned.]

Background

General Threshold (Propagation) Model
- At any point in time, each node is either active or inactive.
- More active neighbors => u is more likely to become active.
- Notation:
  - S = {active neighbors of u}.
  - p_u(S): joint influence probability of S on u.
  - Θ_u: activation threshold of user u.
- When p_u(S) >= Θ_u, u becomes active.

General Threshold Model - Example
[Figure: step-by-step activation example on nodes labeled v, w and x, ending when no further node can be activated. Legend: inactive node, active node, threshold, joint influence probability. Source: David Kempe's slides.]

Our Framework

Solution Framework
- Assuming independence, we define p_{v,u}: the influence probability of user v on user u.
- Consistent with the existing propagation models: monotonicity, submodularity.
- It is incremental, i.e. the joint probability can be updated incrementally when one more neighbor becomes active.
- Our aim is to learn p_{v,u} for all edges.
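The formulas on this slide did not survive in the transcript. Under the stated independence assumption, the standard form of the joint probability is p_u(S) = 1 - prod over v in S of (1 - p_{v,u}), which admits the update p_u(S ∪ {w}) = p_u(S) + (1 - p_u(S)) * p_{w,u}. The sketch below illustrates both, with hypothetical function names; it is not taken from the slides.

```python
from math import prod


def joint_influence(probs):
    """Joint probability that at least one active neighbor influences u,
    assuming the individual influences p_{v,u} are independent."""
    return 1.0 - prod(1.0 - p for p in probs)


def add_neighbor(p_joint, p_new):
    """Update the joint probability when one more neighbor becomes active,
    without revisiting the neighbors already accounted for."""
    return p_joint + (1.0 - p_joint) * p_new


def is_activated(p_joint, theta):
    """General threshold rule: u becomes active once p_u(S) >= theta_u."""
    return p_joint >= theta
```

For example, two active neighbors with probabilities 0.2 and 0.5 give a joint probability of 0.6, whether computed in one shot or by adding the second neighbor incrementally.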

Influence Models
- Static Models
  - Assume that influence probabilities are static and do not change over time.
- Continuous Time (CT) Models
  - Influence probabilities are continuous functions of time.
  - Not incremental, hence very expensive to apply on large datasets.
- Discrete Time (DT) Models
  - Approximation of CT models.
  - Incremental, hence efficient.

Static Models
- 4 variants; Bernoulli is used as the running example.
- Incremental, hence most efficient.
- We omit the details here.
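To make the running example concrete: in a Bernoulli-style static model, every action performed by v is one trial to influence u, so a natural estimate of p_{v,u} is the fraction of v's actions that propagated to u. The sketch below assumes this ratio form, using the counts A_v and A_v2u named on the Learning the Models slide; the function name is hypothetical.

```python
def bernoulli_probability(a_v2u, a_v):
    """Estimate p_{v,u} as (#actions propagated from v to u) / (#actions v performed).

    Returns 0.0 when v performed no action, since there is no evidence of influence.
    """
    if a_v == 0:
        return 0.0
    return a_v2u / a_v
```

If P performed 2 actions and 1 of them propagated to Q, the estimate for the edge P -> Q is 0.5.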

Time Conscious Models
- Do influence probabilities remain constant, independent of time? No.
- We propose the Continuous Time (CT) Model, based on an exponential decay distribution.

Continuous Time Models
- Best model.
- Capable of predicting the time at which a user is most likely to perform the action.
- Not incremental.

Discrete Time Model
- Based on step time functions.
- Incremental.

Evaluation Strategy (1/2)
- Split the action log data into training (80%) and testing (20%).
  - Example tuple: user “James” has joined the “Whistler Mountain” community at time 5.
- In the testing phase, we ask the model to predict whether a user will become active or not, given all the neighbors who are active.
  - This is a binary classification task.

Evaluation Strategy (2/2)
- We ignore all the cases in which none of the user's friends is active, as the model is then inapplicable.
- We use ROC (Receiver Operating Characteristic) curves: True Positive Rate (TPR) vs False Positive Rate (FPR).
  - TPR = TP/P
  - FPR = FP/N

                      Reality
Prediction     Active    Inactive
Active         TP        FP
Inactive       FN        TN
Total          P         N

[Figure: ROC curve with the operating point and the ideal point marked.]
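The definitions above can be sketched as follows: a model assigns each test case a score (for instance the joint influence probability), a threshold turns scores into active/inactive predictions, and sweeping the threshold traces the ROC curve. The function names and the toy scores in the usage note are hypothetical.

```python
def roc_point(scores, labels, threshold):
    """Return (FPR, TPR) when every case with score >= threshold is predicted active.

    labels: 1 if the user really became active, 0 otherwise.
    """
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    positives = sum(labels)              # P = TP + FN
    negatives = len(labels) - positives  # N = FP + TN
    tpr = tp / positives if positives else 0.0
    fpr = fp / negatives if negatives else 0.0
    return fpr, tpr


def roc_curve(scores, labels):
    """Sweep the threshold over all distinct scores, from strictest to loosest."""
    thresholds = sorted(set(scores), reverse=True)
    return [(0.0, 0.0)] + [roc_point(scores, labels, t) for t in thresholds]
```

With scores [0.9, 0.8, 0.4, 0.2] and labels [1, 0, 1, 0], a threshold of 0.5 gives TP = 1 and FP = 1 out of P = 2 and N = 2, i.e. the point (FPR, TPR) = (0.5, 0.5). The ideal point is (0, 1).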

Algorithms
- Special emphasis on the efficiency of applying/testing the models (incremental property).
- In practice, action logs tend to be huge, so we optimize our algorithms to minimize the number of scans over the action log.
  - Training: 2 scans to learn all models simultaneously.
  - Testing: 1 scan to test one model at a time.

Experimental Evaluation

Dataset
- Yahoo! Flickr dataset.
- “Joining a group” is considered as the action.
  - Example: user “James” joined “Whistler Mountains” at time 5.
- #users ~ 1.3 million
- #edges ~ 40.4 million
- Degree:
- #groups/actions ~ 300K
- #tuples in action log ~ 35.8 million

Comparison of Static, CT and DT models
- Time-conscious models are better than static models.
- CT and DT models perform equally well.

Runtime
- Static and DT models are far more efficient than CT models because of their incremental nature.
[Figure: runtime of testing.]

Predicting Time - Distribution of Error
- The operating point is chosen corresponding to TPR: 82.5%, FPR: 17.5%.
- X-axis: error in predicting time (in weeks).
- Y-axis: frequency of that error.
- Most of the time, the error in the prediction is very small.

Predicting Time - Coverage vs Error
- The operating point is chosen corresponding to TPR: 82.5%, FPR: 17.5%.
- A point (x, y) here means that for y% of the cases, the error is within x weeks.
- In particular, for 95% of the cases, the error is within 20 weeks.
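A coverage-versus-error curve of this kind can be computed directly from the per-case prediction errors. The sketch below is only an illustration with a hypothetical function name and made-up error values; it does not use the Flickr results.

```python
def coverage_within(errors, tolerance):
    """Percentage of cases whose absolute prediction error is at most `tolerance`.

    errors and tolerance must be in the same unit (weeks on the slide).
    """
    if not errors:
        raise ValueError("need at least one prediction error")
    covered = sum(1 for e in errors if abs(e) <= tolerance)
    return 100.0 * covered / len(errors)
```

For the made-up errors [1, 3, 10, 25] weeks, the coverage at a tolerance of 20 weeks is 75%. Evaluating the function over a range of tolerances yields the (x, y) points of the curve.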

User Influenceability
- Some users are more prone to influence propagation than others.
- Learned from the training data.
- Users with high influenceability => easier prediction of influence => more prone to viral marketing campaigns.

Action Influenceability
- Some actions are more prone to influence propagation than others.
- Actions with high influenceability => easier prediction of influence => more suitable for viral marketing campaigns.

Related Work
- Independently, Saito et al (KES 2008) have studied the same problem.
  - Focus on the Independent Cascade Model of propagation.
  - Apply the Expectation Maximization (EM) algorithm.
  - Not scalable to huge datasets like the one we deal with in this work.

Other Applications of Influence Propagation
- Personalized Recommender Systems: Song et al 2006, 2007.
- Feed Ranking: Samper et al 2006.
- Trust Propagation: Guha et al 2004, Ziegler et al 2005, Golbeck et al 2006, Taherian et al.

Conclusions (1/2)
- Previous works typically assume influence probabilities are given as input.
- We studied the problem of learning such probabilities from a log of past propagations.
- We proposed both static and time-conscious models of influence.
- We also proposed efficient algorithms to learn and apply the models.

Conclusions (2/2)
- Using CT models, it is even possible to predict the time at which a user will perform an action, with good accuracy.
- We introduced metrics of user and action influenceability.
  - High values => easier prediction of influence.
  - Can be utilized in Viral Marketing decisions.

Future Work
- Learning optimal user activation thresholds.
- Considering user and action influenceability in the theory of Viral Marketing.
- Role of time in Viral Marketing.

Thanks!!

Predicting Time
- CT models can predict the time interval [b, e] in which she is most likely to perform the action.
- Tightness of the lower bound b is not critical in Viral Marketing applications.
- Experiments are on the upper bound e.
[Figure: joint influence probability of u getting active plotted against time, with the threshold Θ_u marked and the half-life period annotated.]

Predicting Time - RMSE vs Accuracy
- CT models can predict the time interval [b, e] in which a user is most likely to perform the action.
  - Experiments only on the upper bound e.
- Accuracy =
- RMSE = root mean square error in days
- RMSE ~ days

Static Models - Jaccard Index
- The Jaccard Index is often used to measure similarity between sample sets.
- We adapt it to estimate p_{v,u}.
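The adapted formula is not preserved in the transcript. One plausible adaptation, assumed here, divides the number of actions that propagated from v to u by the number of distinct actions performed by either v or u, mirroring the intersection-over-union shape of the Jaccard Index. The function and argument names are hypothetical.

```python
def jaccard_probability(a_v2u, actions_v, actions_u):
    """Jaccard-style estimate of p_{v,u} (assumed form).

    a_v2u:     number of actions that propagated from v to u
    actions_v: set of actions performed by v
    actions_u: set of actions performed by u
    """
    union_size = len(set(actions_v) | set(actions_u))
    if union_size == 0:
        return 0.0
    return a_v2u / union_size
```

For instance, if v performed {a1, a3}, u performed {a1, a2} and only a1 propagated from v to u, the estimate is 1/3.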

Partial Credits (PC)
- Suppose that, for an action, D is influenced by 3 of its neighbors.
- Then 1/3 credit is given to each one of these neighbors.
- Variants: PC Bernoulli, PC Jaccard.
[Figure: neighbors A, B and C each give 1/3 credit to D.]
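The credit-splitting rule can be sketched as below: for each action the target user performed, the unit of credit is divided equally among the neighbors who could have influenced it. The helper names are hypothetical, and the PC Bernoulli form (total credit divided by the number of actions v performed) is an assumption by analogy with the plain Bernoulli ratio.

```python
from collections import defaultdict


def partial_credits(influencers_per_action):
    """Sum, per neighbor, the equal shares of credit it receives.

    influencers_per_action: {action: [neighbors that influenced the target user]}
    """
    credit = defaultdict(float)
    for influencers in influencers_per_action.values():
        for v in influencers:
            credit[v] += 1.0 / len(influencers)
    return dict(credit)


def pc_bernoulli(total_credit_v, a_v):
    """Assumed PC Bernoulli estimate: credit earned by v over #actions v performed."""
    return total_credit_v / a_v if a_v else 0.0
```

With the slide's example, `partial_credits({"a": ["A", "B", "C"]})` assigns 1/3 to each of A, B and C, instead of a full unit of credit to each as the non-PC variants would.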

Learning the Models
- Parameters to learn:
  - #actions performed by each user: A_u
  - #actions propagated via each edge: A_v2u
  - Mean life time
[Figure: worked example on the action log from the Input Data slide, showing the per-user counts A_u and, for each ordered pair of users among P, Q and R, the values learned for that edge.]
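A minimal sketch of how these parameters could be collected in one pass over an action log sorted by action and time. It assumes that an action propagates from v to u when both performed it, v acted strictly earlier, and the v-u tie already existed when v acted; the slides do not spell out this rule, so treat it as an assumption. The mean life time of an edge is taken as the average delay of its propagations. All names are hypothetical.

```python
from collections import defaultdict


def learn_counts(action_log, tie_time):
    """Return (A_u, A_v2u, tau) from (user, action, time) tuples.

    tie_time maps frozenset({v, u}) to the time the social tie was created.
    """
    a_u = defaultdict(int)        # actions performed by each user
    a_v2u = defaultdict(int)      # actions propagated over each directed edge
    delay_sum = defaultdict(int)  # total propagation delay per directed edge
    earlier = defaultdict(list)   # action -> [(user, time)] seen so far

    for user, action, t in sorted(action_log, key=lambda r: (r[1], r[2])):
        a_u[user] += 1
        for v, t_v in earlier[action]:
            tie = tie_time.get(frozenset((v, user)))
            if tie is not None and tie <= t_v < t:
                a_v2u[(v, user)] += 1
                delay_sum[(v, user)] += t - t_v
        earlier[action].append((user, t))

    tau = {edge: delay_sum[edge] / count for edge, count in a_v2u.items()}
    return dict(a_u), dict(a_v2u), tau
```

On the slide's action log, and assuming P, Q and R were all friends before any action took place, this yields one propagation P -> Q with delay 5 and one propagation R -> P with delay 8.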

Propagation Models
- Threshold Models
  - Linear Threshold Model
  - General Threshold Model
- Cascade Models
  - Independent Cascade Model
  - Decreasing Cascade Model

Properties of Diffusion Models
- Monotonicity
- Submodularity: law of diminishing marginal gain
- Incrementality (optional): the joint probability can be updated incrementally.

Comparison of 4 variants
- Bernoulli is slightly better than Jaccard.
- Among the two Bernoulli variants, Partial Credits (PC) wins by a small margin.
[Figures: ROC comparison of the 4 variants of Static Models; ROC comparison of the 4 variants of Discrete Time (DT) Models.]

Discrete Time Models
- Approximation of CT models.
- Incremental, hence efficient.
- 4 variants, corresponding to the 4 Static Models.
[Figures: influence probability of v on u plotted against time, for the CT model and for the DT model.]
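A step-function approximation of this kind can be sketched as follows: the influence of v on u stays at a constant level for a fixed window after v acts, then drops to zero. The window length is taken to be the mean life time of the edge, which is an assumption; the function and parameter names are hypothetical.

```python
def dt_influence(p0, t_v, window, t):
    """Discrete-time influence of v on u at time t.

    p0:     constant influence probability while the window is open
    t_v:    time at which v performed the action
    window: how long v's influence lasts (assumed: mean life time of the edge)
    """
    return p0 if 0 <= t - t_v <= window else 0.0
```

Because each neighbor contributes either p0 or nothing, the joint probability only changes when a neighbor activates or its window closes, which is what makes incremental updates possible.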

Overview
- Context and Motivation
- Background
- Our Framework
- Algorithms
- Experiments
- Related Work
- Conclusions

Continuous Time Models
- Joint influence probability: combines the individual probabilities of the active neighbors.
- Individual probabilities: exponential decay over time, parameterized by
  - the maximum influence probability of v on u, and
  - the mean life time.
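The equations on this slide were lost in the transcript. A common way to write an exponential-decay model with these two parameters is p_{v,u}(t) = p0_{v,u} * exp(-(t - t_v) / tau_{v,u}) for t >= t_v, combined across neighbors under the independence assumption. The sketch below assumes this form; the symbols p0, t_v and tau and the function names are not taken from the slides.

```python
from math import exp, prod


def ct_influence(p0, t_v, tau, t):
    """Influence of v on u at time t: maximal (p0) when v acts at t_v,
    then decaying exponentially with mean life time tau."""
    if t < t_v:
        return 0.0
    return p0 * exp(-(t - t_v) / tau)


def ct_joint_influence(active_neighbors, t):
    """Joint influence on u at time t, assuming independent neighbors.

    active_neighbors: iterable of (p0, t_v, tau) triples, one per active neighbor.
    """
    return 1.0 - prod(1.0 - ct_influence(p0, t_v, tau, t)
                      for p0, t_v, tau in active_neighbors)
```

Since every neighbor's contribution changes continuously with t, the joint probability has to be recomputed from all active neighbors at each evaluation time, which is consistent with the slides' remark that CT models are not incremental.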

Algorithms
- Training: all models are learned simultaneously in no more than 2 scans of the training subset (80% of the total) of the action log table.
- Testing: one model requires only one scan of the testing subset (20% of the total) of the action log table.
- Due to lack of time, we omit the details of the algorithms.