COMP 621U Week 3 Social Influence and Information Diffusion


COMP 621U Week 3 Social Influence and Information Diffusion Nathan Liu (nliu@cse.ust.hk)

What are Social Influences? People make decisions sequentially, and the actions of earlier people affect those of later people. Two classes of rational reasons for influence: Direct benefit: a phone becomes more useful if more people use it. Informational: e.g., choosing restaurants. Influence is the result of rational inference from limited information.

Herding: Simple Experiment. Consider an urn with 3 balls. It is either majority-blue (2 blue, 1 red) or majority-red (2 red, 1 blue), with both cases equally likely a priori. Each person wants to guess whether the urn is majority-blue or majority-red. Experiment: one by one, each person draws a ball, privately looks at its color and puts it back, then publicly announces a guess. Everyone sees all earlier guesses before making their own. How should you guess?

Herding: What happens? 1st person: guesses the color drawn. 2nd person: guesses the color drawn. 3rd person: if the two earlier guesses differ, goes with the color drawn; otherwise goes with the earlier guesses, regardless of the color drawn. This can be modeled with Bayes' rule (the first two guesses act as evidence that shifts the prior): P(R|rrb) = P(rrb|R)P(R)/P(rrb) = 2/3, so after two "red" guesses a blue draw still leaves majority-red more likely. Non-optimal outcome: with probability 1/3 × 1/3 = 1/9, the first two people both draw the minority color, and from then on the whole population guesses wrong.
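
The posterior quoted above can be checked with a few lines of exact arithmetic. This is a minimal sketch, assuming a uniform prior over the two urn types and treating each announced guess as if it revealed the draw; the function name `posterior_majority_red` is illustrative and not from the slides.

```python
from fractions import Fraction

def posterior_majority_red(signals, prior_red=Fraction(1, 2)):
    """Posterior P(majority-red | signals); signals is a string of 'r'/'b' draws."""
    p_red_if_R = Fraction(2, 3)  # chance of drawing red from a majority-red urn
    p_red_if_B = Fraction(1, 3)  # chance of drawing red from a majority-blue urn
    like_R = Fraction(1)
    like_B = Fraction(1)
    for s in signals:
        like_R *= p_red_if_R if s == "r" else 1 - p_red_if_R
        like_B *= p_red_if_B if s == "r" else 1 - p_red_if_B
    numerator = like_R * prior_red
    return numerator / (numerator + like_B * (1 - prior_red))

# Two inferred red draws plus one private blue draw still favor majority-red.
print(posterior_majority_red("rrb"))
# Probability that the first two people both draw the minority color.
print(Fraction(1, 3) ** 2)
```

Using `Fraction` keeps the results exact, so they can be compared directly with the 2/3 and 1/9 on the slide.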

Examples: Information Diffusion

Example: Viral Propagation

Example: Viral Marketing Recommendation referral program: Senders and followers of recommendations receive discounts on products

Early Empirical Studies of Diffusion and Influence. Sociological studies of the diffusion of innovation: Spread of new agricultural practices [Ryan-Gross 1943]: studied the adoption of a new hybrid corn among 259 farmers in Iowa and found that the interpersonal network plays an important role. Spread of new medical practices [Coleman et al 1966]: studied the adoption of a new drug among doctors in Illinois; clinical studies and scientific evaluation were not sufficient to convince doctors, and it was the social power of peers that led to adoption. The contagion of obesity [Christakis et al. 2007]: if you have an overweight friend, your chance of becoming obese increases by 57%!

Applications of Social Influence Models. Forward predictions: viral marketing, influence maximization. Backward predictions: effector/initiator finding, sensor placement, cascade detection. [Figure: diagram relating learning from observed data, forward and backward predictions, and forward and backward network engineering.]

Dynamics of Viral Marketing (Leskovec 07). Senders and followers of recommendations receive discounts on products: the sender gets a 10% credit and the follower gets 10% off. Recommendations are made to any number of people at the time of purchase. Only the recipient who buys first gets a discount.

Statistics by Product Group
Group   Products   Customers   Recommendations   Edges       Buy + get discount   Buy + no discount
Book    103,161    2,863,977   5,741,611         2,097,809   65,344               17,769
DVD     19,829     805,285     8,180,393         962,341     17,232               58,189
Music   393,598    794,148     1,443,847         585,738     7,837                2,739
Video   26,131     239,583     280,270           160,683     909                  467
Full    542,719    3,943,084   15,646,121        3,153,676   91,322               79,164

Does receiving more recommendations increase the likelihood of buying? [Figure: two panels, DVDs and books.]

Does sending more recommendations influence more purchases? [Figure: two panels, DVDs and books.]

The probability that the sender gets a credit with increasing numbers of recommendations: consider whether the sender has at least one successful recommendation. This controls for the sender getting credit for a purchase that resulted from others recommending the same product to the same person. The probability of receiving a credit levels off for DVDs.

Multiple recommendations between two individuals weaken the impact of the bond on purchases. [Figure: two panels, DVDs and books.]

Processes and Dynamics. Influence (diffusion, cascade): each node gets to make decisions based on which and how many of its neighbors adopted a new idea or innovation. Rational decision-making process. Known mechanics. Infection (contagion, propagation): occurs randomly as a result of social contact. No decision making involved. Unknown mechanics.

Mathematical Models. Models of influence [Easley10a]: Independent Cascade Model, Threshold Model. Questions: Who are the most influential nodes? How can a cascade be detected? Models of infection [Easley10b]: SIS: Susceptible-Infective-Susceptible (e.g., flu); SIR: Susceptible-Infective-Recovered (e.g., chickenpox). Question: Will the virus take over the network?

Common Properties of Influence Modeling. A social network is represented as a directed graph, with each actor being one node. Each node starts as either active or inactive. A node, once activated, can activate its neighboring nodes. Once a node is activated, it cannot be deactivated.

Diffusion Curves. Basis for models: the probability of adopting a new behavior depends on the number of friends who have already adopted. What is the dependence? Different shapes have consequences for models of diffusion.

Real World Diffusion Curves DVD recommendation and LiveJournal community membership

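Linear Threshold Model. An actor takes an action if the number (or total weight) of his friends who have taken the action reaches a certain threshold. Each node v chooses a threshold θ_v uniformly at random from the interval [0, 1]. In each discrete step, all nodes that were active in the previous step remain active, and an inactive node v is activated if it satisfies the following condition (stated here in the standard form of [Kempe03], where $b_{w,v}$ is the weight of w's influence on v and the weights into v sum to at most 1): $\sum_{w \text{ active neighbor of } v} b_{w,v} \ge \theta_v$.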
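
The threshold process can be sketched in a few lines. This is an illustrative sketch rather than the lecture's own code: the data layout `in_weights[v] = {w: b_wv}` and the function name `linear_threshold` are assumptions, and the activation test is the weighted-sum condition from [Kempe03].

```python
import random

def linear_threshold(in_weights, seeds, thresholds=None, rng=None):
    """Run the linear threshold process until no new node activates.

    in_weights[v] maps each in-neighbor w of v to its weight b_wv
    (the weights into v are assumed to sum to at most 1).
    """
    rng = rng or random.Random()
    nodes = set(in_weights)
    for nbrs in in_weights.values():
        nodes.update(nbrs)
    if thresholds is None:
        # Each node draws its threshold uniformly at random from [0, 1).
        thresholds = {v: rng.random() for v in nodes}
    active = set(seeds)
    while True:
        newly = set()
        for v in nodes - active:
            weight = sum(b for w, b in in_weights.get(v, {}).items() if w in active)
            if weight >= thresholds[v]:
                newly.add(v)
        if not newly:
            return active
        active |= newly  # active nodes never deactivate

# Hypothetical toy network: a and c each influence b; b influences d.
weights = {"b": {"a": 0.5, "c": 0.5}, "d": {"b": 1.0}}
fixed = {"a": 0.5, "b": 0.6, "c": 0.5, "d": 0.9}
print(sorted(linear_threshold(weights, {"a"}, fixed)))
print(sorted(linear_threshold(weights, {"a", "c"}, fixed)))
```

Passing explicit thresholds makes a run reproducible; omitting them draws fresh random thresholds as the model prescribes.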

Linear Threshold Diffusion Process

Independent Cascade Model. The independent cascade model focuses on the sender's rather than the receiver's view. A node w, once activated at step t, has one chance to activate each of its inactive neighbors. For a neighboring node (say, v), the activation succeeds with probability p_{w,v} (e.g., p = 0.5). If the activation succeeds, then v becomes active at step t + 1. In subsequent rounds, w does not attempt to activate v again. The diffusion process starts with an initial set of activated nodes and continues until no further activation is possible.
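
A minimal simulation of this process might look as follows. The layout `graph[w] = {v: p_wv}` and the name `independent_cascade` are assumptions for illustration. The sketch processes activated nodes from a queue instead of in synchronized steps; because every edge is still tried at most once, the final active set has the same distribution.

```python
import random
from collections import deque

def independent_cascade(graph, seeds, rng=None):
    """Return the set of nodes active when the cascade stops.

    graph[w] maps each out-neighbor v of w to the activation probability p_wv.
    """
    rng = rng or random.Random()
    active = set(seeds)
    frontier = deque(seeds)  # nodes that have not yet made their attempts
    while frontier:
        w = frontier.popleft()
        for v, p in graph.get(w, {}).items():
            # w gets exactly one chance at each neighbor that is still inactive.
            if v not in active and rng.random() < p:
                active.add(v)
                frontier.append(v)
    return active

# Hypothetical example with deterministic edges (probability 1 or 0).
toy = {"a": {"b": 1.0}, "b": {"c": 1.0, "d": 0.0}}
print(sorted(independent_cascade(toy, ["a"])))
```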

Independent Cascade Model Diffusion Process

How should we organize a revolt? You live in an oppressive society. You know of a demonstration against the government planned for tomorrow. If a lot of people show up, the government will fall. If only a few people show up, the demonstrators will be arrested, and it would have been better had everyone stayed at home.

Pluralistic Ignorance. You should do something if you believe you are in the majority! Dictator tip: pluralistic ignorance, i.e., erroneous estimates about the prevalence of certain opinions in the population. A survey conducted in the U.S. in 1970 showed that while a clear minority of white Americans at that point favored racial segregation, significantly more than 50% believed it was favored by a majority of white Americans in their region of the country.

Organizing the Revolt: The Model. Personal threshold k: “I will show up if I am sure at least k people in total (including myself) will show up.” Each node knows only the thresholds and attitudes of its direct friends. Can we predict whether a revolt can happen based on the network structure?

Which Network Can Have a Revolt?

Influence Maximization (Kempe03). If S is the initial active set, let σ(S) denote the expected size of the final active set. Most influential set of size k: the set S of k nodes producing the largest expected cascade size σ(S) if activated. This is a discrete optimization problem. It is NP-hard, and highly inapproximable for general (non-submodular) activation functions.

An Approximation Result. Diminishing returns. Hill-climbing: repeatedly select the node with maximum marginal gain. Analysis: diminishing returns at individual nodes imply that the cascade size σ(S) grows slower and slower with S (i.e., σ is submodular). Theorem: if f is a non-negative, monotone submodular function, the k-step hill climbing produces a set S for which f(S) is at least (1 − 1/e) times the optimum over sets of size k. σ(S) is submodular for both the threshold and the independent cascade model.
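
The hill-climbing procedure can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: σ(S) is estimated by averaging independent cascade simulations (so the guarantee holds only up to estimation error), and the names `simulate_ic`, `estimate_spread`, `greedy_seeds` and the layout `graph[w] = {v: p_wv}` are hypothetical.

```python
import random

def simulate_ic(graph, seeds, rng):
    """One independent cascade run; returns the final number of active nodes."""
    active, frontier = set(seeds), list(seeds)
    while frontier:
        w = frontier.pop()
        for v, p in graph.get(w, {}).items():
            if v not in active and rng.random() < p:
                active.add(v)
                frontier.append(v)
    return len(active)

def estimate_spread(graph, seeds, runs, rng):
    """Monte Carlo estimate of the expected cascade size sigma(seeds)."""
    return sum(simulate_ic(graph, seeds, rng) for _ in range(runs)) / runs

def greedy_seeds(graph, k, runs=200, seed=0):
    """Pick k seeds, each time adding the node with the largest estimated gain."""
    rng = random.Random(seed)
    nodes = set(graph)
    for nbrs in graph.values():
        nodes.update(nbrs)
    chosen = []
    for _ in range(k):
        base = estimate_spread(graph, chosen, runs, rng)
        best, best_gain = None, float("-inf")
        for v in sorted(nodes - set(chosen)):  # sorted only for reproducibility
            gain = estimate_spread(graph, chosen + [v], runs, rng) - base
            if gain > best_gain:
                best, best_gain = v, gain
        if best is None:  # fewer than k nodes available
            break
        chosen.append(best)
    return chosen

# Hypothetical toy graph with deterministic edges.
toy = {"a": {"b": 1.0, "c": 1.0}, "b": {"c": 1.0}, "d": {"e": 1.0}}
print(greedy_seeds(toy, 2, runs=5))
```

Each greedy step costs one spread estimate per candidate node, which is why practical implementations usually add lazy evaluation or sketching; that refinement is left out here.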

Submodularity for Independent Cascade. Coins for edges are flipped during activation attempts. We can pre-flip all coins and reveal the results immediately. [Figure: example graph with edge activation probabilities.] From [Kempe03]: “Our proof deals with these difficulties by formulating an equivalent view of the process, which makes it easier to see that there is an order-independent outcome, and which provides an alternate way to reason about the submodularity property. From the point of view of the process, it clearly does not matter whether the coin was flipped at the moment that v became active, or whether it was flipped at the very beginning of the whole process and is only being revealed now. With all the coins flipped in advance, the process can be viewed as follows. The edges in G for which the coin flip indicated an activation will be successful are declared to be live; the remaining edges are declared to be blocked. If we fix the outcomes of the coin flips and then initially activate a set A, it is clear how to determine the full set of active nodes at the end of the cascade process: CLAIM 2.3. A node x ends up active if and only if there is a path from some node in A to x consisting entirely of live edges. (We will call such a path a live-edge path.)” The nodes active at the end are those reachable via green (live) paths from the initially targeted nodes, so it suffices to study reachability in green graphs.
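
The live-edge view translates directly into code: flip every coin first, then compute plain graph reachability. This is a small sketch; the helper names `sample_live_edges` and `reachable` and the layout `graph[w] = {v: p_wv}` are assumptions for illustration.

```python
import random

def sample_live_edges(graph, rng):
    """Flip every edge coin up front and keep only the live edges."""
    return {w: [v for v, p in nbrs.items() if rng.random() < p]
            for w, nbrs in graph.items()}

def reachable(live, seeds):
    """Nodes reachable from the seed set along live edges (depth-first search)."""
    seen, stack = set(seeds), list(seeds)
    while stack:
        w = stack.pop()
        for v in live.get(w, []):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

# Hypothetical graph; edges with probability 1 are always live, 0 never.
toy = {"a": {"b": 1.0, "c": 0.0}, "b": {"d": 1.0}}
live = sample_live_edges(toy, random.Random(0))
print(live)
print(sorted(reachable(live, ["a"])))
```

Averaging `len(reachable(live, S))` over many sampled live-edge graphs gives an estimate of σ(S), which is the quantity the submodularity argument is about.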

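Submodularity, Fixed Graph. Fix a “green graph” G; g(S) is the set of nodes reachable from S in G. Submodularity: $g(T \cup \{v\}) \setminus g(T) \subseteq g(S \cup \{v\}) \setminus g(S)$ when $S \subseteq T$, and hence $|g(T \cup \{v\})| - |g(T)| \le |g(S \cup \{v\})| - |g(S)|$. Here $g(S \cup \{v\}) \setminus g(S)$ is exactly the set of nodes reachable from v but not from S. Because every node reachable from S is also reachable from T, the nodes that v adds to T are a subset of the nodes that v adds to S.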

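Submodularity of the Function. Fact: a non-negative linear combination of submodular functions is submodular. The expected cascade size can be written as $\sigma(S) = \sum_{G} \Pr[G \text{ is the green graph}] \cdot |g_G(S)|$, where $g_G(S)$ is the set of nodes reachable from S in G. Each $|g_G(S)|$ is submodular (previous slide) and the probabilities are non-negative, so σ(S) is submodular.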

Models of Infection (Virus Propagation). How do viruses or rumors propagate? Will a flu-like virus linger, or will it die out soon? (Virus) birth rate β: probability that an infected neighbor attacks. (Virus) death rate δ: probability that an infected node recovers.

General Schemes

Susceptible-Infected-Recovered (SIR) Model. Process: Initially, some nodes are in the I state and all others are in the S state. Each node v in the I state remains infectious for a fixed number of steps t. During each of the t steps, node v can infect each of its susceptible neighbors with probability p. After t steps, v is no longer infectious or susceptible to further infection and enters state R. SIR is suitable for modeling a disease that each individual can catch only once during their lifetime.
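
The SIR process above can be simulated step by step. This is a minimal sketch, assuming an adjacency-list layout `neighbors[v] = [u, ...]`; the function name `simulate_sir` is illustrative.

```python
import random

def simulate_sir(neighbors, initially_infected, p, t, rng=None):
    """Run SIR until nobody is infectious; return the set of recovered nodes."""
    rng = rng or random.Random()
    remaining = {v: t for v in initially_infected}  # infectious steps left per node
    recovered = set()
    while remaining:
        newly = set()
        for v in remaining:
            for u in neighbors.get(v, ()):
                susceptible = u not in remaining and u not in recovered and u not in newly
                if susceptible and rng.random() < p:
                    newly.add(u)
        # Age every infectious node by one step; move expired ones to state R.
        for v in list(remaining):
            remaining[v] -= 1
            if remaining[v] == 0:
                del remaining[v]
                recovered.add(v)
        # Nodes infected in this step become infectious from the next step on.
        for u in newly:
            remaining[u] = t
    return recovered

# Hypothetical path a - b - c, certain transmission, one infectious step.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(sorted(simulate_sir(path, ["a"], p=1.0, t=1)))
```

The returned set contains every node that was ever infected, since each infected node ends in state R.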

Example SIR epidemic, t=1

Susceptible-Infected-Susceptible (SIS) Model. Cured nodes immediately become susceptible again. Virus “strength”: s = β/δ.
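
A discrete-time SIS simulation could look like the following. This is a sketch under assumptions: in each step every infected node attacks each susceptible neighbor with probability β and independently recovers with probability δ. The name `simulate_sis` and the layout `neighbors[v] = [u, ...]` are illustrative.

```python
import random

def simulate_sis(neighbors, initially_infected, beta, delta, steps, rng=None):
    """Return the number of infected nodes at step 0, 1, ..., steps."""
    rng = rng or random.Random()
    infected = set(initially_infected)
    history = [len(infected)]
    for _ in range(steps):
        # Attacks: each infected node tries each susceptible neighbor once.
        newly = {u for v in infected for u in neighbors.get(v, ())
                 if u not in infected and rng.random() < beta}
        # Recoveries: cured nodes go straight back to the susceptible state.
        cured = {v for v in infected if rng.random() < delta}
        infected = (infected - cured) | newly
        history.append(len(infected))
    return history

# Hypothetical path a - b - c.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(simulate_sis(path, ["a"], beta=1.0, delta=0.0, steps=3))
print(simulate_sis(path, ["a"], beta=0.0, delta=1.0, steps=3))
```

Running it over a range of β/δ values on a fixed graph is one way to observe whether the infection dies out or lingers.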

Example SIS Epidemic

Connection between SIS and SIR. An SIS model with t=1 can be represented as an SIR model by creating a separate copy of each node for each time step.

Question: Epidemic Threshold. The epidemic threshold of a graph is a value τ such that if the strength s = β/δ < τ, then an epidemic cannot happen. What should τ depend on? Average degree? And/or highest degree? And/or variance of degree? And/or diameter?

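Epidemic threshold in SIS model. We have no epidemic if $\beta / \delta < \tau = 1 / \lambda_{1,A}$, where β is the birth rate, δ is the death rate, τ is the epidemic threshold, and $\lambda_{1,A}$ is the largest eigenvalue of the adjacency matrix A.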
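
The threshold can be computed for a small graph without external libraries. This is a sketch assuming an undirected graph given as a symmetric 0/1 adjacency matrix; the largest eigenvalue is found by power iteration on A + I (the shift avoids oscillation on bipartite graphs), and the function names are illustrative.

```python
def largest_eigenvalue(adj, iterations=1000):
    """Largest eigenvalue of a symmetric 0/1 adjacency matrix via power iteration."""
    n = len(adj)
    x = [1.0] * n
    lam = 0.0
    for _ in range(iterations):
        # Multiply by (A + I); its top eigenvalue is lambda_1 + 1.
        y = [x[i] + sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = max(abs(c) for c in y)
        x = [c / norm for c in y]
        lam = norm - 1.0
    return lam

def below_epidemic_threshold(adj, beta, delta):
    """True if beta / delta < 1 / lambda_1, i.e. no epidemic is expected."""
    return beta / delta < 1.0 / largest_eigenvalue(adj)

# Star graph with one hub (node 0) and four leaves: lambda_1 = 2, so tau = 0.5.
star = [
    [0, 1, 1, 1, 1],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
]
print(largest_eigenvalue(star))
print(below_epidemic_threshold(star, beta=0.1, delta=0.5))
```

For large graphs a sparse eigenvalue solver would be the usual choice; the dense loop here is only meant to make the formula concrete.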

Simulation Studies:

Experiments: Does it matter how many people are initially infected?

References:
[Kempe03] D. Kempe, J. Kleinberg, E. Tardos. Maximizing the Spread of Influence Through a Social Network. KDD'03.
[Leskovec06] J. Leskovec, L. Adamic, B. Huberman. The Dynamics of Viral Marketing. EC'06.
[Easley10a] D. Easley, J. Kleinberg. Networks, Crowds and Markets, Ch. 19.
[Easley10b] D. Easley, J. Kleinberg. Networks, Crowds and Markets, Ch. 20.