Stability of Influence Maximization

Xinran He and David Kempe University of Southern California {xinranhe, 08/26/2014

Diffusion In Social Networks
The adoption of new products can propagate through a social network: some friends adopted and some did not.
[Figure: diffusion in a social network; edges labeled with influence probabilities 0.8, 0.2, 0.5, 0.2, 0.6, 0.3, 0.7.]
He & Kempe (USC) Influence Stability KDD 2014

IC Model & Influence Maximization
Independent Cascade (IC) Model: Each newly activated node $u$ has a single chance to activate each inactive neighbor $v$, succeeding with probability $p_{u,v}$.
Influence Maximization: Find $k$ people who generate the largest influence spread (i.e., the expected number of activated nodes) [KKT 2003].
Where do the parameters $p_{u,v}$ come from?
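The IC process and the seed-selection problem described on this slide can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: the graph representation (a dict mapping each node to its out-neighbors and probabilities), the function names, and the number of Monte Carlo runs are all assumptions made here. The seed selection uses the greedy hill-climbing heuristic commonly associated with [KKT 2003], with the spread estimated by simulation.

```python
import random


def simulate_ic(graph, seeds, rng):
    """Run one Independent Cascade and return the set of activated nodes.

    graph maps each node u to a dict {v: p_uv} of out-neighbors (assumed layout).
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p in graph.get(u, {}).items():
                # u gets a single chance to activate each inactive neighbor v.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return active


def estimate_spread(graph, seeds, runs=1000, seed=0):
    """Monte Carlo estimate of the expected number of activated nodes."""
    rng = random.Random(seed)
    return sum(len(simulate_ic(graph, seeds, rng)) for _ in range(runs)) / runs


def greedy_seeds(graph, nodes, k, runs=1000):
    """Pick k seeds, each time adding the node with the largest estimated spread."""
    chosen = []
    for _ in range(k):
        best = max(
            (n for n in nodes if n not in chosen),
            key=lambda n: estimate_spread(graph, chosen + [n], runs),
        )
        chosen.append(best)
    return chosen
```

Because every estimate depends on the probabilities $p_{u,v}$, any error in those probabilities feeds directly into which seeds are selected, which is the concern the slide's closing question raises.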

Uncertainty in Influence Strength
[Figure: pipeline from diffusion history and questionnaires through network inference to influence maximization. Inferred network with edge probabilities 0.8, 0.5, 0.3, 0.6, 0.1, 0.4; ground truth network with edge probabilities 0.5, 0.6, 0.1, 0.3, 0.4, 0.9.]
Does such instability really exist?

An Extreme Example
Select one seed.
[Figure: example networks with edge probabilities 0.0625, 0.0625, 0.055, 0.07.]

An Extreme Example (Cont.)
[Figure: two cases, labeled $p$ = 0.3, 0.25 and $p$ = 0.3, 0.35.]

Diagnosing Instability
Given an instance of Influence Maximization, can we efficiently diagnose whether it is stable or unstable?
How about this network? [Figure: network with edge probabilities 0.8, 0.5, 0.3, 0.6, 0.1, 0.4.]
Complete answer ⇒ Computing the percolation threshold of any graph.
Partial solution:
Unstable instances ⇒ The alarm fires correctly, so instability is diagnosed correctly.
Stable instances ⇒ False alarms are possible.

Definition of Stability
Model of misestimation:
[Figure: edge $(u, v)$ with probability $p_{u,v}$; on an axis up to 1, the interval $I_{u,v}$ contains $p_{u,v}$ and a perturbed value $p'_{u,v}$.]
Definition (Stability of Influence Maximization): An instance $(p_{u,v}, I_{u,v})$ is stable if the difference in influence is small for all legal $p'_{u,v} \in I_{u,v}$ and all seed sets of size $k$.

Influence Difference Maximization
Optimization Problem: $\max_{|S| = k} \max_{p'_{u,v} \in I_{u,v}} |\sigma(S) - \sigma'(S)|$
Definition (Influence Difference Maximization): Given two instances with probabilities $p_{u,v} \ge p'_{u,v}$ for all $u, v$, let $\sigma$ and $\sigma'$ be the respective influence functions. Find a set $S$ of size $k$ maximizing $\sigma(S) - \sigma'(S)$.
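To make the objective concrete, the following Python sketch estimates $\sigma(S) - \sigma'(S)$ by Monte Carlo simulation for two IC instances and then searches all seed sets of size $k$ by brute force. It is only meant for tiny graphs. The function names, the dict-of-dicts graph layout, and the exhaustive search are assumptions made for illustration and are not the method used in the talk.

```python
import itertools
import random


def spread(graph, seeds, runs, rng):
    """Monte Carlo estimate of the expected number of activated nodes under IC."""
    total = 0
    for _ in range(runs):
        active, frontier = set(seeds), list(seeds)
        while frontier:
            u = frontier.pop()
            for v, p in graph.get(u, {}).items():
                # One activation attempt per edge, only toward inactive nodes.
                if v not in active and rng.random() < p:
                    active.add(v)
                    frontier.append(v)
        total += len(active)
    return total / runs


def influence_difference(high, low, seeds, runs=2000, seed=0):
    """Estimate sigma(S) - sigma'(S), where `high` has the larger probabilities."""
    rng = random.Random(seed)
    return spread(high, seeds, runs, rng) - spread(low, seeds, runs, rng)


def max_influence_difference(high, low, nodes, k, runs=2000):
    """Brute-force search over all seed sets of size k (tiny graphs only)."""
    return max(
        itertools.combinations(nodes, k),
        key=lambda s: influence_difference(high, low, s, runs),
    )
```

As a small check of the intuition, take `high = {"a": {"b": 1.0}, "c": {"d": 1.0}}` and `low = {"a": {"b": 1.0}, "c": {"d": 0.0}}`. Seeding only `c` gives a difference of 1, while seeding both `c` and `d` gives a difference of 0, so adding a node can decrease the objective.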

Main Theory Result
Main Theorem: Under the IC model, $\sigma(S) - \sigma'(S)$ is a non-negative and submodular, but not monotone, function of the set $S$.
Random Greedy Algorithm [Buchbinder et al.]
Approximation guarantee: $0.266 \to 1/e$
Running time: $O(kM|V|^2)$
($k \ll |V|$; $M$: number of Monte Carlo simulations)
Corollary: Assuming $A$ is the seed set returned by maximizing $\sigma_{\text{obs}}(S)$ with the greedy algorithm, we have $\sigma_{\text{true}}(A) \ge c \cdot \sigma_{\text{true}}(A^*)$, where $c$ is a constant depending on the given instance.
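For readers unfamiliar with Random Greedy, here is a minimal Python sketch of its core loop: in each of $k$ rounds, rank the remaining elements by marginal gain, keep the top $k$, and add one of them uniformly at random. This is a simplified paraphrase under stated assumptions. It omits the zero-value dummy elements used in the original analysis, so it always returns exactly $k$ elements when enough are available, and the set function `f` is passed in as an arbitrary callable (in this setting it would be an estimate of $\sigma(S) - \sigma'(S)$). The names `random_greedy`, `f`, and `ground_set` are chosen here for illustration.

```python
import random


def random_greedy(f, ground_set, k, rng=None):
    """Simplified Random Greedy for maximizing a set function f under |S| = k.

    f takes a list of elements and returns a number.
    """
    rng = rng or random.Random(0)
    chosen = []
    for _ in range(k):
        remaining = [x for x in ground_set if x not in chosen]
        if not remaining:
            break
        base = f(chosen)
        # Rank remaining elements by marginal gain, largest first.
        gains = sorted(
            ((f(chosen + [x]) - base, x) for x in remaining),
            key=lambda pair: pair[0],
            reverse=True,
        )
        # Choose uniformly at random among the (at most) k best candidates.
        candidates = [x for _, x in gains[:k]]
        chosen.append(rng.choice(candidates))
    return chosen
```

The randomization is what distinguishes this from plain greedy: for a non-monotone objective, always taking the single best marginal gain can be led astray, whereas sampling among the top $k$ is what the cited guarantee relies on.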

Approximation Guarantee for InfMax
[Figure: axis up to 1 marking $p_{u,v}$, $p'_{u,v}$, $p^-_{u,v}$, $p^+_{u,v}$ and the interval $I_{u,v}$.]
$\theta = \{p_{u,v}\}$, $\sigma(S)$: the observed/inferred probabilities.
$\theta' = \{p'_{u,v}\}$, $\sigma'(S)$: the ground truth probabilities.
$\theta^+$, $\theta^-$, $\sigma^+(S)$, $\sigma^-(S)$: making each probability as large/small as possible.
Theorem: Assuming $A$ is the seed set returned by maximizing $\sigma(S)$ with the greedy algorithm, we have $\sigma'(A) \ge c \cdot \sigma'(A^*)$, where
$c = \frac{(1 - \alpha^-) \cdot (1 - 1/e)}{1 + \alpha^+ \cdot (1 - 1/e)}$
$\alpha^- = \frac{\delta_{\theta, \theta^-}(A)}{\sigma(A)}$ (by calculation)
$\alpha^+ = \frac{\max_S \delta_{\theta^+, \theta}(S)}{\sigma(A)}$ (by Inf Diff Max)
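One way to see where the constant $c$ comes from is the short derivation below. It is a reconstruction, not taken from the slides, and rests on several assumptions: $\sigma$ denotes influence under the observed probabilities and $\sigma'$ under the ground truth; $\delta_{\theta_1, \theta_2}(S)$ denotes the influence under $\theta_1$ minus the influence under $\theta_2$; $A^*$ is an optimal seed set under the ground truth; influence under IC is monotone in the edge probabilities, so $\sigma^- \le \sigma' \le \sigma^+$; and greedy is treated as an exact $(1 - 1/e)$-approximation for $\sigma$, ignoring sampling error.

```latex
\begin{align*}
\sigma'(A) &\ge \sigma^-(A)
  = \sigma(A) - \delta_{\theta,\theta^-}(A)
  = (1 - \alpha^-)\,\sigma(A), \\
\sigma'(A^*) &\le \sigma^+(A^*)
  = \sigma(A^*) + \delta_{\theta^+,\theta}(A^*)
  \le \frac{\sigma(A)}{1 - 1/e} + \alpha^+\,\sigma(A), \\
\frac{\sigma'(A)}{\sigma'(A^*)}
  &\ge \frac{(1 - \alpha^-)\,(1 - 1/e)}{1 + \alpha^+\,(1 - 1/e)} = c .
\end{align*}
```

The second line uses $\sigma(A^*) \le \max_{|S| = k} \sigma(S) \le \sigma(A) / (1 - 1/e)$ together with $\delta_{\theta^+,\theta}(A^*) \le \max_S \delta_{\theta^+,\theta}(S) = \alpha^+ \sigma(A)$, which is why $\alpha^+$ requires solving Influence Difference Maximization while $\alpha^-$ only requires evaluating the difference at the fixed set $A$.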

Experiments: Setting
Networks
Synthetic: 2D-grid, random regular graphs, small world, preferential attachment
Real: STOCFOCS: co-authorship network, |V| = 1768, |E| = 10024; HAITI: retweet network, |V| = 274, |E| = 383
Set $I_{u,v} = [(1 - \Delta)\, p_{u,v},\ (1 + \Delta)\, p_{u,v}]$
Vary $p_{u,v} \in \{0.01, 0.02, 0.05, 0.1\}$
Vary $\Delta \in \{1\%, 5\%, 10\%, 20\%, 50\%\}$
Metrics:
Inf Diff Max: $\max_S \delta(S)$
Approximation guarantee: $c$ such that $\sigma_{\text{true}}(A) \ge c \cdot \sigma_{\text{true}}(A^*)$
[Figure: axis up to 1 showing the interval of relative width $\Delta$ around $p_{u,v}$.]
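The interval construction and parameter grid from this slide can be written out as a short Python sketch. The function and constant names are hypothetical, the dict-of-dicts graph layout is an assumption, and the cap at 1.0 is an added safeguard that the slide does not mention (it never triggers for the listed values, whose largest upper endpoint is 0.15).

```python
import itertools

# Parameter values listed on the slide.
BASE_PROBABILITIES = [0.01, 0.02, 0.05, 0.1]
RELATIVE_NOISE = [0.01, 0.05, 0.10, 0.20, 0.50]


def perturbation_interval(p, delta):
    """Return the endpoints of I = [(1 - delta) * p, (1 + delta) * p]."""
    low = (1.0 - delta) * p
    high = min(1.0, (1.0 + delta) * p)  # cap at 1.0: safeguard added here
    return low, high


def extreme_instances(graph, delta):
    """Build theta_minus and theta_plus by pushing every edge to an endpoint.

    graph maps each node u to a dict {v: p_uv} (assumed layout).
    """
    theta_minus, theta_plus = {}, {}
    for u, neighbors in graph.items():
        theta_minus[u], theta_plus[u] = {}, {}
        for v, p in neighbors.items():
            theta_minus[u][v], theta_plus[u][v] = perturbation_interval(p, delta)
    return theta_minus, theta_plus


# Every (p, delta) combination in the experimental grid.
settings = list(itertools.product(BASE_PROBABILITIES, RELATIVE_NOISE))
```

With four base probabilities and five noise levels, the grid contains 20 settings per network.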

Experiments: PA network
[Figure: two panels, $\max_S \delta(S)$ and approximation guarantee $c$.]

Experiments: STOCFOCS
[Figure: two panels, $\max_S \delta(S)$ and approximation guarantee $c$.]

Conclusion
Noise is everywhere in social network data.
Influence Maximization could be unstable.
This calls into question the practicality of algorithmic approaches.
Instability can be diagnosed by solving Influence Difference Maximization, via non-monotone submodular maximization.
Experiments on synthetic networks (2D-grid, random regular, SW, PA) and real networks (retweet, collaboration):
10% relative noise ⇒ Decent approximation
20% relative noise ⇒ Significant challenge
Further extension: Linear Threshold Model, Triggering Model

Future work
Generalization to other diffusion models.
Generalized Threshold (GT) model
Generalization to other misestimation models.
Current assumption: each deviation is bounded.
What if the total (squared) deviation is bounded?
Big picture: How accurate are our diffusion models?

Questions?