# Influence propagation in large graphs - theorems and algorithms B. Aditya Prakash Christos Faloutsos


Influence propagation in large graphs: theorems and algorithms. B. Aditya Prakash (http://www.cs.cmu.edu/~badityap), Christos Faloutsos (http://www.cs.cmu.edu/~christos), Carnegie Mellon University

Thank you! Ying Ding, Jiawei Han, Jie Tang, Philip Yu. (KDD-MDS-2012; C. Faloutsos, CMU)

Networks are everywhere! Human Disease Network [Barabasi 2007]; Gene Regulatory Network [Decourty 2008]; Facebook Network [2010]; The Internet [2005].

Dynamical Processes over networks are also everywhere!

Why do we care? Information Diffusion; Viral Marketing; Epidemiology and Public Health; Cyber Security; Human mobility; Games and Virtual Worlds; Ecology; Social Collaboration ........

Why do we care? (1: Epidemiology) Dynamical processes over networks: diseases over contact networks. [AJPH 2007] CDC data: visualization of the first 35 tuberculosis (TB) patients and their 1039 contacts.

Why do we care? (1: Epidemiology) Dynamical processes over networks. [US-MEDICARE NETWORK 2005]: each circle is a hospital; ~3000 hospitals; more than 30,000 patients transferred. Problem: Given k units of disinfectant, whom to immunize?

Why do we care? (1: Epidemiology) CURRENT PRACTICE vs. OUR METHOD: ~6x fewer! [US-MEDICARE NETWORK 2005] Hospital-acquired infections took 99K+ lives and cost \$5B+ (all per year).

Why do we care? (2: Online Diffusion) > 800m users, ~\$1B revenue [WSJ 2010]; ~100m active users; > 50m users.

Why do we care? (2: Online Diffusion) Dynamical processes over networks. Social Media Marketing: Celebrity, Followers, "Buy Versace™!"

High Impact: Multiple Settings. Q. How to squash rumors faster? Q. How do opinions spread? Q. How to market better? Settings: epidemic outbreaks; products/viruses; transmit s/w patches.

Research Theme. DATA: large real-world networks & processes. ANALYSIS: understanding. POLICY/ACTION: managing.

In this talk. ANALYSIS (Understanding): Given propagation models, Q1: Will an epidemic happen?

In this talk. POLICY/ACTION (Managing): Q2: How to immunize and control outbreaks better?

Outline. Part 1: anomaly detection. Part 2: influence propagation: Motivation; Epidemics: what happens? (Theory); Action: Who to immunize? (Algorithms).

A fundamental question. Strong Virus: Epidemic?

Example (static graph). Weak Virus: Epidemic?

Problem Statement. Find a condition under which the virus will die out exponentially quickly, regardless of the initial infection condition. [Plot: # infected vs. time; above (epidemic), below (extinction).] Separate the regimes?

Threshold (static version). Problem Statement. Given: graph G, and virus specs (attack prob. etc.). Find: a condition for virus extinction/invasion.

Threshold: Why important? Accelerating simulations; forecasting (‘what-if’ scenarios); design of contagion and/or topology; a great handle to manipulate the spreading (immunization, maximizing collaboration, …..).

Outline. Motivation. Epidemics: what happens? (Theory): Background; Result (Static Graphs); Proof Ideas (Static Graphs); Bonus 1: Dynamic Graphs; Bonus 2: Competing Viruses. Action: Who to immunize? (Algorithms).

“SIR” model: lifelong immunity (mumps). Each node in the graph is in one of three states: Susceptible (i.e., healthy); Infected; Removed (i.e., can’t get infected again). Transitions: infection with prob. β, removal with prob. δ. [Figure: snapshots at t = 1, t = 2, t = 3.]

Terminology: continued. Other virus propagation models (“VPM”): SIS: susceptible-infected-susceptible, flu-like; SIRS: temporary immunity, like pertussis; SEIR: mumps-like, with virus incubation (E = Exposed); ….…………. Underlying contact network: ‘who-can-infect-whom’.

Related Work

- R. M. Anderson and R. M. May. Infectious Diseases of Humans. Oxford University Press, 1991.
- A. Barrat, M. Barthélemy, and A. Vespignani. Dynamical Processes on Complex Networks. Cambridge University Press, 2010.
- F. M. Bass. A new product growth for model consumer durables. Management Science, 15(5):215–227, 1969.
- D. Chakrabarti, Y. Wang, C. Wang, J. Leskovec, and C. Faloutsos. Epidemic thresholds in real networks. ACM TISSEC, 10(4), 2008.
- D. Easley and J. Kleinberg. Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge University Press, 2010.
- A. Ganesh, L. Massoulie, and D. Towsley. The effect of network topology on the spread of epidemics. IEEE INFOCOM, 2005.
- Y. Hayashi, M. Minoura, and J. Matsukubo. Recoverable prevalence in growing scale-free networks and the effective immunization. arXiv:cond-mat/0305549 v2, Aug. 6 2003.
- H. W. Hethcote. The mathematics of infectious diseases. SIAM Review, 42, 2000.
- H. W. Hethcote and J. A. Yorke. Gonorrhea transmission dynamics and control. Springer Lecture Notes in Biomathematics, 46, 1984.
- J. O. Kephart and S. R. White. Directed-graph epidemiological models of computer viruses. IEEE Computer Society Symposium on Research in Security and Privacy, 1991.
- J. O. Kephart and S. R. White. Measuring and modeling computer virus prevalence. IEEE Computer Society Symposium on Research in Security and Privacy, 1993.
- R. Pastor-Satorras and A. Vespignani. Epidemic spreading in scale-free networks. Physical Review Letters 86, 14, 2001.
- ………

All are about either: structured topologies (cliques, block-diagonals, hierarchies, random); specific virus propagation models; static graphs.

Outline. Motivation. Epidemics: what happens? (Theory): Background; Result (Static Graphs); Proof Ideas (Static Graphs); Bonus 1: Dynamic Graphs; Bonus 2: Competing Viruses. Action: Who to immunize? (Algorithms).

What should the answer look like? The answer should depend on: the graph; the virus propagation model (VPM). But how? Graph: average degree? max. degree? diameter? VPM: which parameters? How to combine: linear? quadratic? exponential? …..

Static Graphs: Our Main Result. Informally: for any arbitrary topology (adjacency matrix A) and any virus propagation model (VPM) in the standard literature, the epidemic threshold depends only on (1) λ, the first eigenvalue of A, and (2) some constant $C_{VPM}$, determined by the virus propagation model. No epidemic if $\lambda \cdot C_{VPM} < 1$. In Prakash+ ICDM 2011 (selected among best papers), w/ Deepayan Chakrabarti.

Our thresholds for some models. s = effective strength; s < 1: below threshold.

| Models | Effective Strength (s) | Threshold (tipping point) |
|---|---|---|
| SIS, SIR, SIRS, SEIR | s = λ · (β/δ) | s = 1 |
| SIV, SEIV | s = λ · [model constant lost in extraction] | s = 1 |
| (H.I.V.) | s = λ · [model constant lost in extraction] | s = 1 |
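The threshold test for the SIS-like models can be written as a quick check. This is a minimal sketch, assuming the model constant is β/δ (infection rate over recovery rate), which is what the later winner-takes-all slide states as "Strength(Virus) = λ β / δ"; the function names are illustrative and do not come from the talk.

```python
def effective_strength(lam: float, beta: float, delta: float) -> float:
    """Effective strength s = lam * beta / delta (SIS-style constant, assumed).

    lam   : largest eigenvalue of the adjacency matrix
    beta  : infection (attack) probability per contact
    delta : recovery probability
    """
    if delta <= 0:
        raise ValueError("delta must be positive")
    return lam * beta / delta


def below_threshold(lam: float, beta: float, delta: float) -> bool:
    """True when s < 1, the regime in which no epidemic is predicted."""
    return effective_strength(lam, beta, delta) < 1.0
```

For instance, a sparse graph with λ = 2 and a virus with β = 0.1, δ = 0.5 gives s = 0.4 (below threshold), whereas the same virus on a graph with λ = 999 is far above it.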

Our result: Intuition for λ. “Official” definition: Let A be the adjacency matrix. Then λ is the root with the largest magnitude of the characteristic polynomial of A [det(A – xI)]. Doesn’t give much intuition! “Un-official” intuition: λ ~ # paths in the graph: $A^k \approx \lambda^k \, u \, u^T$, where $A^k(i, j)$ = # of paths i → j of length k.

Largest Eigenvalue (λ). Three example graphs on N nodes: λ ≈ 2; λ ≈ √N; λ = N−1. For N = 1000: λ ≈ 2; λ = 31.67; λ = 999. Better connectivity → higher λ.
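The three values can be checked numerically. The sketch below assumes the three pictured graphs are a chain, a star, and a clique (the images are not part of this transcript), and uses plain power iteration rather than any method from the talk; all function names are illustrative.

```python
import math


def largest_eigenvalue(adj, iters=500):
    """Estimate the largest adjacency eigenvalue of an undirected graph.

    adj is a list of neighbor lists. Power iteration runs on A + I:
    the shift keeps the iteration from oscillating on bipartite graphs
    (chains, stars), whose spectra contain both +lambda and -lambda.
    """
    n = len(adj)
    x = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        y = [x[i] + sum(x[j] for j in adj[i]) for i in range(n)]  # (A + I) x
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    # Rayleigh quotient x^T A x for the unit vector x.
    return sum(x[i] * sum(x[j] for j in adj[i]) for i in range(n))


def chain(n):
    """Path graph: node i is linked to i - 1 and i + 1."""
    return [[j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)]


def star(n):
    """Star graph: node 0 is the hub, nodes 1..n-1 are leaves."""
    return [list(range(1, n))] + [[0] for _ in range(n - 1)]


def clique(n):
    """Complete graph on n nodes."""
    return [[j for j in range(n) if j != i] for i in range(n)]
```

Under these assumptions the closed forms are 2·cos(π/(N+1)) for the chain (approaching 2), √(N−1) for the star, and N−1 for the clique. The pure-Python loop is meant for small N; the 1000-node clique is slow to iterate this way.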

Examples: Simulations, SIR (mumps). PORTLAND graph: synthetic population, 31 million links, 6 million nodes. (a) Infection profile: fraction of infections vs. time ticks. (b) “Take-off” plot: footprint vs. effective strength.

Examples: Simulations, SIRS (pertussis). PORTLAND graph: synthetic population, 31 million links, 6 million nodes. (a) Infection profile: fraction of infections vs. time ticks. (b) “Take-off” plot: footprint vs. effective strength.


Outline. Motivation. Epidemics: what happens? (Theory): Background; Result (Static Graphs); Proof Ideas (Static Graphs); Bonus 1: Dynamic Graphs; Bonus 2: Competing Viruses. Action: Who to immunize? (Algorithms).

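Proof ideas. The condition $\lambda \cdot C_{VPM} < 1$ combines a graph-based term (λ) and a model-based term (the constant $C_{VPM}$ determined by the virus propagation model). Ingredients: general VPM structure; topology and stability. See paper for full proof.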

Outline. Motivation. Epidemics: what happens? (Theory): Background; Result (Static Graphs); Proof Ideas (Static Graphs); Bonus 1: Dynamic Graphs; Bonus 2: Competing Viruses. Action: Who to immunize? (Algorithms).

Virus Propagation on Time-Varying Networks: Theory and Immunization Algorithms. B. Aditya Prakash, Hanghang Tong, Nicholas Valler, Michalis Faloutsos, Christos Faloutsos, in ECML-PKDD 2010.

Dynamic Graphs: Epidemic? Alternating behaviors: DAY (e.g., work). [Figure: 8 × 8 adjacency matrix.]

Dynamic Graphs: Epidemic? Alternating behaviors: NIGHT (e.g., home). [Figure: 8 × 8 adjacency matrix.]

Model Description. SIS model: recovery rate δ, infection rate β. Set of T arbitrary graphs: day (N × N), night (N × N), weekend, ….. [Figure: healthy node X with neighbors N1, N2, N3; infection with prob. β, recovery with prob. δ.]

Our result: Dynamic Graphs Threshold. Informally, NO epidemic if eig(S) < 1. Single number: the largest eigenvalue of the system matrix S. S = [formula not preserved in the transcript]. In Prakash+, ECML-PKDD 2010.
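To make the "single number" concrete, here is a small sketch. The formula for S did not survive in this transcript, so the code ASSUMES the product form $S = \prod_{i=1}^{T} \left[(1-\delta) I + \beta A_i\right]$ over the T alternating graphs, which is how the cited ECML-PKDD 2010 paper is usually summarized; treat that form, the helper names, and the toy day/night graphs as assumptions.

```python
import math


def step_matrix(adj, beta, delta):
    """One time step: (1 - delta) * I + beta * A for a dense 0/1 matrix A."""
    n = len(adj)
    return [[(1.0 - delta if i == j else 0.0) + beta * adj[i][j]
             for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Dense product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def system_matrix(graphs, beta, delta):
    """Product of the per-step matrices over one period (ASSUMED form of S)."""
    n = len(graphs[0])
    s = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for adj in graphs:
        s = matmul(step_matrix(adj, beta, delta), s)  # later steps act on the left
    return s


def spectral_radius(m, iters=500):
    """Power-iteration estimate of the largest eigenvalue of a nonnegative matrix."""
    n = len(m)
    x = [1.0 / math.sqrt(n)] * n
    est = 0.0
    for _ in range(iters):
        y = [sum(m[i][j] * x[j] for j in range(n)) for i in range(n)]
        est = math.sqrt(sum(v * v for v in y))  # ||M x|| with ||x|| = 1
        if est == 0.0:
            return 0.0
        x = [v / est for v in y]
    return est


# Hypothetical 3-node example: different contacts by day and by night.
day = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]    # contacts 0-1 only
night = [[0, 0, 0], [0, 0, 1], [0, 1, 0]]  # contacts 1-2 only
threshold = spectral_radius(system_matrix([day, night], beta=0.5, delta=0.5))
no_epidemic = threshold < 1.0
```

With a single graph (T = 1) the assumed form reduces to 1 − δ + βλ < 1, that is λβ/δ < 1, which agrees with the static threshold.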

Infection profile: log(fraction infected) vs. time, on Synthetic and MIT Reality Mining data; curves BELOW, AT, and ABOVE the threshold.

“Take-off” plots (Synthetic, MIT Reality): footprint (# infected @ “steady state”) vs. our threshold (log scale); regions marked NO EPIDEMIC and EPIDEMIC.

Outline. Motivation. Epidemics: what happens? (Theory): Background; Result (Static Graphs); Proof Ideas (Static Graphs); Bonus 1: Dynamic Graphs; Bonus 2: Competing Viruses. Action: Who to immunize? (Algorithms).

Winner-takes-all: Competing Viruses on fair-play networks. B. Aditya Prakash, Alex Beutel, Roni Rosenfeld, Christos Faloutsos, WWW 2012.

Competing Contagions: iPhone v Android; Blu-ray v HD-DVD; biological: common flu/avian flu, pneumococcal inf. etc.

A simple model: modified flu-like; mutual immunity (“pick one of the two”); Susceptible-Infected1-Infected2-Susceptible. [Figure: Virus 1, Virus 2.]

Question: What happens in the end? ASSUME: Virus 1 is stronger than Virus 2. [Plot: number of infections; green: virus 1, red: virus 2.] Footprint @ steady state = ?

Question: What happens in the end? ASSUME: Virus 1 is stronger than Virus 2. [Plot: number of infections; green: virus 1, red: virus 2.] Footprint @ steady state: Strength ?? = Strength 2

Answer: Winner-Takes-All. ASSUME: Virus 1 is stronger than Virus 2. [Plot: number of infections; green: virus 1, red: virus 2.]

Our Result: Winner-Takes-All. Given our model, and any graph, the weaker virus always dies out completely. 1. The stronger survives only if it is above threshold. 2. Virus 1 is stronger than Virus 2 if strength(Virus 1) > strength(Virus 2). 3. Strength(Virus) = λ β / δ → same as before! In Prakash, Beutel, + WWW 2012.
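The three rules above can be combined into a small predictor. This is only a sketch of the stated result, not the authors' code: the function names and return labels are made up for illustration, and the survival threshold is ASSUMED to be s = 1, as in the single-virus case.

```python
def strength(lam, beta, delta):
    """Strength of a virus on a graph whose largest eigenvalue is lam."""
    return lam * beta / delta


def predict_survivor(lam, virus1, virus2):
    """Return 'virus1', 'virus2', 'none', or 'tie'.

    virus1 and virus2 are (beta, delta) pairs. The stronger virus is
    reported as the survivor only when its strength exceeds 1 (the
    threshold is ASSUMED to be s = 1, as for a single virus).
    """
    s1 = strength(lam, *virus1)
    s2 = strength(lam, *virus2)
    if s1 == s2:
        return "tie"  # equal strengths are not covered by the stated result
    name, s = ("virus1", s1) if s1 > s2 else ("virus2", s2)
    return name if s > 1.0 else "none"
```

Because both viruses spread on the same graph, λ cancels in the comparison; only the ratio β/δ decides which virus is stronger, while λ decides whether the winner clears the threshold.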

More results in KDD’12: partially competing viruses/products; collaborating products. “Interacting Viruses on a Network: Can both survive?” Alex Beutel, B. Aditya Prakash, Roni Rosenfeld and Christos Faloutsos, in SIGKDD 2012, Beijing.

Real Examples: Reddit v Digg; Blu-Ray v HD-DVD [Google Search Trends data].

Outline. Motivation. Epidemics: what happens? (Theory). Action: Who to immunize? (Algorithms).

On the Vulnerability of Large Graphs. Hanghang Tong, B. Aditya Prakash, Charalampos Tsourakakis, Tina Eliassi-Rad, Christos Faloutsos, Duen Horng (Polo) Chau, ICDM 2010.

Full Static Immunization. Given: a graph A, virus prop. model and budget k. Find: k ‘best’ nodes for immunization (removal). [Figure: example with k = 2.]

Outline. Motivation. Epidemics: what happens? (Theory). Action: Who to immunize? (Algorithms): Full Immunization (Static Graphs); Fractional Immunization.

Challenges. Given a graph A and budget k: Q1 (Metric): How to measure the ‘shield-value’ for a set of nodes (S)? Q2 (Algorithm): How to find a set of k nodes with the highest ‘shield-value’?

Proposed vulnerability measure: λ. λ is the epidemic threshold. Increasing λ → increasing vulnerability: “Safe”, “Vulnerable”, “Deadly”.

A1: “Eigen-Drop”: an ideal shield value. Eigen-Drop(S): $\Delta\lambda = \lambda - \lambda_s$, where $\lambda_s$ is the first eigenvalue after removing the node set S. [Figure: original graph vs. the graph without nodes {2, 6}.]

(Q2) Direct algorithm: too expensive! Immunize the k nodes which maximize Δλ: $S = \arg\max_{|S| = k} \Delta\lambda$. Combinatorial! Complexity example: 1,000 nodes with 10,000 edges; it takes 0.01 seconds to compute λ; it takes 2,615 years to find the 5 best nodes!
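The 2,615-year figure can be reproduced with back-of-the-envelope arithmetic. The sketch assumes the direct algorithm evaluates λ once for every 5-node subset at 0.01 seconds per evaluation; the slide's own complexity formula is not preserved here, so this counting is an assumption.

```python
import math


def brute_force_years(n_nodes, k, seconds_per_eval):
    """Years needed to evaluate every k-node subset once."""
    subsets = math.comb(n_nodes, k)          # number of candidate sets
    seconds = subsets * seconds_per_eval
    return seconds / (365.25 * 24 * 3600)    # Julian years


years = brute_force_years(1000, 5, 0.01)
```

There are about 8.25 × 10^12 five-node subsets of 1,000 nodes, so `years` comes out at roughly 2,600, in line with the number on the slide.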

A2: Our Solution. Part 1 (Shield Value): carefully approximate Eigen-drop (Δλ) using matrix perturbation theory. Part 2 (Algorithm): greedily pick the best node at each step; near-optimal due to submodularity. NetShield (linear complexity): $O(nk^2 + m)$, with n = # nodes and m = # edges. In Tong, Prakash+ ICDM 2010.
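The greedy idea can be illustrated on toy graphs. The sketch below is explicitly not NetShield: instead of the perturbation-based shield value, it recomputes the exact eigen-drop for every candidate node by power iteration, so it is far slower than $O(nk^2 + m)$ and only suitable for small examples. All names are illustrative.

```python
import math


def largest_eigenvalue(adj, removed=frozenset(), iters=300):
    """Largest adjacency eigenvalue after deleting the nodes in `removed`.

    adj maps each node to its neighbors. Power iteration runs on A + I
    so that bipartite graphs do not make the iteration oscillate.
    """
    nodes = [v for v in adj if v not in removed]
    if not nodes:
        return 0.0
    x = {v: 1.0 / math.sqrt(len(nodes)) for v in nodes}
    for _ in range(iters):
        y = {v: x[v] + sum(x[w] for w in adj[v] if w not in removed)
             for v in nodes}
        norm = math.sqrt(sum(t * t for t in y.values()))
        x = {v: t / norm for v, t in y.items()}
    # Rayleigh quotient x^T A x on the remaining graph.
    return sum(x[v] * sum(x[w] for w in adj[v] if w not in removed)
               for v in nodes)


def eigen_drop(adj, removed):
    """Delta-lambda = lambda - lambda_s for the removed node set."""
    return largest_eigenvalue(adj) - largest_eigenvalue(adj, frozenset(removed))


def greedy_immunize(adj, k):
    """Pick k nodes one at a time, each maximizing the exact eigen-drop."""
    chosen = set()
    for _ in range(k):
        best = max((v for v in adj if v not in chosen),
                   key=lambda v: eigen_drop(adj, chosen | {v}))
        chosen.add(best)
    return chosen
```

On a star graph the first pick is the hub, since removing it leaves only isolated nodes and drops λ to zero.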

Experiment: Immunization quality. [Plot: log(fraction of infected nodes) vs. time; lower is better.] Methods compared: NetShield, Degree, PageRank, Eigs (=HITS), Acquaintance, Betweenness (shortest path).

Outline. Motivation. Epidemics: what happens? (Theory). Action: Who to immunize? (Algorithms): Full Immunization (Static Graphs); Fractional Immunization.

Fractional Immunization of Networks. B. Aditya Prakash, Lada Adamic, Theodore Iwashyna (M.D.), Hanghang Tong, Christos Faloutsos. Under review.

Fractional Asymmetric Immunization. [Figure: a hospital and another hospital; drug-resistant bacteria (like XDR-TB).]


Fractional Asymmetric Immunization. [Figure: a hospital and another hospital.] Problem: Given k units of disinfectant, how to distribute them to maximize hospitals saved?

Our Algorithm “SMART-ALLOC”. CURRENT PRACTICE vs. SMART-ALLOC: ~6x fewer! [US-MEDICARE NETWORK 2005]: each circle is a hospital; ~3000 hospitals; more than 30,000 patients transferred.

Running Time (wall-clock time; lower is better). Simulations: > 1 week. SMART-ALLOC: ≈ 14 secs. > 30,000x speed-up!

Experiments. PENN-NETWORK (K = 200): ~5x. SECOND-LIFE (K = 2000): ~2.5x. Lower is better.

Acknowledgements: Funding

References

1. Threshold Conditions for Arbitrary Cascade Models on Arbitrary Networks (B. Aditya Prakash, Deepayan Chakrabarti, Michalis Faloutsos, Nicholas Valler, Christos Faloutsos). In IEEE ICDM 2011, Vancouver (Invited to KAIS Journal Best Papers of ICDM.)
2. Virus Propagation on Time-Varying Networks: Theory and Immunization Algorithms (B. Aditya Prakash, Hanghang Tong, Nicholas Valler, Michalis Faloutsos and Christos Faloutsos). In ECML-PKDD 2010, Barcelona, Spain
3. Epidemic Spreading on Mobile Ad Hoc Networks: Determining the Tipping Point (Nicholas Valler, B. Aditya Prakash, Hanghang Tong, Michalis Faloutsos and Christos Faloutsos). In IEEE NETWORKING 2011, Valencia, Spain
4. Winner-takes-all: Competing Viruses or Ideas on fair-play networks (B. Aditya Prakash, Alex Beutel, Roni Rosenfeld, Christos Faloutsos). In WWW 2012, Lyon
5. On the Vulnerability of Large Graphs (Hanghang Tong, B. Aditya Prakash, Tina Eliassi-Rad and Christos Faloutsos). In IEEE ICDM 2010, Sydney, Australia
6. Fractional Immunization of Networks (B. Aditya Prakash, Lada Adamic, Theodore Iwashyna, Hanghang Tong, Christos Faloutsos). Under Submission
7. Rise and Fall Patterns of Information Diffusion: Model and Implications (Yasuko Matsubara, Yasushi Sakurai, B. Aditya Prakash, Lei Li, Christos Faloutsos). Under Submission

http://www.cs.cmu.edu/~badityap/

Propagation on Large Networks: Data, Analysis, Policy/Action. B. Aditya Prakash, Christos Faloutsos.