Christos Faloutsos CMU


Influence propagation in large graphs - theorems, algorithms, and case studies. Christos Faloutsos, CMU

Thank you! V.S. Subrahmanian Weiru Liu Jef Wijsen SUM'13 C. Faloutsos (CMU)

Outline. Part 1: anomaly detection (OddBall; Belief Propagation; Conclusions). Part 2: influence propagation.

OddBall: Spotting Anomalies in Weighted Graphs. Leman Akoglu, Mary McGlohon, Christos Faloutsos. Carnegie Mellon University, School of Computer Science. PAKDD 2010, Hyderabad, India

Main idea: for each node, extract its 'ego-net' (= 1-step-away neighbors); extract features (#edges, total weight, etc.); compare with the rest of the population.

What is an egonet? [Figure labels: ego, egonet]

Selected features. N_i: number of neighbors (degree) of ego i. E_i: number of edges in egonet i. W_i: total weight of egonet i. λ_{w,i}: principal eigenvalue of the weighted adjacency matrix of egonet i.
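The four features listed above can be computed directly from a weighted adjacency matrix. The following is a minimal sketch, not the OddBall implementation: the function name `egonet_features`, the dense NumPy representation, and the toy star graph are assumptions made for illustration, and the graph is assumed to be undirected with no self-loops.

```python
import numpy as np

def egonet_features(W, i):
    """Return (N_i, E_i, W_i, lambda_w_i) for the egonet of node i.

    W is a symmetric weighted adjacency matrix (0 means no edge, no self-loops).
    """
    neighbors = np.flatnonzero(W[i])
    members = np.concatenate(([i], neighbors))      # ego plus its 1-step neighbors
    sub = W[np.ix_(members, members)]               # weighted adjacency of the egonet
    upper = np.triu(sub, k=1)                       # count each undirected edge once
    n_i = len(neighbors)                            # degree of the ego
    e_i = int(np.count_nonzero(upper))              # number of edges in the egonet
    w_i = float(upper.sum())                        # total weight of the egonet
    lam = float(np.linalg.eigvalsh(sub)[-1])        # principal (largest) eigenvalue
    return n_i, e_i, w_i, lam

# Hypothetical toy graph: node 0 is the hub of a star with 3 leaves, unit weights.
W = np.zeros((4, 4))
for j in (1, 2, 3):
    W[0, j] = W[j, 0] = 1.0
features = egonet_features(W, 0)
```

Running this for every node gives one feature vector per ego, which is the input for the "compare with the rest of the population" step.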

Near-Clique/Star. [Figure labels: SOME OLD RULES; Andrew Lewis (director)]

eBay fraud detection, w/ Polo Chau & Shashank Pandit, CMU [WWW'07]

eBay fraud detection - NetProbe

Popular press. And less desirable attention: e-mail from 'Belgium police' ('copy of your code?')

Outline. OddBall (anomaly detection). Belief Propagation: eBay fraud; Symantec malware detection; unification results. Conclusions.

Polonium: Tera-Scale Graph Mining and Inference for Malware Detection (patent pending). SDM 2011, Mesa, Arizona. Polo Chau (Machine Learning Dept), Carey Nachenberg (Vice President & Fellow), Jeffrey Wilhelm (Principal Software Engineer), Adam Wright (Software Engineer), Prof. Christos Faloutsos (Computer Science Dept)

Polonium: The Data. 60+ terabytes of data anonymously contributed by participants of the worldwide Norton Community Watch program: 50+ million machines, 900+ million executable files. Constructed a machine-file bipartite graph (0.2 TB+): 1 billion nodes (machines and files), 37 billion edges. As of today, it has grown to more than three times that size.

Polonium: Key Ideas. Use Belief Propagation to propagate domain knowledge in the machine-file graph to detect malware. Use "guilt-by-association" (i.e., homophily): e.g., files that appear on machines with many bad files are more likely to be bad. Scalability: handles a 37-billion-edge graph.

Polonium: One-Interaction Results. 84.9% true positive rate at 1% false positive rate, for files reported by four or more machines. True positive rate: % of malware correctly identified. False positive rate: % of non-malware wrongly labeled as malware. [Plot: true positive rate vs. false positive rate, with the 'Ideal' point marked]

Outline. Part 1: anomaly detection (OddBall; Belief Propagation: eBay fraud, Symantec malware detection, unification results; Conclusions). Part 2: influence propagation.

Unifying Guilt-by-Association Approaches: Theorems and Fast Algorithms. Danai Koutra, Tai-You Ke, U Kang, Duen Horng (Polo) Chau, Hsing-Kuo Kenneth Pao, Christos Faloutsos. Work in collaboration with National Taiwan University. ECML PKDD, 5-9 September 2011, Athens, Greece

Problem Definition: GBA techniques. Given: a graph and a few labeled nodes. Find: the labels of the rest (assuming network effects). This is a classification problem where we assume that neighboring nodes are related. Network effects: birds of a feather flock together.

Homophily and Heterophily. All methods handle homophily; NOT all methods handle heterophily, BUT the proposed method does. Homophily (birds of a feather flock together): connected nodes are similar, so in step 1 the neighbors of the green node become green and the neighbors of the red node become red; the same happens in step 2, as the color diffuses. Heterophily: connected nodes are dissimilar (e.g., a network of bad guys - fraudsters), so in step 1 the nodes connected to the green node become red and vice versa. The intensity of green is related to the score of the node.

Are they related? RWR (Random Walk with Restarts): Google's PageRank ('if my friends are important, I'm important, too'). SSL (Semi-supervised learning): minimize the differences among neighbors. BP (Belief propagation): send messages to neighbors, on what you believe about them.

Are they related? YES!

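Correspondence of Methods. Each method solves a linear system of the form matrix × unknown = known:
RWR: $[I - c A D^{-1}] \, x = (1 - c) \, y$
SSL: $[I + a (D - A)] \, x = y$
FABP: $[I + a D - c' A] \, b_h = \phi_h$
Here x and b_h are the final labels/beliefs, y and φ_h are the prior labels/beliefs, A is the adjacency matrix, and D is the diagonal matrix of degrees (d1, d2, d3, ...).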
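Since each method reduces to one linear system, a small solver makes the correspondence concrete. The sketch below solves the FABP system from the slide on a 3-node path graph; the function name `solve_fabp`, the dense solver, and the values a = 0.04 and c' = 0.2 are assumptions picked for illustration, not values from the talk.

```python
import numpy as np

def solve_fabp(A, phi, a, c_prime):
    """Solve [I + a*D - c'*A] b = phi for the final beliefs b."""
    D = np.diag(A.sum(axis=1))                      # diagonal degree matrix
    M = np.eye(A.shape[0]) + a * D - c_prime * A    # system matrix from the slide
    return np.linalg.solve(M, phi)

# Hypothetical path graph 0 - 1 - 2; only node 0 has a (positive) prior belief.
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
phi = np.array([0.1, 0.0, 0.0])
beliefs = solve_fabp(A, phi, a=0.04, c_prime=0.2)
```

With these homophily-style parameters the belief decays with distance from the labeled node. A dense solve is only for illustration: on large graphs one would use a sparse iterative solver, whose cost per iteration is proportional to the number of edges.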

Results: Scalability. FABP is linear in the number of edges. [Plot: runtime (min) vs. # of edges, Kronecker graphs]

Results (5): Parallelism. FABP ~2x faster & wins/ties on accuracy. [Plot axes: runtime (min), % accuracy]

Conclusions. Anomaly detection: hand-in-hand with pattern discovery ('anomalies' == 'rare patterns'). 'OddBall' for large graphs. 'NetProbe' and belief propagation: exploit network effects. FaBP: fast & accurate.

Influence propagation in large graphs - theorems and algorithms. B. Aditya Prakash (http://www.cs.cmu.edu/~badityap), Christos Faloutsos (http://www.cs.cmu.edu/~christos). Carnegie Mellon University

Networks are everywhere! Facebook Network [2010]; Gene Regulatory Network [Decourty 2008]; Human Disease Network [Barabasi 2007]; The Internet [2005].

Dynamical processes over networks are also everywhere!

Why do we care? Information diffusion; viral marketing; epidemiology and public health; cyber security; human mobility; games and virtual worlds; ecology; social collaboration; ...

Why do we care? (1: Epidemiology) Dynamical processes over networks: diseases over contact networks. [AJPH 2007] CDC data: visualization of the first 35 tuberculosis (TB) patients and their 1039 contacts.

Why do we care? (1: Epidemiology) Dynamical processes over networks. [US-MEDICARE NETWORK 2005] Each circle is a hospital; ~3000 hospitals; more than 30,000 patients transferred. Problem: given k units of disinfectant, whom to immunize?

Why do we care? (1: Epidemiology) [US-MEDICARE NETWORK 2005] CURRENT PRACTICE vs. OUR METHOD: ~6x fewer! Hospital-acquired infections took 99K+ lives and cost $5B+ (all per year).

Why do we care? (2: Online Diffusion) > 800m users, ~$1B revenue [WSJ 2010]; ~100m active users; > 50m users.

Why do we care? (2: Online Diffusion) Dynamical processes over networks: social media marketing. Celebrity to followers: "Buy Versace™!"

High Impact - Multiple Settings: epidemic out-breaks; products/viruses; transmit s/w patches. Q. How to squash rumors faster? Q. How do opinions spread? Q. How to market better?

Research Theme. DATA: large real-world networks & processes. ANALYSIS: understanding. POLICY/ACTION: managing.

In this talk. ANALYSIS (understanding): given propagation models, Q1: Will an epidemic happen?

In this talk. POLICY/ACTION (managing): Q2: How to immunize and control out-breaks better?

Outline. Part 1: anomaly detection. Part 2: influence propagation: Motivation; Epidemics: what happens? (Theory); Action: Who to immunize? (Algorithms).

A fundamental question. Strong virus: epidemic?

Example (static graph). Weak virus: epidemic?

Problem Statement. Find a condition under which the virus will die out exponentially quickly, regardless of the initial infection condition. Separate the regimes: above (epidemic) vs. below (extinction). [Plot: # infected vs. time]

Threshold (static version). Problem Statement. Given: graph G and virus specs (attack prob. etc.). Find: a condition for virus extinction/invasion.

Threshold: Why important? Accelerating simulations; forecasting ('what-if' scenarios); design of contagion and/or topology; a great handle to manipulate the spreading (immunization, maximize collaboration, ...).

Outline. Motivation. Epidemics: what happens? (Theory): Background; Result (Static Graphs); Proof Ideas (Static Graphs); Bonus 1: Dynamic Graphs; Bonus 2: Competing Viruses. Action: Who to immunize? (Algorithms).

Background: "SIR" model, lifetime immunity (mumps). Each node in the graph is in one of three states: Susceptible (i.e., healthy), Infected, Removed (i.e., can't get infected again). An infected node infects each susceptible neighbor with prob. β and is removed with prob. δ. [Figure: example at t = 1, t = 2, t = 3]
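The SIR rules above translate into a short discrete-time simulation. This is a sketch under stated assumptions: synchronous time steps, an adjacency-list graph, and the names `simulate_sir` and `ring` are all illustrative choices rather than anything specified in the talk.

```python
import random

def simulate_sir(adj, seeds, beta, delta, steps, rng):
    """Discrete-time SIR on an adjacency list; returns per-step (S, I, R) counts."""
    state = {v: "S" for v in adj}
    for v in seeds:
        state[v] = "I"
    history = []
    for _ in range(steps):
        infected = [v for v in adj if state[v] == "I"]
        newly_infected = set()
        for v in infected:
            for u in adj[v]:
                # Each infected node attacks each susceptible neighbor with prob. beta.
                if state[u] == "S" and rng.random() < beta:
                    newly_infected.add(u)
        for v in infected:
            # Each infected node is removed (immune for life) with prob. delta.
            if rng.random() < delta:
                state[v] = "R"
        for u in newly_infected:
            state[u] = "I"
        counts = tuple(sum(1 for s in state.values() if s == k) for k in "SIR")
        history.append(counts)
    return history

# Hypothetical contact network: a ring of 20 nodes with one initial infection.
ring = {v: [(v - 1) % 20, (v + 1) % 20] for v in range(20)}
history = simulate_sir(ring, seeds=[0], beta=0.5, delta=0.3, steps=30,
                       rng=random.Random(0))
```

Plotting the infected count from `history` against the step index gives an infection profile of the kind shown later in the talk.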

Background: Terminology, continued. Other virus propagation models ("VPM"): SIS: susceptible-infected-susceptible, flu-like; SIRS: temporary immunity, like pertussis; SEIR: mumps-like, with virus incubation (E = Exposed); ... Underlying contact network: 'who-can-infect-whom'.

Background: Related Work. All are about either structured topologies (cliques, block-diagonals, hierarchies, random), specific virus propagation models, or static graphs.
R. M. Anderson and R. M. May. Infectious Diseases of Humans. Oxford University Press, 1991.
A. Barrat, M. Barthélemy, and A. Vespignani. Dynamical Processes on Complex Networks. Cambridge University Press, 2010.
F. M. Bass. A new product growth for model consumer durables. Management Science, 15(5):215-227, 1969.
D. Chakrabarti, Y. Wang, C. Wang, J. Leskovec, and C. Faloutsos. Epidemic thresholds in real networks. ACM TISSEC, 10(4), 2008.
D. Easley and J. Kleinberg. Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge University Press, 2010.
A. Ganesh, L. Massoulie, and D. Towsley. The effect of network topology in spread of epidemics. IEEE INFOCOM, 2005.
Y. Hayashi, M. Minoura, and J. Matsukubo. Recoverable prevalence in growing scale-free networks and the effective immunization. arXiv:cond-mat/0305549 v2, Aug. 6 2003.
H. W. Hethcote. The mathematics of infectious diseases. SIAM Review, 42, 2000.
H. W. Hethcote and J. A. Yorke. Gonorrhea transmission dynamics and control. Springer Lecture Notes in Biomathematics, 46, 1984.
J. O. Kephart and S. R. White. Directed-graph epidemiological models of computer viruses. IEEE Computer Society Symposium on Research in Security and Privacy, 1991.
J. O. Kephart and S. R. White. Measuring and modeling computer virus prevalence. IEEE Computer Society Symposium on Research in Security and Privacy, 1993.
R. Pastor-Satorras and A. Vespignani. Epidemic spreading in scale-free networks. Physical Review Letters 86, 14, 2001.
...

What should the answer look like? The answer should depend on the graph and on the virus propagation model (VPM). But how? Graph: average degree? max. degree? diameter? VPM: which parameters? How to combine: linear? quadratic? exponential? ...

Static Graphs: Our Main Result. Informally (w/ Deepayan Chakrabarti): for any arbitrary topology (adjacency matrix A) and any virus propagation model (VPM) in the standard literature, the epidemic threshold depends only on λ, the first (largest) eigenvalue of A, and on some constant $C_{VPM}$ determined by the virus propagation model. No epidemic if $\lambda \cdot C_{VPM} < 1$. In Prakash+ ICDM 2011 (selected among best papers).

Our thresholds for some models. s = effective strength; s < 1: below threshold. For SIS, SIR, SIRS, SEIR: effective strength $s = \lambda \cdot (\beta / \delta)$; threshold (tipping point) s = 1. Also listed: SIV, SEIV, (H.I.V.).
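For the SIS/SIR family the threshold test is a one-line computation once λ is known. The sketch below assumes the effective strength s = λ · β/δ (the same expression the talk later uses for virus strength), an undirected graph stored as a dense symmetric matrix, and a hypothetical 5-node clique as input.

```python
import numpy as np

def effective_strength(A, beta, delta):
    """s = lambda * beta / delta, with lambda the largest eigenvalue of A."""
    lam = float(np.linalg.eigvalsh(A)[-1])   # A is assumed symmetric
    return lam * beta / delta

# Hypothetical graph: a clique on 5 nodes, for which lambda = 4.
A = np.ones((5, 5)) - np.eye(5)

s_weak = effective_strength(A, beta=0.05, delta=0.4)    # 4 * 0.125 = 0.5
s_strong = effective_strength(A, beta=0.2, delta=0.4)   # 4 * 0.5   = 2.0

below_threshold = s_weak < 1      # True: no epidemic expected
above_threshold = s_strong > 1    # True: an epidemic is possible
```

The same graph can therefore be safe against one virus and vulnerable to another; only the product of λ and the model constant matters.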

Our result: Intuition for λ. "Official" definition: let A be the adjacency matrix; then λ is the root with the largest magnitude of the characteristic polynomial of A, $\det(A - xI)$. Doesn't give much intuition! "Un-official" intuition: λ ~ # paths in the graph. $A^k \approx \lambda^k \, u u^T$, where $A^k(i, j)$ = # of paths i → j of length k.
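The path-counting intuition can be checked numerically. In this sketch the toy graph (a triangle with one pendant node) is an assumption, and "paths" are counted with repeated nodes allowed (walks), which is what the entries of A^k count.

```python
import numpy as np

# Hypothetical toy graph: triangle 0-1-2 plus a pendant node 3 attached to node 0.
A = np.array([[0., 1., 1., 1.],
              [1., 0., 1., 0.],
              [1., 1., 0., 0.],
              [1., 0., 0., 0.]])

lam = float(np.linalg.eigvalsh(A)[-1])   # largest eigenvalue of A

def count_walks(A, k):
    """Total number of length-k walks: the sum of all entries of A^k."""
    return float(np.linalg.matrix_power(A, k).sum())

# (A^2)[i, i] counts the length-2 walks i -> j -> i, i.e. the degree of node i.
A2 = np.linalg.matrix_power(A, 2)

# The number of walks grows by a factor that approaches lambda as k increases.
growth = count_walks(A, 81) / count_walks(A, 80)
```

Here `growth` agrees with `lam` to many decimal places, which is the sense in which λ measures how quickly the number of paths through the graph grows.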

Largest Eigenvalue (λ): better connectivity means higher λ. For N nodes, in order of increasing connectivity: λ ≈ 2; λ ≈ √N; λ = N-1. With N = 1000: λ ≈ 2; λ = 31.67; λ = 999.
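The link between connectivity and λ is easy to reproduce on three standard topologies whose largest eigenvalues are known in closed form: a chain (2·cos(π/(N+1)), just under 2), a star (√(N-1)), and a clique (N-1). Which topologies the slide pictured is not recoverable from the transcript, so the choice of these three, the helper names, and N = 200 are assumptions.

```python
import numpy as np

def largest_eigenvalue(A):
    """Largest eigenvalue of a symmetric adjacency matrix."""
    return float(np.linalg.eigvalsh(A)[-1])

def chain(n):
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = A[idx + 1, idx] = 1.0   # edges i -- i+1
    return A

def star(n):
    A = np.zeros((n, n))
    A[0, 1:] = A[1:, 0] = 1.0                 # node 0 is the hub
    return A

def clique(n):
    return np.ones((n, n)) - np.eye(n)        # every pair connected, no self-loops

N = 200
lams = {name: largest_eigenvalue(build(N))
        for name, build in [("chain", chain), ("star", star), ("clique", clique)]}
```

The three values come out ordered chain < star < clique, matching the slide's message that better-connected graphs have a higher λ and are therefore easier for a virus to take over.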

Examples: Simulations - SIR (mumps). (a) Infection profile: fraction of infections vs. time ticks. (b) "Take-off" plot: footprint vs. effective strength. PORTLAND graph: synthetic population, 31 million links, 6 million nodes.

Examples: Simulations - SIRS (pertussis). (a) Infection profile: fraction of infections vs. time ticks. (b) "Take-off" plot: footprint vs. effective strength. PORTLAND graph: synthetic population, 31 million links, 6 million nodes.

λ * C_VPM < 1 (C_VPM: the model-dependent constant): see paper for full proof. Model-based: general VPM structure; dimensional arguments... Graph-based: topology and stability.

Dynamic Graphs: Epidemic? Alternating behaviors. DAY (e.g., work). [Figure: adjacency matrix, axis label 8]

Dynamic Graphs: Epidemic? Alternating behaviors. NIGHT (e.g., home). [Figure: adjacency matrix, axis label 8]

Model Description. SIS model: infection rate β, recovery rate δ. Set of T arbitrary graphs: day, night, weekend, ... [Figure: nodes X, N1, N2, N3 in states Healthy and Infected, with transitions labeled prob. β and prob. δ; day and night adjacency matrices labeled N]

Our result: Dynamic Graphs Threshold. Informally, NO epidemic if eig(S) < 1, where eig(S) is the largest eigenvalue of the system matrix S. A single number! Details (including the definition of S): in Prakash+, ECML-PKDD 2010.
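The formula for S did not survive in this transcript. The sketch below assumes the product form S = ∏ᵢ [(1 - δ)I + βAᵢ] over the T alternating graphs, which is my understanding of the cited ECML-PKDD 2010 paper and should be checked against it; the helper names and the 4-node day/night graphs are hypothetical.

```python
import numpy as np

def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix for an undirected edge list."""
    A = np.zeros((n, n))
    for u, v in edges:
        A[u, v] = A[v, u] = 1.0
    return A

def system_matrix(adjs, beta, delta):
    """ASSUMED form: product over the T graphs of (1 - delta) * I + beta * A_i."""
    n = adjs[0].shape[0]
    S = np.eye(n)
    for A in adjs:
        S = ((1.0 - delta) * np.eye(n) + beta * A) @ S
    return S

def spectral_radius(M):
    """Largest eigenvalue magnitude (S need not be symmetric)."""
    return float(np.max(np.abs(np.linalg.eigvals(M))))

# Hypothetical alternating contacts on 4 nodes.
day = adjacency(4, [(0, 1), (2, 3)])      # e.g., work
night = adjacency(4, [(1, 2), (0, 3)])    # e.g., home

r_low = spectral_radius(system_matrix([day, night], beta=0.1, delta=0.5))
r_high = spectral_radius(system_matrix([day, night], beta=0.6, delta=0.1))
```

In this example every node has exactly one contact in each graph, so each factor scales the all-ones vector by (1 - δ + β); the two parameter settings give 0.6² = 0.36 (below threshold) and 1.5² = 2.25 (above threshold).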

Infection-profile. [Plots for Synthetic and MIT Reality Mining: log(fraction infected) vs. time, with curves ABOVE, AT, and BELOW the threshold]

"Take-off" plots. [Plots for Synthetic and MIT Reality: footprint (# infected @ "steady state", log scale); "Our threshold" separates NO EPIDEMIC from EPIDEMIC]

Outline. Motivation. Epidemics: what happens? (Theory): Background; Result (Static Graphs); Proof Ideas (Static Graphs); Bonus 1: Dynamic Graphs; Bonus 2: Competing Viruses. Action: Who to immunize? (Algorithms).

Competing Contagions. iPhone v Android; Blu-ray v HD-DVD; biological: common flu/avian flu, pneumococcal inf. etc.

A simple model. Details: modified flu-like; mutual immunity ("pick one of the two"); Susceptible-Infected1-Infected2-Susceptible. [Figure: Virus 1, Virus 2]

Question: What happens in the end? ASSUME: Virus 1 is stronger than Virus 2. Footprint @ steady state = ? [Plot: number of infections; green: virus 1, red: virus 2]

Answer: Winner-Takes-All. ASSUME: Virus 1 is stronger than Virus 2. [Plot: number of infections; green: virus 1, red: virus 2]

Our Result: Winner-Takes-All. Given our model and any graph, the weaker virus always dies out completely. Details: the stronger virus survives only if it is above threshold; Virus 1 is stronger than Virus 2 if strength(Virus 1) > strength(Virus 2); strength(Virus) = $\lambda \beta / \delta$, same as before! In Prakash, Beutel+ WWW 2012.
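The winner-takes-all result can be read as a small decision rule. This is a sketch of that rule only, not a simulation of the model: the function names are made up for illustration, "above threshold" is taken to mean strength > 1, and ties are rejected because the slide does not say what happens when the strengths are equal.

```python
def strength(lam, beta, delta):
    """Strength of one virus: lambda * beta / delta."""
    return lam * beta / delta

def predict_survivor(lam, virus1, virus2):
    """virus1 and virus2 are (beta, delta) pairs; returns 1, 2, or None."""
    s1 = strength(lam, *virus1)
    s2 = strength(lam, *virus2)
    if s1 == s2:
        raise ValueError("equal strengths: not covered by the stated result")
    winner, s_win = (1, s1) if s1 > s2 else (2, s2)
    # The weaker virus always dies out; the stronger survives only above threshold.
    return winner if s_win > 1 else None

lam = 4.0  # hypothetical largest eigenvalue of the contact graph
outcome_a = predict_survivor(lam, (0.2, 0.4), (0.1, 0.4))    # strengths 2.0 and 1.0
outcome_b = predict_survivor(lam, (0.05, 0.4), (0.02, 0.4))  # strengths 0.5 and 0.2
```

In the first case virus 1 wins and virus 2 dies out; in the second, both are below threshold, so neither survives.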

Real Examples [Google Search Trends data]. Reddit v Digg; Blu-Ray v HD-DVD.

Outline. Motivation. Epidemics: what happens? (Theory). Action: Who to immunize? (Algorithms).

Full Static Immunization. Given: a graph A, virus prop. model, and budget k. Find: k 'best' nodes for immunization (removal). [Figure: example with k = 2]

Outline. Motivation. Epidemics: what happens? (Theory). Action: Who to immunize? (Algorithms): Full Immunization (Static Graphs); Fractional Immunization.

Challenges. Given a graph A and budget k: Q1 (Metric): how to measure the 'shield-value' for a set of nodes (S)? Q2 (Algorithm): how to find a set of k nodes with the highest 'shield-value'?

Proposed vulnerability measure: λ. λ is the epidemic threshold. Increasing λ means increasing vulnerability: "Safe", "Vulnerable", "Deadly".

A1: "Eigen-Drop": an ideal shield value. Eigen-Drop(S): $\Delta\lambda = \lambda - \lambda_s$, where $\lambda_s$ is the largest eigenvalue of the graph without the nodes in S. [Figure: original graph with nodes 1-11 vs. the same graph without {2, 6}]
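Computing the eigen-drop exactly for one candidate set takes two eigenvalue computations. A minimal sketch, assuming a dense symmetric adjacency matrix and using a hypothetical 5-node star as the example graph:

```python
import numpy as np

def eigen_drop(A, removed):
    """Delta lambda = lambda - lambda_s, where lambda_s is the largest
    eigenvalue after deleting the nodes in `removed`."""
    removed = set(removed)
    keep = [v for v in range(A.shape[0]) if v not in removed]
    lam = float(np.linalg.eigvalsh(A)[-1])
    lam_s = float(np.linalg.eigvalsh(A[np.ix_(keep, keep)])[-1]) if keep else 0.0
    return lam - lam_s

# Hypothetical graph: a star with hub 0 and leaves 1..4 (lambda = 2).
A = np.zeros((5, 5))
A[0, 1:] = A[1:, 0] = 1.0

drop_hub = eigen_drop(A, [0])    # only isolated nodes remain: drop = 2
drop_leaf = eigen_drop(A, [1])   # a 3-leaf star remains: drop = 2 - sqrt(3)
```

Removing the hub erases the whole eigenvalue, while removing a leaf barely changes it, which is why the eigen-drop works as a shield value.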

(Q2) Direct algorithm: too expensive! Immunize the k nodes which maximize Δλ: $S = \arg\max_{|S| = k} \Delta\lambda$. Combinatorial complexity! Example: 1,000 nodes, with 10,000 edges. It takes 0.01 seconds to compute λ. It takes 2,615 years to find the 5 best nodes!
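The figure on the slide follows from counting the candidate sets. The arithmetic below assumes one 0.01-second eigenvalue computation per 5-node subset of the 1,000 nodes and a 365-day year.

```python
import math

n_sets = math.comb(1000, 5)          # number of 5-node subsets of 1,000 nodes
seconds = n_sets * 0.01              # one eigenvalue computation per subset
years = seconds / (365 * 24 * 3600)  # convert seconds to years
```

This gives about 8.25 × 10^12 subsets and roughly 2,600 years, in line with the slide's figure; the small difference depends on the year length and rounding used.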

A2: Our Solution. Part 1 (shield value): carefully approximate the eigen-drop (Δλ) using matrix perturbation theory. Part 2 (algorithm): greedily pick the best node at each step; near-optimal due to submodularity. NetShield (linear complexity): O(nk^2 + m), where n = # nodes and m = # edges. In Tong, Prakash+ ICDM 2010.
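To show the greedy step in isolation, the sketch below picks nodes one at a time using the exact eigen-drop. This is deliberately not NetShield: it recomputes eigenvalues for every candidate and is far slower than the O(nk^2 + m) bound above, since NetShield replaces the exact drop with an approximate shield value. The function names and the two-star example graph are assumptions.

```python
import numpy as np

def largest_eig(A, keep):
    """Largest eigenvalue of the subgraph induced by the nodes in `keep`."""
    if not keep:
        return 0.0
    return float(np.linalg.eigvalsh(A[np.ix_(keep, keep)])[-1])

def greedy_immunize(A, k):
    """Naive greedy: at each step remove the node whose deletion leaves
    the smallest largest eigenvalue (i.e. the largest exact eigen-drop)."""
    remaining = list(range(A.shape[0]))
    chosen = []
    for _ in range(k):
        best = min(remaining,
                   key=lambda v: largest_eig(A, [u for u in remaining if u != v]))
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Hypothetical graph: hub 0 with leaves 1-4, hub 5 with leaves 6-8, hubs linked.
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (5, 6), (5, 7), (5, 8), (0, 5)]
A = np.zeros((9, 9))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0

chosen = greedy_immunize(A, k=2)
```

On this graph the greedy picks the larger hub first and the smaller hub second, after which no edges remain.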

Experiment: Immunization quality. [Plot: log(fraction of infected nodes) vs. time; lower is better. Methods: PageRank, Betweenness (shortest path), Degree, Acquaintance, Eigs (= HITS), NetShield]

Fractional Immunization of Networks. B. Aditya Prakash, Lada Adamic, Theodore Iwashyna (M.D.), Hanghang Tong, Christos Faloutsos. Under review.

Fractional Asymmetric Immunization. Drug-resistant bacteria (like XDR-TB). [Figure: Hospital, Another Hospital]

Fractional Asymmetric Immunization. Problem: given k units of disinfectant, how to distribute them to maximize hospitals saved? [Figure: Hospital, Another Hospital]

Our Algorithm "SMART-ALLOC". [US-MEDICARE NETWORK 2005] Each circle is a hospital; ~3000 hospitals; more than 30,000 patients transferred. CURRENT PRACTICE vs. SMART-ALLOC: ~6x fewer!

Running Time. [Bar chart: wall-clock time; lower is better] Simulations: > 1 week. SMART-ALLOC: 14 secs. > 30,000x speed-up!

Experiments. [Plots: SECOND-LIFE, PENN-NETWORK; K = 200, K = 2000; lower is better]

Acknowledgements: funding.

References (http://www.cs.vt.edu/~badityap/)
Threshold Conditions for Arbitrary Cascade Models on Arbitrary Networks (B. Aditya Prakash, Deepayan Chakrabarti, Michalis Faloutsos, Nicholas Valler, Christos Faloutsos) - In IEEE ICDM 2011, Vancouver (Invited to KAIS Journal Best Papers of ICDM.)
Virus Propagation on Time-Varying Networks: Theory and Immunization Algorithms (B. Aditya Prakash, Hanghang Tong, Nicholas Valler, Michalis Faloutsos and Christos Faloutsos) - In ECML-PKDD 2010, Barcelona, Spain
Epidemic Spreading on Mobile Ad Hoc Networks: Determining the Tipping Point (Nicholas Valler, B. Aditya Prakash, Hanghang Tong, Michalis Faloutsos and Christos Faloutsos) - In IEEE NETWORKING 2011, Valencia, Spain
Winner-takes-all: Competing Viruses or Ideas on fair-play networks (B. Aditya Prakash, Alex Beutel, Roni Rosenfeld, Christos Faloutsos) - In WWW 2012, Lyon
On the Vulnerability of Large Graphs (Hanghang Tong, B. Aditya Prakash, Tina Eliassi-Rad and Christos Faloutsos) - In IEEE ICDM 2010, Sydney, Australia
Fractional Immunization of Networks (B. Aditya Prakash, Lada Adamic, Theodore Iwashyna, Hanghang Tong, Christos Faloutsos) - Under Submission
Rise and Fall Patterns of Information Diffusion: Model and Implications (Yasuko Matsubara, Yasushi Sakurai, B. Aditya Prakash, Lei Li, Christos Faloutsos) - Under Submission

Propagation on Large Networks. B. Aditya Prakash, Christos Faloutsos. Data; Analysis; Policy/Action.