Published by Lesly Plume. Modified about 1 year ago.

1
DAVA: Distributing Vaccines over Networks under Prior Information
Yao Zhang, B. Aditya Prakash
Department of Computer Science, Virginia Tech
SDM, Philadelphia, April 24, 2014

2
Motivation: Epidemiology
- Viruses spread over contact networks.
- SIR model [Anderson+ 1991]: Susceptible-Infectious-Recovered.
- Weights p_ij: propagation probability from i to j.
- Recovery probability δ for each node (models mumps-like infections).
Zhang and Prakash, SDM2014
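A small discrete-time simulation can make the SIR dynamics concrete. This is only a sketch under assumptions the slide does not state: time advances in synchronous rounds, infection attempts happen before recoveries within a round, and the names `simulate_sir`, `adj`, `p`, and `delta` are illustrative.

```python
import random

def simulate_sir(adj, p, delta, infected, rng=None, max_steps=10_000):
    """Run one SIR cascade and return the set of nodes that were ever infected.

    adj:      dict node -> iterable of neighbours (assumed symmetric)
    p:        dict (i, j) -> propagation probability from i to j
    delta:    per-round recovery probability of an infectious node
    infected: initially infectious nodes
    """
    rng = rng or random.Random()
    state = {v: "S" for v in adj}
    for v in infected:
        state[v] = "I"
    for _ in range(max_steps):
        infectious = [v for v in adj if state[v] == "I"]
        if not infectious:
            break
        newly_infected = set()
        # Each infectious node tries to infect each susceptible neighbour.
        for i in infectious:
            for j in adj[i]:
                if state[j] == "S" and rng.random() < p[(i, j)]:
                    newly_infected.add(j)
        # Infectious nodes recover with probability delta (assumed order).
        for i in infectious:
            if rng.random() < delta:
                state[i] = "R"
        for j in newly_infected:
            state[j] = "I"
    return {v for v in adj if state[v] != "S"}

# Toy path graph a - b - c with a single probability on every edge.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}

def uniform_p(adj, value):
    """Assign the same propagation probability to every directed edge."""
    return {(i, j): value for i in adj for j in adj[i]}
```

With `delta = 1` every node is infectious for exactly one round, which is the setting the independent cascade model corresponds to.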

3
Motivation: Social Media
- Memes/rumors spread over friendship networks, e.g. the Twitter following network.
- Independent cascade (IC) model [Kempe+ KDD2003]: each node has only one chance to infect its neighbors.
- IC is a special case of the SIR model.
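The "one chance" rule of the IC model can be sketched as a breadth-first cascade in which every newly activated node attempts each outgoing edge exactly once. The function names, the dict-based graph layout, and the Monte Carlo estimator below are illustrative assumptions, not code from the talk.

```python
import random
from collections import deque

def simulate_ic(adj, p, seeds, rng=None):
    """Run one independent cascade and return the set of activated nodes."""
    rng = rng or random.Random()
    active = set(seeds)
    frontier = deque(seeds)
    while frontier:
        i = frontier.popleft()
        # Node i gets a single attempt on each still-inactive neighbour.
        for j in adj[i]:
            if j not in active and rng.random() < p[(i, j)]:
                active.add(j)
                frontier.append(j)
    return active

def expected_infected(adj, p, seeds, runs=1000, seed=0):
    """Monte Carlo estimate of the expected number of infected nodes."""
    rng = random.Random(seed)
    total = sum(len(simulate_ic(adj, p, seeds, rng)) for _ in range(runs))
    return total / runs

# Toy path graph a - b - c.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
always = {(i, j): 1.0 for i in path for j in path[i]}
never = {(i, j): 0.0 for i in path for j in path[i]}
```

Averaging over many runs estimates the expected footprint of the cascade; exact computation is generally expensive, which is why a sampling estimate is used here.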

4
Immunization
- The Centers for Disease Control (CDC) cares about containing epidemic diseases. E.g.: ~400 million dollars used for vaccines for children in 2013.
- Twitter tries to stop rumor spread. E.g.: rumors of victims after the Boston Marathon bombs in [...].
- How to choose the best nodes to vaccinate (remove)?

5
Immunization
Pre-emptive immunization (choose nodes before the epidemic starts):
- Acquaintance strategy [Cohen+ 2003]: pick a random person, then immunize one of their neighbors at random.
- Netshield [Tong+ 2010]: minimize the epidemic threshold (the point at which the virus takes off).
Good as baseline strategies.

6
In reality
- Typically the epidemic has already started!
- More realistic intervention: which nodes to vaccinate now?
- We call it Data-Aware Immunization (this paper).
- Contrast with pre-emptive immunization (choose nodes before the epidemic starts): Acquaintance strategy [Cohen+ 2003], Netshield [Tong+ 2010].

7
Outline
- Motivation
- Problem Definition
- Complexity
- Our Proposed Methods
- Experiments
- Conclusion

8
Data-Aware Vaccination Problem
Problem: Given a set of infected nodes and a contact graph, how should k vaccines be distributed (node removal) to minimize the expected number of infected nodes at the end of the epidemic?
Example with 1 vaccine (p_ij = 1 for all edges; figure with nodes A-F not reproduced):
- Remove A, save {A, D} (best solution)
- Remove B, save {B}
- Remove C, save {C}
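For the deterministic case p_ij = 1, the objective reduces to counting the nodes still reachable from the infected set after the vaccinated nodes are removed, so tiny instances can be solved by brute force. The graph below is hypothetical: it is chosen only so that the outcome mirrors the slide (removing A saves two nodes, removing B or C saves one) and is not the graph from the original figure. The function names are illustrative.

```python
from itertools import combinations

def infected_count(adj, infected, removed):
    """Number of infected nodes at the end when every edge transmits (p_ij = 1)."""
    removed = set(removed)
    seen = set(infected)
    stack = list(infected)
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in removed and v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen)

def best_vaccination(adj, infected, k):
    """Exhaustively try every k-subset of healthy nodes (exponential in k)."""
    healthy = sorted(v for v in adj if v not in infected)
    return min(combinations(healthy, k),
               key=lambda S: infected_count(adj, infected, S))

def build_adj(edges):
    """Undirected adjacency lists from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj

# Hypothetical toy graph; "I" is the infected node.
adj = build_adj([("I", "A"), ("A", "D"), ("I", "B"), ("I", "C"),
                 ("B", "E"), ("C", "E"), ("B", "F"), ("C", "F")])
print(best_vaccination(adj, {"I"}, 1))
```

Removing A cuts D off from the infection, whereas E and F stay reachable through whichever of B or C is left, so only A saves more than itself. Exhaustive search like this is not practical beyond toy sizes.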

9
Outline
- Motivation
- Problem Definition
- Complexity
- Our Proposed Methods
- Experiments
- Conclusion

10
Complexity of DAV
- NP-hard: reduction from the Maximum K-Intersection Problem (MaxKI: maximizing the intersection of k subsets). MaxKI is NP-complete [Vinterbo 2004].
- Approximation algorithm? The objective is not submodular.
- Actually, DAV is hard to approximate within an absolute error! See the paper for details.

11
Outline
- Motivation
- Problem Definition
- Complexity
- Our Proposed Methods (assume IC model and undirected graph)
- Experiments
- Conclusion

12
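1: Simplify - Merging infected nodes
- Idea: merge all the infected nodes into a single ‘super infected’ node I.
- Edges from several infected nodes to the same healthy node are combined by logical OR: p_B = 1 - (1 - p_X)(1 - p_Y).
- The merged graph is equivalent to the original graph.
- Figure: original graph (healthy nodes A, B, C; edge probabilities p_A, p_X, p_Y, p_C) and merged graph (super node I; edge probabilities p_A, p_B, p_C).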
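The merging step can be sketched as follows, assuming the graph is stored as a symmetric dict of dicts `p[u][v]` and that the label chosen for the super node (`"I"` here) is not already used by a healthy node. Both the layout and the function name are illustrative.

```python
def merge_infected(p, infected, super_node="I"):
    """Collapse all infected nodes into one super node.

    Parallel edges from several infected nodes to the same healthy node are
    combined with a logical OR: 1 - (1 - p_X)(1 - p_Y).
    Edges between two infected nodes are dropped.
    """
    infected = set(infected)
    merged = {super_node: {}}
    for u, nbrs in p.items():
        if u in infected:
            continue
        merged.setdefault(u, {})
        for v, puv in nbrs.items():
            if v in infected:
                prev = merged[u].get(super_node, 0.0)
                combined = 1.0 - (1.0 - prev) * (1.0 - puv)
                merged[u][super_node] = combined
                merged[super_node][u] = combined
            else:
                merged[u][v] = puv
    return merged

# X and Y are infected; B is adjacent to both, A only to X, C only to Y.
p = {
    "X": {"A": 0.3, "B": 0.5, "Y": 0.9},
    "Y": {"B": 0.5, "C": 0.4, "X": 0.9},
    "A": {"X": 0.3},
    "B": {"X": 0.5, "Y": 0.5},
    "C": {"Y": 0.4},
}
merged = merge_infected(p, {"X", "Y"})
```

In this toy input B is reached by two infected neighbors with probability 0.5 each, so the merged edge gets 1 - 0.5 * 0.5 = 0.75, while the edges to A and C keep their original probabilities.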

13
2: DAVA-Tree Algorithm: Idea
- Select nodes with the largest “benefit”.
- Benefit of a set S: the expected number of saved nodes after removing set S on graph G.
- Benefit of adding an additional node j into S: the additional number of saved nodes when adding node j into S, measured against the number of saved nodes after adding j into S.
- Example (p_ij = 1 for all edges): the neighbors of the merged infected node have benefits 4, 2, and 5.

14
DAVA-Tree Alg.: Optimal on Trees
- Fact 1: the chosen nodes in the optimal set must be neighbors of infected node I.
- Fact 2: for any set S, the benefit of each such node is independent of the rest of the set S.
- DAVA-tree algorithm: select the top k nodes from I’s neighbors with the max. benefit. Linear time.
- Example (p_ij = 1 for all edges): the neighbors of the merged infected node have benefits 4, 2, and 5.
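The two facts suggest a simple procedure: score each neighbor of I once, then keep the k best. The sketch below assumes that, on a tree under the IC model, the benefit of a neighbor j is the expected number of infections in j's subtree, where a node's infection probability is the product of the edge probabilities on its path from I. That reading of "benefit", the rooted `tree[parent][child] = probability` layout, and the function names are assumptions made for illustration.

```python
import heapq

def benefit(tree, child, p_root_child):
    """Expected number of nodes saved by removing `child`, a neighbour of the root."""
    total, stack = 0.0, [(child, p_root_child)]
    while stack:
        node, prob = stack.pop()
        total += prob  # probability that `node` would have been infected
        for c, p in tree.get(node, {}).items():
            stack.append((c, prob * p))
    return total

def dava_tree(tree, root, k):
    """Pick the k neighbours of the root with the largest benefit."""
    benefits = {j: benefit(tree, j, p) for j, p in tree.get(root, {}).items()}
    chosen = heapq.nlargest(k, benefits, key=benefits.get)
    return chosen, benefits

# Toy tree with p_ij = 1, sized so the three benefits are 4, 2 and 5.
tree = {
    "I": {"a": 1.0, "b": 1.0, "c": 1.0},
    "a": {"a1": 1.0, "a2": 1.0, "a3": 1.0},
    "b": {"b1": 1.0},
    "c": {"c1": 1.0, "c2": 1.0},
    "c1": {"c3": 1.0, "c4": 1.0},
}
chosen, benefits = dava_tree(tree, "I", 2)
```

Every node is visited once while scoring, which is consistent with the linear running time stated on the slide (the final top-k selection adds a small extra factor in this sketch).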

15
3: General Case – Arbitrary Graphs
Idea:
- We have the optimal algorithm for a tree.
- Extract a spanning tree, then run DAVA-tree.
- What kind of tree? Minimum spanning tree?
- Figure (p_ij = 1 for all edges): the optimal solution on the graph, the MST, and the solution that DAVA-tree finds optimal on the MST.

16
3: General Case – Arbitrary Graphs
Idea:
- We have the optimal algorithm for a tree.
- Build a spanning tree first.
- What kind of tree? Minimum spanning tree?
- We propose to use the dominator tree (a concept from software engineering).
- u dominates v if every path from I to v contains u.
- Example (p_ij = 1 for all edges): node 4 dominates nodes 8, 9, 10, 11.

17
Dominator Tree
- u is the immediate dominator of v if u dominates v AND every other dominator of v dominates u.
- Dominator tree: add an edge between every such u and v.
- Can be built in linear time [Buchsbaum, Tarjan 1998].
- Fact 1: the optimal solution should be among the children of root I in the dominator tree, for any arbitrary graph.
- Fact 2 (for the special case k = 1, p = 1): running DAVA-tree on the dominator tree gives the optimal solution.
- Figure (p_ij = 1 for all edges): merged graph and its dominator tree, with the optimal solution and the solution from DAVA-tree marked.
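The definitions above can be checked on small graphs with the textbook iterative set-intersection method. This is not the linear-time algorithm cited on the slide; it is a much simpler and slower stand-in, written for an undirected graph given as adjacency lists, with illustrative function names.

```python
def dominator_sets(adj, root):
    """dom[v] = set of nodes that lie on every path from root to v (including v)."""
    order, seen, stack = [], {root}, [root]
    while stack:
        u = stack.pop()
        order.append(u)
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    dom = {v: set(order) for v in order}
    dom[root] = {root}
    changed = True
    while changed:
        changed = False
        for v in order:
            if v == root:
                continue
            # Undirected graph: every neighbour can precede v on a path.
            new = set.intersection(*(dom[u] for u in adj[v])) | {v}
            if new != dom[v]:
                dom[v], changed = new, True
    return dom

def immediate_dominators(dom, root):
    """The strict dominator closest to v is the one with the largest dom set."""
    return {v: max(d - {v}, key=lambda u: len(dom[u]))
            for v, d in dom.items() if v != root}

# Toy graph: two routes from I to node 3, then a chain 3 - 4 - 5.
adj = {
    "I": ["1", "2"],
    "1": ["I", "3"],
    "2": ["I", "3"],
    "3": ["1", "2", "4"],
    "4": ["3", "5"],
    "5": ["4"],
}
dom = dominator_sets(adj, "I")
idom = immediate_dominators(dom, "I")
```

Node 3 can be reached through either 1 or 2, so neither dominates it and its immediate dominator is the root; nodes 4 and 5 hang below 3 in the resulting dominator tree.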

18
Weighting the dominator tree
- Weighting the dominator tree exactly is #P-complete.
- Our solution: use the maximum propagation path probability between nodes I and v (computed using Dijkstra’s algorithm).
- Figure: merged graph with edge probabilities p_1, p_3, p_6 and dominator tree with weights w_1, w_3, w_6.
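The maximum-probability path can be found with an ordinary shortest-path search once each edge probability p is replaced by the non-negative length -log p, because maximizing a product is the same as minimizing the sum of negative logs. The sketch below assumes a symmetric dict-of-dicts graph with probabilities in (0, 1]; how these values are then turned into tree edge weights is left to the paper.

```python
import heapq
import itertools
import math

def max_path_probability(p, source):
    """For each reachable node v, the largest product of edge probabilities
    over all paths from source to v (Dijkstra on -log p)."""
    dist = {source: 0.0}
    tie = itertools.count()  # tie-breaker so the heap never compares node labels
    heap = [(0.0, next(tie), source)]
    done = set()
    while heap:
        d, _, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v, puv in p[u].items():
            if puv <= 0.0:
                continue  # an edge that never transmits
            nd = d - math.log(puv)
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, next(tie), v))
    return {v: math.exp(-d) for v, d in dist.items()}

# Toy graph: the two-hop route I - a - b (0.5 * 0.5) beats the direct edge (0.2).
p = {
    "I": {"a": 0.5, "b": 0.2},
    "a": {"I": 0.5, "b": 0.5},
    "b": {"I": 0.2, "a": 0.5},
}
best = max_path_probability(p, "I")
```

The result is a lower bound on the true infection probability of each node, since it accounts for one path only.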

19
DAVA algorithm
Steps:
1. T = build a dominator tree
2. v = run DAVA-tree on T with budget = 1
3. Remove v from G
4. Go to Step 1 until |S| = k
Figure (|S| = 2, iteration 1): merged graph (p_ij = 1 for all edges) and its dominator tree.

20
DAVA algorithm
Steps:
1. T = build a dominator tree
2. v = run DAVA-tree on T with budget = 1
3. Remove v from G
4. Go to Step 1 until |S| = k
Running time: O(k(|E| + |V| log |V|)). Too slow for large networks!
Figure (|S| = 2): after iteration 1 the selected node is removed from the merged graph, and the dominator tree is rebuilt for iteration 2.

21
DAVA-fast: a faster algorithm
Steps:
1. T = build a dominator tree
2. S = run DAVA-tree on T with budget = k
Time complexity: subquadratic! DAVA-fast: O(|V| log |V| + |E|).
In practice, the performance of DAVA-fast is very close to DAVA.
Figure (|S| = 2): merged graph and dominator tree.
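The difference between rebuilding the dominator tree after every pick and building it once can be sketched for the unweighted case p_ij = 1, where the benefit of a child of I in the dominator tree is simply the number of nodes it dominates. The code uses a simple iterative dominator computation rather than the linear-time one, omits the edge weighting, and runs on a hand-built toy graph; all names are illustrative.

```python
import heapq

def reach(adj, root, removed=()):
    """Nodes reachable from root after deleting `removed`, in DFS order."""
    removed = set(removed)
    order, seen, stack = [], {root}, [root]
    while stack:
        u = stack.pop()
        order.append(u)
        for v in adj[u]:
            if v not in removed and v not in seen:
                seen.add(v)
                stack.append(v)
    return order

def root_child_benefits(adj, root, removed=()):
    """Benefit (number of dominated nodes) of each child of root in the dominator tree."""
    order = reach(adj, root, removed)
    alive = set(order)
    dom = {v: set(alive) for v in order}
    dom[root] = {root}
    changed = True
    while changed:
        changed = False
        for v in order:
            if v == root:
                continue
            preds = [u for u in adj[v] if u in alive]
            new = set.intersection(*(dom[u] for u in preds)) | {v}
            if new != dom[v]:
                dom[v], changed = new, True
    # Children of the root are exactly the nodes dominated only by root and themselves.
    return {c: sum(c in dom[v] for v in order)
            for c in order if c != root and len(dom[c]) == 2}

def dava(adj, root, k):
    """Rebuild the dominator tree after every single pick."""
    chosen = []
    for _ in range(k):
        benefits = root_child_benefits(adj, root, chosen)
        if not benefits:
            break
        chosen.append(max(benefits, key=benefits.get))
    return chosen

def dava_fast(adj, root, k):
    """Build the dominator tree once and take the top k children."""
    benefits = root_child_benefits(adj, root)
    return heapq.nlargest(k, benefits, key=benefits.get)

edges = [("I", "a"), ("I", "b"), ("I", "e"), ("a", "g"), ("a", "h"), ("a", "i"),
         ("a", "c"), ("b", "c"), ("a", "d"), ("b", "d"), ("e", "f")]
adj = {}
for u, v in edges:
    adj.setdefault(u, []).append(v)
    adj.setdefault(v, []).append(u)

print(dava(adj, "I", 2), len(reach(adj, "I", dava(adj, "I", 2))))
print(dava_fast(adj, "I", 2), len(reach(adj, "I", dava_fast(adj, "I", 2))))
```

On this toy graph both variants pick `a` first. Once `a` is gone, `b` becomes the only route to `c` and `d`, which the rebuilt tree notices and the one-shot tree does not, so the two variants end with 3 and 4 infected nodes respectively. The example is constructed to expose the difference and says nothing about how often it matters on real networks.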

22
Extending to SIR model
See the paper.

23
Outline
- Motivation
- Problem Definition
- Complexity
- Our Proposed Methods
- Experiments
- Conclusion

24
Experiments
Virus propagation models: IC and SIR.
Settings (see more settings in the paper): initial infected nodes are chosen uniformly at random.
Baseline algorithms:
- RANDOM: healthy nodes chosen uniformly at random
- DEGREE: choose nodes with top weighted degrees
- PAGERANK: choose nodes with top PageRank scores
- NETSHIELD: state-of-the-art pre-emptive immunization algorithm that minimizes the epidemic threshold of the graph [Tong+ ICDM 2010]; assumes no data is given before the epidemic starts

25
Experiments: datasets
Datasets are chosen from different domains.
Social media (IC model):
- OREGON: AS router graph
- STANFORD: hyperlink network
- GNUTELLA: peer-to-peer network
- BRIGHTKITE: friendship network
Epidemiology (SIR model):
- PORTLAND and MIAMI: large urban social-contact graphs used in national smallpox modeling studies [Eubank+, 2004]

Dataset      |V|            |E|
OREGON       633            2,172
STANFORD     8,929          53,829
GNUTELLA     10,876         39,994
BRIGHTKITE   58,[...]       21,[...]
PORTLAND     [...] million  [...] million
MIAMI        0.6 million    2.1 million

26
Experiments: Quality
- Plots (higher is better): GNUTELLA (IC model) and PORTLAND (SIR model).
- DAVA consistently outperforms the baseline algorithms. Further, DAVA-fast performs almost as well as DAVA.
- See more results in the paper.

27
Experiments: Scalability
- Plot: running time (sec.); lower is better.
- Some runs are marked "did not finish within 10 hours".

28
Outline
- Motivation
- Problem Definition
- Complexity
- Our Proposed Methods
- Experiments
- Conclusion

29
Conclusion
- Data-Aware Vaccination problem. Given: graph and infected nodes. Find: ‘best’ nodes for immunization.
- Complexity: NP-hard; hard to approximate within an absolute error.
- DAVA-tree: optimal solution on the tree.
- DAVA and DAVA-fast: merge infected nodes, build a dominator tree, and run DAVA-tree.
- Running time: subquadratic. DAVA: O(k(|E| + |V| log |V|)); DAVA-fast: O(|E| + |V| log |V|).
- Figure: graph with infected nodes, merged graph, dominator tree.

30
Any Questions?
Code at: [...]
Thanks for the support of NSF (Grant No. IIS [...]).
Yao Zhang, B. Aditya Prakash
Figure: graph with infected nodes, merged graph, dominator tree.
