1
**DAVA: Distributing Vaccines over Networks under Prior Information**

Yao Zhang, B. Aditya Prakash
Department of Computer Science, Virginia Tech
SDM, Philadelphia, April 24, 2014

Zhang and Prakash, SDM 2014

2
**Motivation: Epidemiology**

- Virus spreads over contact networks
- SIR model [Anderson+ 1991]: Susceptible-Infectious-Recovered
  - Weights pij: propagation probability from i to j
  - Recovery probability δ for each node
  - Models mumps-like infections

3
**Motivation: Social Media**

- Meme/rumor spreads over friendship networks, e.g., the Twitter following network
- Independent cascade (IC) model [Kempe+ KDD 2003]
  - Each node has only one chance to infect its neighbors
  - Special case of the SIR model
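To make the relation between the two models concrete, here is a minimal simulation sketch. It assumes a discrete-time SIR process in which every infectious node tries each susceptible neighbor once per step and then recovers with probability δ; under that assumption, δ = 1 gives every node exactly one chance, which is the IC model. The function name `simulate_sir`, the dict-of-dicts graph layout, and the `removed` parameter are illustrative choices, not taken from the slides.

```python
import random


def simulate_sir(graph, seeds, delta, removed=(), rng=None):
    """Run one discrete-time SIR cascade; return every node that was ever infected.

    graph:   dict mapping node -> {neighbor: p_ij}
    delta:   per-step recovery probability (delta = 1 reduces to the IC model)
    removed: vaccinated nodes, treated as deleted from the graph
    """
    if not 0 < delta <= 1:
        raise ValueError("delta must be in (0, 1]")
    rng = rng or random.Random()
    removed = set(removed)
    infectious = set(seeds) - removed
    ever_infected = set(infectious)
    while infectious:
        newly_infected = set()
        for i in infectious:
            for j, p in graph.get(i, {}).items():
                if j in ever_infected or j in removed:
                    continue  # only susceptible, unvaccinated nodes can be infected
                if rng.random() < p:
                    newly_infected.add(j)
        ever_infected |= newly_infected
        # Each infectious node recovers independently with probability delta.
        recovered = {i for i in infectious if rng.random() < delta}
        infectious = (infectious - recovered) | newly_infected
    return ever_infected
```

Averaging `len(simulate_sir(...))` over many runs gives a Monte Carlo estimate of the expected number of infected nodes for a given choice of vaccinated nodes.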

4
**Immunization**

- The Centers for Disease Control (CDC) cares about containing epidemic diseases, e.g., ~400 million dollars were used for vaccines for children in 2013
- Twitter tries to stop rumor spread, e.g., rumors about victims after the Boston Marathon bombing in 2013
- How to choose the best nodes to vaccinate (remove)?

5
**Immunization**

Pre-emptive immunization (choose nodes before the epidemic starts):

- Acquaintance strategy [Cohen+ 2003]: pick a random person, immunize one of its neighbors at random
- Netshield [Tong+ 2010]: minimize the epidemic threshold (the point when the virus takes off)

Good for baseline strategies.
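The acquaintance strategy is simple enough to sketch directly. The version below follows the one-line description above (random person, then a random neighbor of that person) and repeats until k distinct nodes are chosen; the function name, the adjacency-list layout, and the cap on k are assumptions made for this illustration.

```python
import random


def acquaintance_immunization(graph, k, rng=None):
    """Pick k nodes by repeatedly choosing a random person, then a random neighbor.

    graph: dict mapping node -> list of neighbors
    """
    rng = rng or random.Random()
    people = [v for v in graph if graph[v]]            # nodes with at least one neighbor
    candidates = {u for v in people for u in graph[v]}  # nodes that can ever be picked
    k = min(k, len(candidates))                         # avoid looping forever
    chosen = set()
    while len(chosen) < k:
        person = rng.choice(people)
        chosen.add(rng.choice(graph[person]))
    return chosen
```

Because a random neighbor is more likely to be a high-degree node than a random person is, this tends to pick hubs without requiring any global knowledge of the graph.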

6
**In reality**

Pre-emptive immunization (choose nodes before the epidemic starts):

- Acquaintance strategy [Cohen+ 2003]
- Netshield [Tong+ 2010]

In reality, the epidemic has typically already started!

- More realistic intervention: which nodes to vaccinate now?
- We call it Data-Aware Immunization (this paper)

7
**Outline**

- Motivation
- Problem Definition
- Complexity
- Our Proposed Methods
- Experiments
- Conclusion

8
**Data-Aware Vaccination Problem**

Problem: Given a set of infected nodes and a contact graph, how should k vaccines be distributed (node removal) to minimize the expected number of infected nodes at the end of the epidemic?

Example (figure: contact graph with healthy nodes A to F, pij = 1 for all edges), with 1 vaccine:

- Remove A, save {A, D} (best solution)
- Remove B, save {B}
- Remove C, save {C}
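When pij = 1 on every edge, the objective becomes a reachability count, so small instances can be solved by brute force. The sketch below does that on a hypothetical five-node graph chosen only to reproduce the "remove A, save {A, D}" outcome; it is not the graph from the slide's figure, and the function names are illustrative.

```python
from itertools import combinations


def reachable(graph, sources, removed):
    """Nodes reachable from the sources once the removed nodes are deleted."""
    seen = {s for s in sources if s not in removed}
    stack = list(seen)
    while stack:
        u = stack.pop()
        for v in graph.get(u, ()):
            if v not in seen and v not in removed:
                seen.add(v)
                stack.append(v)
    return seen


def best_vaccination(graph, infected, k):
    """Try every k-subset of healthy nodes; return (best set, number of saved nodes)."""
    healthy = [v for v in graph if v not in infected]
    baseline = len(reachable(graph, infected, set()))
    best_set, best_saved = set(), -1
    for cand in combinations(healthy, k):
        saved = baseline - len(reachable(graph, infected, set(cand)))
        if saved > best_saved:
            best_set, best_saved = set(cand), saved
    return best_set, best_saved


# Hypothetical example: X is infected, all edges transmit with probability 1.
example = {"X": ["A", "B", "C"], "A": ["X", "D"], "B": ["X"], "C": ["X"], "D": ["A"]}
```

The search is exponential in k, which is in line with the hardness result on the complexity slide and motivates the heuristics that follow.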

9
**Outline**

- Motivation
- Problem Definition
- Complexity
- Our Proposed Methods
- Experiments
- Conclusion

10
**Complexity of DAV**

- NP-hard
  - Reduction from the Maximum K-Intersection problem (MaxKI: maximizing the intersection of k subsets)
  - MaxKI is NP-complete [Vinterbo 2004]
- Approximation algorithm?
  - Not submodular
  - Actually, DAV is hard to approximate within an absolute error!

See the paper for details.

11
**Outline**

- Motivation
- Problem Definition
- Complexity
- Our Proposed Methods (assume IC model and undirected graph)
- Experiments
- Conclusion

12
**1: Simplify - Merging infected nodes**

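Idea: merge all the infected nodes into a single "super infected" node I.

(Figure: the original graph and the equivalent merged graph with super node I.) Healthy nodes A and C keep their edge probabilities pA and pC. Healthy node B, which is reached with probabilities pX and pY in the original graph, gets a single edge from I whose probability is the logical-OR of the two: pB = 1 - (1 - pX)(1 - pY).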
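The merge step can be sketched in a few lines. The code assumes an undirected graph stored as a symmetric dict of dicts and a super-node label `"I"` that does not clash with an existing node; both are choices made for this illustration.

```python
def merge_infected(graph, infected, super_node="I"):
    """Collapse all infected nodes into one super node.

    graph: dict mapping node -> {neighbor: p_ij}, symmetric (undirected).
    Parallel edges into the super node are combined with a logical-OR:
    p = 1 - (1 - p1)(1 - p2)...
    """
    infected = set(infected)
    merged = {super_node: {}}
    for u, nbrs in graph.items():
        if u in infected:
            continue  # infected nodes disappear into the super node
        merged.setdefault(u, {})
        for v, p in nbrs.items():
            if v in infected:
                prev = merged[u].get(super_node, 0.0)
                combined = 1 - (1 - prev) * (1 - p)
                merged[u][super_node] = combined
                merged[super_node][u] = combined
            else:
                merged[u][v] = p
    return merged
```

Edges between two infected nodes are dropped, since they cannot change the outcome for any healthy node.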

13
**2: DAVA-Tree Algorithm: Idea**

Select nodes with the largest "benefit".

- Benefit of a set S: the expected number of saved nodes after removing S from graph G
- Benefit of adding an additional node j into S: the additional number of saved nodes when j is added to S

(Figure: tree rooted at the merged infected node, pij = 1 for all edges; candidate nodes are labeled with benefits 5, 4, and 2.)

14
**DAVA-Tree Alg.: Optimal on Trees**

For any set S (figure: tree rooted at the merged infected node I, pij = 1 for all edges; I's neighbors have benefits 2, 5, and 4):

- Fact 1: the chosen nodes in the optimal set must be neighbors of the infected node I
- Fact 2: the benefit of each such node is independent of the rest of the set S

DAVA-tree algorithm: select the top-k nodes from I's neighbors with the maximum benefit. It runs in linear time.
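A minimal sketch of DAVA-tree under the IC model follows. It relies on the observation that, on a tree, a node is infected exactly when every edge on its unique path from I transmits, so the benefit of a neighbor v of I is the sum of those path probabilities over v's subtree. That reading of "benefit", the `children` layout, and the function name are assumptions of this sketch rather than the paper's code.

```python
import heapq


def dava_tree(children, root, k):
    """Select the k neighbors of the root with the largest benefit.

    children: dict mapping node -> list of (child, p) pairs, tree rooted at `root`
    Returns (chosen nodes in decreasing benefit order, benefit of every root neighbor).
    """
    benefits = {}
    for v, p in children.get(root, []):
        total = 0.0
        stack = [(v, p)]  # (node, probability that the infection reaches it)
        while stack:
            u, reach = stack.pop()
            total += reach
            for w, q in children.get(u, []):
                stack.append((w, reach * q))
        benefits[v] = total
    chosen = heapq.nlargest(k, benefits, key=benefits.get)
    return chosen, benefits
```

Every node is visited once, which matches the linear-time claim; with pij = 1 the benefit is simply the subtree size, as in the slide's figure.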

15
**3: General Case – Arbitrary Graphs**

Idea:

- We have the optimal algorithm for a tree
- Extract a spanning tree, then run DAVA-tree

What kind of tree? Minimum spanning tree?

(Figure, pij = 1 for all edges: the solution that is optimal on the MST by DAVA-tree, shown next to the optimal solution.)

16
**3: General Case – Arbitrary Graphs**

Idea:

- We have the optimal algorithm for a tree
- Build a spanning tree first

What kind of tree? Minimum spanning tree? We propose to use the dominator tree (from software engineering).

- u dominates v: every path from I to v contains u
- Example (figure, pij = 1 for all edges): node 4 dominates nodes 8, 9, 10, 11
- Dom captures the "closeness" of nodes to the infectious nodes, and the importance of saving nodes

17
**Dominator Tree**

- u is the immediate dominator of v: u dominates v AND every other dominator of v dominates u
- Dominator tree: add an edge between every such u and v
- Linear time [Buchsbaum, Tarjan 1998]

(Figure, pij = 1 for all edges: the merged graph and its dominator tree, marking the solution that is optimal from DAVA-tree and the optimal solution.)

- Fact 1: the optimal solution should be among the children of root I in the dominator tree for any arbitrary graph
- Fact 2: (for the special case k = 1, p = 1) running DAVA-tree on the dominator tree gives the optimal solution
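The definitions above translate into a short, deliberately naive computation. The sketch uses the classic iterative set-intersection method, not the linear-time algorithm cited on the slide, and treats an undirected edge as two directed ones; the function name and adjacency-list layout are illustrative.

```python
def immediate_dominators(graph, root):
    """Return {node: immediate dominator} for every node reachable from root.

    graph: dict mapping node -> list of neighbors (list both directions if undirected)
    """
    # Collect reachable nodes; the root is always first in `order`.
    order, seen, stack = [], {root}, [root]
    while stack:
        u = stack.pop()
        order.append(u)
        for v in graph.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    preds = {v: [] for v in order}
    for u in order:
        for v in graph.get(u, ()):
            preds[v].append(u)
    # dom[v] = {v} plus the intersection of dom over v's predecessors, to a fixed point.
    dom = {v: set(order) for v in order}
    dom[root] = {root}
    changed = True
    while changed:
        changed = False
        for v in order[1:]:
            new = set.intersection(*(dom[u] for u in preds[v])) | {v}
            if new != dom[v]:
                dom[v], changed = new, True
    # The strict dominators of v form a chain; the deepest one is the immediate dominator.
    return {v: max(dom[v] - {v}, key=lambda u: len(dom[u])) for v in order[1:]}
```

Adding an edge from each immediate dominator to its node yields the dominator tree; a node with two independent routes from I ends up directly under I, which is why Fact 1 restricts candidates to I's children.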

18
**Weighting the dominator tree**

- #P-complete
- Our solution: maximum propagation path probability between nodes I and v (using Dijkstra's algorithm)

(Figure: merged graph with edge probabilities p1, p3, p6 and the dominator tree with weights w1, w3, w6.)
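The maximum propagation path probability can be found with Dijkstra's algorithm by maximizing a product of probabilities instead of minimizing a sum of lengths. The sketch below shows only that step; how the resulting values are turned into the dominator-tree weights w is described in the paper and not reproduced here. The function name and graph layout are assumptions.

```python
import heapq
from itertools import count


def max_path_probability(graph, source):
    """For every reachable node, the probability of the most likely path from source.

    graph: dict mapping node -> {neighbor: p_ij} with 0 <= p_ij <= 1
    """
    best = {source: 1.0}
    tie = count()  # tie-breaker so the heap never has to compare node labels
    heap = [(-1.0, next(tie), source)]
    settled = set()
    while heap:
        neg_p, _, u = heapq.heappop(heap)
        if u in settled:
            continue
        settled.add(u)
        for v, p in graph.get(u, {}).items():
            cand = -neg_p * p
            if v not in settled and cand > best.get(v, 0.0):
                best[v] = cand
                heapq.heappush(heap, (-cand, next(tie), v))
    return best
```

The greedy order is valid because multiplying by a probability never increases a path's value, which plays the role of non-negative edge lengths in the usual formulation.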

19
**DAVA algorithm**

Steps:

1. T = Build a dominator tree
2. v = Run DAVA-tree on T with budget = 1
3. Remove v from G
4. Go to Step 1 until |S| = k

(Figure, pij = 1 for all edges: merged graph and its dominator tree for an example with |S| = 2; iteration 1, not finished.)

20
**DAVA algorithm**

Steps:

1. T = Build a dominator tree
2. v = Run DAVA-tree on T with budget = 1
3. Remove v from G (remove the selected node)
4. Go to Step 1 until |S| = k

Running time: O(k(|E| + |V| log|V|)). Too slow for large networks!

(Figure: merged graph and dominator tree for the |S| = 2 example, showing iteration 1 and iteration 2.)

21
**DAVA-fast: a faster algorithm**

Steps:

1. T = Build a dominator tree
2. S = Run DAVA-tree on T with budget = k

- In practice, the performance of DAVA-fast is very close to DAVA
- Time complexity: subquadratic! DAVA-fast: O(|V| log|V| + |E|)

(Figure: merged graph and dominator tree for the |S| = 2 example.)
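The difference between DAVA and DAVA-fast is easiest to see side by side. The sketch below restricts itself to the special case pij = 1, where the benefit of a child of I in the dominator tree is its subtree size, and uses a naive dominator computation instead of the linear-time one. All function names and the example graph are hypothetical.

```python
def dominator_children(graph, root, removed=frozenset()):
    """Children lists of the dominator tree of `graph` minus `removed`, rooted at root."""
    order, seen, stack = [], {root}, [root]
    while stack:
        u = stack.pop()
        order.append(u)
        for v in graph.get(u, ()):
            if v not in seen and v not in removed:
                seen.add(v)
                stack.append(v)
    preds = {v: [] for v in order}
    for u in order:
        for v in graph.get(u, ()):
            if v in preds:
                preds[v].append(u)
    dom = {v: set(order) for v in order}
    dom[root] = {root}
    changed = True
    while changed:  # naive fixed-point iteration, not the linear-time algorithm
        changed = False
        for v in order[1:]:
            new = set.intersection(*(dom[u] for u in preds[v])) | {v}
            if new != dom[v]:
                dom[v], changed = new, True
    children = {v: [] for v in order}
    for v in order[1:]:
        idom = max(dom[v] - {v}, key=lambda u: len(dom[u]))
        children[idom].append(v)
    return children


def subtree_sizes(children, root):
    sizes = {}

    def visit(u):
        sizes[u] = 1 + sum(visit(c) for c in children[u])
        return sizes[u]

    visit(root)
    return sizes


def dava_fast(graph, root, k):
    """One dominator tree, then take the top-k children of the root."""
    children = dominator_children(graph, root)
    sizes = subtree_sizes(children, root)
    return sorted(children[root], key=sizes.get, reverse=True)[:k]


def dava(graph, root, k):
    """Rebuild the dominator tree after every single removal."""
    chosen = []
    for _ in range(k):
        children = dominator_children(graph, root, removed=set(chosen))
        if not children[root]:
            break
        sizes = subtree_sizes(children, root)
        chosen.append(max(children[root], key=sizes.get))
    return chosen


# Hypothetical graph: c is reachable through both a and b until a is removed.
example = {
    "I": ["a", "b", "d"],
    "a": ["I", "a1", "a2", "a3", "a4", "c"],
    "b": ["I", "c"], "c": ["a", "b", "c1", "c2"], "d": ["I", "d1"],
    "a1": ["a"], "a2": ["a"], "a3": ["a"], "a4": ["a"],
    "c1": ["c"], "c2": ["c"], "d1": ["d"],
}
```

On this example with k = 2, `dava_fast` returns a and c, while `dava` returns a and b: once a is removed, b dominates c, so removing b also saves c's subtree. This illustrates why rebuilding the tree can help, at k times the cost.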

22
**Extending to SIR model**

See the paper.

23
**Outline**

- Motivation
- Problem Definition
- Complexity
- Our Proposed Methods
- Experiments
- Conclusion

24
**Experiments**

Virus propagation models: IC and SIR

Settings (see more settings in the paper):

- Initial infected nodes are chosen uniformly at random

Baseline algorithms:

- RANDOM: choose healthy nodes uniformly at random
- DEGREE: choose nodes with the top weighted degrees
- PAGERANK: choose nodes with the top PageRank scores
- NETSHIELD: state-of-the-art pre-emptive immunization algorithm that minimizes the epidemic threshold of the graph [Tong+ ICDM 2010]; assumes no data is given before the epidemic starts
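As a point of reference, the DEGREE baseline fits in a few lines. The sketch assumes that "weighted degree" means the sum of the propagation probabilities on a node's incident edges and that, like RANDOM, the baseline only considers healthy nodes; neither detail is spelled out on the slide.

```python
import heapq


def degree_baseline(graph, infected, k):
    """Pick the k healthy nodes with the largest weighted degree.

    graph: dict mapping node -> {neighbor: p_ij}
    Weighted degree is taken to be the sum of incident edge probabilities (assumption).
    """
    infected = set(infected)
    scores = {v: sum(nbrs.values()) for v, nbrs in graph.items() if v not in infected}
    return heapq.nlargest(k, scores, key=scores.get)
```

Unlike DAVA, this score ignores where the infected nodes are, which is the gap the data-aware methods aim to close.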

25
**Experiments: datasets**

Datasets are chosen from different domains:

- Social media (IC model)
  - OREGON: AS router graph
  - STANFORD: hyperlink network
  - GNUTELLA: peer-to-peer network
  - BRIGHTKITE: friendship network
- Epidemiology (SIR model)
  - PORTLAND and MIAMI: large urban social-contact graphs used in national smallpox modeling studies [Eubank+, 2004]

| Dataset | \|V\| | \|E\| |
|---|---|---|
| OREGON | 633 | 2,172 |
| STANFORD | 8,929 | 53,829 |
| GNUTELLA | 10,876 | 39,994 |
| BRIGHTKITE | 58,228 | 214,078 |
| PORTLAND | 0.5 million | 1.6 million |
| MIAMI | 0.6 million | 2.1 million |

26
**Experiments: Quality**

(Figures: GNUTELLA (IC model) and PORTLAND (SIR model); higher is better.)

DAVA consistently outperforms the baseline algorithms. Further, DAVA-fast performs almost as well as DAVA. (See more results in the paper.)

27
**Experiments: Scalability**

(Figure: running time in seconds; lower is better. Annotation in the plot: did not finish within 10 hours.)

28
**Outline**

- Motivation
- Problem Definition
- Complexity
- Our Proposed Methods
- Experiments
- Conclusion

29
**Conclusion**

- Data-Aware Vaccination problem
  - Given: graph and infected nodes
  - Find: "best" nodes for immunization
- Complexity
  - NP-hard
  - Hard to approximate within an absolute error
- DAVA-tree
  - Optimal solution on the tree
- DAVA and DAVA-fast
  - Merging infected nodes
  - Build a dominator tree, and run DAVA-tree
  - Running time: subquadratic
    - DAVA: O(k(|E| + |V| log|V|))
    - DAVA-fast: O(|E| + |V| log|V|)

(Figure: graph with infected nodes, merged graph, dominator tree.)

30
**Any Questions?**

Code at: http://people.cs.vt.edu/~yaozhang

Yao Zhang, B. Aditya Prakash

Thanks for the support of NSF (Grant No. IIS ).

(Figure: graph with infected nodes, merged graph, dominator tree.)
