1 Controlling Propagation at Group Scale on Networks
Yao Zhang*, Abhijin Adiga+, Anil Vullikanti+*, and B. Aditya Prakash*
*Department of Computer Science, +NDSSL, Virginia Bioinformatics Institute, Virginia Tech
ICDM, Atlantic City, November 17th, 2015

2 Outline
• Motivation
• Problem Formulation
• Our Proposed Methods
• Experiments
• Conclusion
ZAVP, ICDM 2015

3 Propagation over networks
• Epidemiology: disease (e.g., flu) spreads over contact networks [2014 Week 51 flu spread in US, from CDC]
• Social media: information (e.g., memes) spreads over friendship networks [from forbes.com]

4 Immunization
• Epidemiology (flu)
  - Centers for Disease Control (CDC)
  - Contain epidemic diseases
• Social media (memes)
  - Facebook, Twitter, ...
  - How to stop rumor spread
Immunization problem: How to control propagation over networks?

5 Immunization: two interventions
Two popular interventions:
• Vaccination: remove node
• Quarantining: remove edge
We do both vaccination and quarantining!

6 Background: individual-based immunization
Problem: find the best nodes/edges to remove to control propagation over networks
Popular individual-based immunization strategies:
• For threshold models [Khalil+ KDD2014]
  - E.g., LT model
• For cascade-style models [Tong+ CIKM2012, Tong+ ICDM2010]
  - E.g., SIR/SIS/IC model
Example: which node to remove?

7 In reality
• Sometimes individual immunization cannot easily be turned into implementable policies
  - E.g., it is hard to ensure that specific individuals take the appropriate vaccine

8 In reality
• Sometimes individual immunization cannot easily be turned into implementable policies
  - E.g., it is hard to ensure that specific individuals take the appropriate vaccine
• Observation: groups naturally exist in the underlying networks
  - People can be grouped by age, demographics, occupation, …
  - Friends are grouped by shared interests, geolocation, …
Note: groups need NOT be just link-based communities
[Figures: Occupation Groups; Geolocation Groups]

9 Immunization at group scale
• More realistic:
  - Epidemiology: CDC distributes flu vaccines based on demographics, locations, ...
  - Social media: easier to put a warning bulletin on group pages
• Cheaper
  - Expensive to target individuals
Hence, we study: How to select groups to control propagation over networks?

10 Outline
• Motivation
• Problem Formulation
• Our Proposed Methods
• Experiments
• Conclusion

11 Problem Formulation
How to formulate the problem (wish list):
• Aim 1: usefulness
  - Model the process of group immunization
• Aim 2: consistency
  - Generalize individual immunization to group immunization

12 Aim 1: process of group immunization
• Idea:
  - Distribute vaccines to groups
  - Randomly vaccinate/quarantine within groups
This simulates the vaccine distribution process in real life:
• A decision maker (e.g., CDC) gives vaccines to groups (school, community, plant, …)
• People voluntarily take vaccines

13 Group Immunization: how to do it
• Idea:
  - Distribute vaccines to groups
  - Randomly vaccinate/quarantine within groups
Example: vaccination (node removal)
• Distribute vaccines (budget: 3): one group gets one, another gets two, a third gets zero
• Randomly remove nodes within each group, which yields all possible worlds
Quarantining (edge removal) process is similar
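
The two-step process on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `sample_possible_world`, the group names, and the dictionary layout are hypothetical, and the sketch assumes nodes within a group are vaccinated uniformly at random without replacement.

```python
import random

def sample_possible_world(nodes_by_group, allocation, rng=None):
    """Return the set of nodes removed in one random "possible world".

    nodes_by_group: dict mapping group id -> list of node ids (a partition).
    allocation: dict mapping group id -> number of vaccines given to that group.
    """
    rng = rng or random.Random()
    removed = set()
    for group, members in nodes_by_group.items():
        # A group cannot use more vaccines than it has members.
        k = min(allocation.get(group, 0), len(members))
        # Uniform choice without replacement inside the group.
        removed.update(rng.sample(members, k))
    return removed

# Hypothetical groups; budget of 3 split as one / two / zero, as in the example.
groups = {"school": [0, 1, 2], "community": [3, 4, 5, 6], "plant": [7, 8]}
x = {"school": 1, "community": 2, "plant": 0}
world = sample_possible_world(groups, x, random.Random(0))
```

Averaging a quality measure over many such sampled worlds gives a Monte Carlo estimate of the expected quality of an allocation.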

14 Aim 2: from individual to group immunization
Which metrics measure the quality of immunization?
• For threshold models
  - Metric: epidemic size (min.)
  - E.g., LT model
• For cascade-style models
  - Metric: spectral radius (min.)
  - E.g., SIS/SIR/IC model
We do both for group immunization!
Quality is the expected quality over all possible worlds

15 Background: threshold-based model
Rumor spreading

16 Problem 1: edge deletion under LT model
Given: graph G(V,E), partition of node set C, infected node set A, budget of m vaccines
Find: the best allocation of vaccines to groups
Such that: the final expected epidemic size is minimized after removing edges within groups
Formally: choose the allocation vector over groups that minimizes the quality function, the expected number of infected nodes

17 Problem 2: node deletion under LT model
Given: graph G(V,E), partition of node set C, infected node set A, budget of m vaccines
Find: the best allocation of vaccines to groups
Such that: the final expected epidemic size is minimized after removing nodes within groups
Formally: choose the allocation vector over groups that minimizes the quality function, the expected number of infected nodes
Example: how to allocate three vaccines? Distribute vaccines among groups: one, two, zero
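
To make the quality function concrete, here is a minimal Monte Carlo sketch of the expected epidemic size under the standard linear threshold (LT) model: each node draws a threshold uniformly at random and becomes infected once the total weight of its infected in-neighbors reaches that threshold. The function names, the `in_weights` layout, and the toy chain graph are assumptions made for illustration, not the authors' implementation.

```python
import random

def simulate_lt(in_weights, seeds, removed=frozenset(), rng=None):
    """One run of the linear threshold model; returns the set of infected nodes.

    in_weights: dict node -> {in-neighbor: weight}, weights per node summing to <= 1.
    seeds: initially infected nodes (the set A). removed: vaccinated nodes.
    """
    rng = rng or random.Random()
    threshold = {v: rng.random() for v in in_weights}
    infected = {s for s in seeds if s not in removed}
    changed = True
    while changed:
        changed = False
        for v, nbrs in in_weights.items():
            if v in infected or v in removed:
                continue
            # Total influence from currently infected in-neighbors.
            pressure = sum(w for u, w in nbrs.items() if u in infected)
            if pressure > 0 and pressure >= threshold[v]:
                infected.add(v)
                changed = True
    return infected

def expected_epidemic_size(in_weights, seeds, removed=frozenset(), runs=200, seed=0):
    """Average number of infected nodes over `runs` independent simulations."""
    rng = random.Random(seed)
    total = sum(len(simulate_lt(in_weights, seeds, removed, rng)) for _ in range(runs))
    return total / runs

# Toy directed chain 0 -> 1 -> 2 with weight 1 on each edge.
chain = {0: {}, 1: {0: 1.0}, 2: {1: 1.0}}
size_before = expected_epidemic_size(chain, {0})
size_after = expected_epidemic_size(chain, {0}, removed={1})
```

Here removing node 1 cuts the chain, so the infection stays at the seed. For group immunization the removed set would itself be random, drawn within groups according to the allocation.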

18 Background: cascade-style model
• Epidemic threshold: spectral radius
  - The largest eigenvalue λ1 of the adjacency matrix of a network
  - Connects to the reproduction number in epidemiology
  - Determines the phase transition ('epidemic threshold') between epidemic/non-epidemic regimes
• Cascade-style: SIR/SIS/IC model
λ1 is the epidemic threshold [Prakash+, ICDM 2011]
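
As a small illustration of the quantity involved, λ1 of an undirected graph can be estimated with power iteration. The sketch below is pure Python and uses a hypothetical adjacency-set layout; it iterates with A + I rather than A, an implementation choice made here so that the iteration also converges on bipartite graphs.

```python
def spectral_radius(adj, iterations=1000, tol=1e-10):
    """Estimate the largest eigenvalue of an undirected graph's adjacency matrix.

    adj: dict node -> set of neighbors (each edge listed in both directions).
    """
    nodes = list(adj)
    if not nodes:
        return 0.0
    vec = {v: 1.0 for v in nodes}
    lam = 0.0
    for _ in range(iterations):
        # Multiply by (A + I); the shift by I avoids oscillation on bipartite graphs.
        new = {v: vec[v] + sum(vec[u] for u in adj[v]) for v in nodes}
        norm = max(new.values())
        vec = {v: val / norm for v, val in new.items()}
        # Since max(vec) == 1 before multiplying, norm approximates lambda_1 + 1.
        if abs((norm - 1.0) - lam) < tol:
            lam = norm - 1.0
            break
        lam = norm - 1.0
    return lam

triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
single_edge = {0: {1}, 1: {0}}  # the triangle after removing node 2
drop = spectral_radius(triangle) - spectral_radius(single_edge)
```

Removing one node from the triangle lowers λ1 from 2 to 1; this eigenvalue drop is what Problems 3 and 4 aim to maximize in expectation.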

19 Problem 3: edge deletion for spectral radius
Given: graph G(V,E), partition of node set C, budget of m vaccines
Find: the best allocation of vaccines to groups
Such that: the expected drop of the first eigenvalue is maximized after removing edges within groups
Formally: choose the allocation vector over groups that maximizes the quality function, the expected drop of the eigenvalue

20 Problem 4: node deletion for spectral radius
Given: graph G(V,E), partition of node set C, budget of m vaccines
Find: the best allocation of vaccines to groups
Such that: the expected drop of the first eigenvalue is maximized after removing nodes within groups
Formally: choose the allocation vector over groups that maximizes the quality function, the expected drop of the eigenvalue (eigendrop) over all possible worlds

21 Hardness of our problems
• Individual-based vs. group-based immunization
  - If each node forms its own group, our problems reduce exactly to individual-based immunization problems:
    P1 and P2 reduce to [Khalil+ KDD2014]
    P3 and P4 reduce to [Tong+ CIKM2012, Tong+ ICDM2010]
  - Our problem: even harder
All are NP-hard problems

22 Outline
• Motivation
• Problem Definition
• Our Proposed Methods
  - Problem 1 and 2 (LT model/epidemic size)
  - Problem 3 and 4 (Cascade style/spectral radius)
• Experiments
• Conclusion

23 Prob. 1: edge removal under LT model
• Formally, our problem is to minimize the expected number of infected nodes after vaccines are allocated
• We rewrite it in terms of f(x), the expected number of nodes SAVED after vaccines are allocated according to x
• Hence we want to maximize f(x)
• Note:
  - x is a vector
  - f(x) is not a function over sets, but a function over the integer lattice

24 Main idea: Diminishing returns over lattices
• Result 1: we prove that f(x) has the following three properties:
  - P1: ... and ...
  - P2: (non-decreasing)
  - P3: (diminishing returns) if ..., then ...
• Greedy algorithm: Greedy-LT
  - Each time, give one vaccine to the group i with the maximum marginal gain
• Result 2: we prove that our algorithm provides a (1-1/e)-approximation
See paper for details
Note: having the diminishing-returns property is not equivalent to submodularity over the integer lattice
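
The greedy rule on this slide can be sketched generically as follows. This is an illustration, not the authors' Greedy-LT code: `saved` is a placeholder for any estimator of f(x) (for example, an average over LT simulations), `capacity` is an assumed per-group upper bound, and the toy objective at the bottom is made up so that the example has diminishing returns.

```python
def greedy_allocation(groups, capacity, budget, saved):
    """Hand out `budget` vaccines one at a time, each to the group with the
    largest marginal gain in `saved`.

    saved: callable mapping an allocation dict {group: count} to a number.
    capacity: dict group -> maximum number of vaccines the group can absorb.
    """
    x = {g: 0 for g in groups}
    current = saved(x)
    for _ in range(budget):
        best_group, best_gain = None, float("-inf")
        for g in groups:
            if x[g] >= capacity[g]:
                continue
            x[g] += 1                      # tentatively give one more vaccine
            gain = saved(x) - current      # marginal gain of that vaccine
            x[g] -= 1
            if gain > best_gain:
                best_group, best_gain = g, gain
        if best_group is None:             # every group is already full
            break
        x[best_group] += 1
        current += best_gain
    return x

# Toy objective with diminishing returns per group (weights are made up).
weights = {"a": 4.0, "b": 3.0}
toy_saved = lambda x: sum(w * (1 - 0.5 ** x[g]) for g, w in weights.items())
result = greedy_allocation(["a", "b"], {"a": 2, "b": 2}, 3, toy_saved)
```

In the toy run the first vaccine goes to group a (gain 2.0), the second to group b (gain 1.5), and the third back to group a (gain 1.0). Each greedy step evaluates `saved` once per group, which matches the m·n factor in the running time reported in the summary slide.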

25 Prob. 2: node removal under LT model
Result:
• The expected number of nodes saved after removing nodes within groups also has the three properties:
  - P1: ... and ...
  - P2: (non-decreasing)
  - P3: (diminishing returns) if ..., then ...
• Use a similar greedy algorithm with a (1-1/e)-approximation guarantee

26 Outline
• Motivation
• Problem Definition
• Our Proposed Methods
  - Problem 1 and 2 (LT model/epidemic size)
  - Problem 3 and 4 (Cascade style/spectral radius)
• Experiments
• Conclusion

27 Prob. 3: edge removal for spectral radius
• Formally, we want to maximize the expected drop of the eigenvalue
• Idea: stochastic process
  - Define the expected adjacency matrix of the graph (entries: the probability that each edge is preserved)
  - Instead of maximizing the expected drop, minimize the first eigenvalue of the expected adjacency matrix
• The allocation x that minimizes this eigenvalue can be obtained by solving a semi-definite program (SDP)
  - Approximation guarantee: a constant factor
  - Slow: running time O(|V|^4)
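
A minimal sketch of the expected adjacency matrix, under the assumption that the x_a vaccines given to group a remove x_a of its |E_a| edges uniformly at random, so that each edge in the group is preserved with probability 1 - x_a/|E_a|. The function name and the edge-list layout are hypothetical, and edges that belong to no group (which would keep weight 1) are omitted.

```python
def expected_adjacency(edges_by_group, x):
    """Map each within-group edge to the probability that it is preserved.

    edges_by_group: dict group -> list of (i, j) edges inside that group.
    x: dict group -> number of edges to remove from that group.
    """
    expected = {}
    for group, edges in edges_by_group.items():
        if not edges:
            continue
        removed = min(x.get(group, 0), len(edges))
        keep_prob = 1.0 - removed / len(edges)  # uniform removal within the group
        for edge in edges:
            expected[edge] = keep_prob
    return expected

edges_by_group = {"a": [(0, 1), (1, 2), (2, 3), (0, 3)], "b": [(4, 5)]}
edge_weights = expected_adjacency(edges_by_group, {"a": 1, "b": 1})
```

With one removal from the four edges of group a, each of those edges is kept with probability 0.75, while the single edge of group b is removed with certainty.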

28 Prob. 3: edge removal for spectral radius
Another method: matrix perturbation theory
• The expected drop of the first eigenvalue can be estimated from u (where Mu = λ·u, with entries u_i) and the proportion of edges removed in each group a
• x can be solved using Linear Programming (LP)
• Faster: O(n^4)
  - n: number of groups (much smaller than the size of the graph)
See paper for details
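
The idea behind the linear estimate can be sketched with the standard first-order perturbation result: for a symmetric adjacency matrix with leading eigenvector u, deleting edge (i, j) lowers λ1 by roughly 2·u_i·u_j / (uᵀu). Assuming edges are removed uniformly at random within a group, the expected drop is then linear in the allocation x. The code below is only a sketch under those assumptions; the exact estimate used in the paper may differ, and all names are hypothetical.

```python
def group_edge_scores(edges_by_group, u):
    """Estimated eigenvalue drop per vaccine for each group.

    edges_by_group: dict group -> list of (i, j) edges inside that group.
    u: dict node -> entry of the leading eigenvector (any scaling).
    """
    norm = sum(val * val for val in u.values())
    scores = {}
    for group, edges in edges_by_group.items():
        if edges:
            # Average first-order drop 2*u_i*u_j / (u^T u) over the group's edges.
            scores[group] = sum(2 * u[i] * u[j] for i, j in edges) / (norm * len(edges))
    return scores

def estimated_eigendrop(x, scores):
    """Linear estimate of the expected drop for allocation x."""
    return sum(scores[g] * x.get(g, 0) for g in scores)

# Made-up eigenvector entries and groups, for illustration only.
u = {0: 2.0, 1: 1.0, 2: 1.0}
edge_groups = {"a": [(0, 1), (0, 2)], "b": [(1, 2)]}
scores = group_edge_scores(edge_groups, u)
estimate = estimated_eigendrop({"a": 1, "b": 1}, scores)
```

Because the estimate is a weighted sum of the x_a, choosing x under a budget constraint becomes a linear program over n variables, one per group.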

29 Prob. 4: node removal for spectral radius
• Idea: using matrix perturbation theory
  - Similar to LP, the expected drop of the first eigenvalue can be estimated as a quadratic function of x
  - The allocation x can be obtained using Quadratic Programming (QP)
• Fast: O(n^4)
  - n: number of groups (much smaller than the size of the graph)

30 Summary of our methods
Problem                 Our method   Approx. guarantee   Running time
P1, P2 (LT model)       GreedyLT     (1-1/e)-approx.     O(mnL|V|)
P3 (spectral radius)    SDP          constant factor     O(|V|^4 polylog(|V|))
P3 (spectral radius)    LP           heuristic           O(n^4)
P4 (spectral radius)    QP           heuristic           O(n^4)
m: number of vaccines (budget); n: number of groups; V: node set; L: number of simulations for the greedy algorithm

31 Outline
• Motivation
• Problem Definition
• Our Proposed Methods
• Experiments
• Conclusion

32 Experiments: datasets
• Different domains with a range of sizes
  - SBM: Stochastic Block Model
  - PROTEIN: protein-protein interaction network
  - OREGON: Oregon AS router graph
  - YOUTUBE: friendship network
  - PORTLAND and MIAMI: epidemiology contact networks; large urban social-contact graphs used in national smallpox modeling studies [Eubank+, 2004]
Each dataset has its natural division into groups
       SBM     PROTEIN   OREGON   YOUTUBE   PORTLAND      MIAMI
|V|    1,500   2,361     10K      50K       0.5 million   0.6 million
|E|    5,000   7,182     22K      450K      1.6 million   2.1 million
Group  201331500091

33 Experiments: baselines
• Baselines
  - RANDOM: assign vaccines to groups uniformly at random
  - DEGREE: independently assign vaccines to groups based on the average degree of each group
  - EIGEN: independently assign vaccines to groups based on the average eigenscore of each group
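
One plausible reading of these baselines is that each vaccine is assigned independently to a group with probability proportional to a group-level weight: uniform weights for RANDOM, average degree for DEGREE. The sketch below implements that reading; it is an assumption about the baselines, not their documented implementation, and all names are hypothetical.

```python
import random

def average_degree(adj, nodes_by_group):
    """Average node degree per group. adj: dict node -> set of neighbors."""
    return {g: sum(len(adj[v]) for v in members) / len(members)
            for g, members in nodes_by_group.items() if members}

def allocate_by_weight(weights, budget, rng=None):
    """Assign each vaccine independently to a group, with probability
    proportional to that group's weight."""
    rng = rng or random.Random()
    groups = list(weights)
    picks = rng.choices(groups, weights=[weights[g] for g in groups], k=budget)
    x = {g: 0 for g in groups}
    for g in picks:
        x[g] += 1
    return x

# Toy star graph: node 0 is the hub; groups are made up for illustration.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
toy_groups = {"hub": [0], "leaves": [1, 2, 3]}
degree_weights = average_degree(star, toy_groups)
degree_alloc = allocate_by_weight(degree_weights, 10, random.Random(1))
random_alloc = allocate_by_weight({g: 1.0 for g in toy_groups}, 10, random.Random(1))
```

An EIGEN-style baseline would reuse `allocate_by_weight` with the average eigenvector score of each group as the weight.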

34 Results: Effectiveness (P1, P2)
[Figure: ratio of infected nodes (lower is better); panels P1/edge: YOUTUBE and P2/node: PORTLAND; annotation: 25K nodes; our method highlighted]
GREEDY-LT consistently outperforms the baseline algorithms.

35 Results: Effectiveness (P3, P4)
[Figure: ratio of EigenDrop (lower is better); panels P3/edge: PROTEIN and P4/node: PORTLAND; our methods highlighted]
SDP, LP and QP consistently outperform the baseline algorithms.

36 Results: Varying Number of Groups
[Figure: lower is better; panels P2/node: YOUTUBE and P4/node: PORTLAND; our method highlighted]
Our algorithms consistently outperform the baseline algorithms as the number of groups changes

37 Result: Case Study: age group
[Figure: vaccine distributions for P4 on realistic epidemiological networks (budget = 10000); panels PORTLAND and MIAMI]
Observations:
1. Our methods choose older people;
2. Other methods tend to distribute vaccines uniformly.
The results match the current practice in which the CDC targets vulnerable people

38 Outline
• Motivation
• Problem Definition
• Our Proposed Methods
• Experiments
• Conclusion

39 Conclusion: Group Immunization
• Problem formulations
  - Group immunization policy: select groups to distribute vaccines; randomly remove edges/nodes from groups
  - Edge deletion and node deletion
  - Minimize the epidemic size and the spectral radius
• Near-optimal algorithms
  - Greedy algorithm under LT model: (1-1/e)-approximation
  - Edge deletion for min. spectral radius: SDP (good approximation but slow), LP (fast)
  - Node deletion for min. spectral radius: QP (fast)

40 Any questions?
Code at: http://people.cs.vt.edu/~yaozhang
Yao Zhang, B. Aditya Prakash, Abhijin Adiga, Anil Vullikanti

41 Backup slides

42 Why not epidemic size for cascade models?
• We do not specifically use the IC model, primarily for two reasons:
  - The spectral radius naturally generalizes the corresponding individual-level immunization problems studied in past literature ([Tong+ ICDM2010], [Tong+ CIKM2012])
  - Using the spectral radius allows us to immediately formulate a general problem for multiple cascade-style models (like SIR/SIS/IC), so we can ignore the differences in their exact spreading processes

43 Submodular over integer lattice
• [Soma+ ICML2014] gives the condition under which a function f over the integer lattice is submodular
  - For an element s in a vector
• Different from the diminishing-returns property in our paper
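
For reference, the two notions being contrasted are usually written as below. The slide's own formula did not survive, so the notation here (coordinate-wise maximum and minimum, and the unit vector e_s for element s) is an assumption based on the standard definitions rather than a copy of the slide.

```latex
% Submodularity over the integer lattice
% (\vee and \wedge are the coordinate-wise maximum and minimum):
f(x) + f(y) \;\ge\; f(x \vee y) + f(x \wedge y)
  \quad \text{for all } x, y \in \mathbb{Z}_{\ge 0}^{n}

% Diminishing-returns property (e_s is the unit vector for element s):
f(x + e_s) - f(x) \;\ge\; f(y + e_s) - f(y)
  \quad \text{for all } x \le y \text{ and all } s
```

On 0/1 vectors (ordinary sets) the two conditions coincide. On the integer lattice the diminishing-returns property implies lattice submodularity but not the other way around, which is why the two are not equivalent.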

