Anomaly Detection and Virus Propagation in Large Graphs


1 Anomaly Detection and Virus Propagation in Large Graphs
Christos Faloutsos CMU

2 Faloutsos, Prakash, Chau, Koutra, Akoglu
Thank you! Dr. Ching-Hao (Eric) Mao Prof. Kenneth Pao Taiwan'12 Faloutsos, Prakash, Chau, Koutra, Akoglu

3 Outline
Part 1: anomaly detection (OddBall; Belief Propagation; Conclusions). Part 2: influence propagation.

4 OddBall: Spotting Anomalies in Weighted Graphs
Leman Akoglu, Mary McGlohon, Christos Faloutsos Carnegie Mellon University School of Computer Science PAKDD 2010, Hyderabad, India

5 Main idea
For each node, extract its ‘ego-net’ (= 1-step-away neighbors). Extract features (# edges, total weight, etc.). Compare with the rest of the population.

6 What is an egonet?
[Figure: an ego node and its egonet.]

7 Selected Features
Ni: number of neighbors (degree) of ego i. Ei: number of edges in egonet i. Wi: total weight of egonet i. λw,i: principal eigenvalue of the weighted adjacency matrix of egonet i.
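
The first three features above can be sketched in a few lines of Python. This is a minimal illustration, not the OddBall implementation: the function name `egonet_features`, the edge-list input format, and the toy graph are all assumptions, and the eigenvalue feature λw,i is left out. For orientation, a pure star egonet has Ei = Ni, while a full clique has Ei = Ni(Ni+1)/2.

```python
from collections import defaultdict


def egonet_features(weighted_edges):
    """Return {node: (N_i, E_i, W_i)} for an undirected weighted graph.

    weighted_edges: iterable of (u, v, weight), each undirected edge listed once.
    N_i = number of neighbors of the ego, E_i = number of edges inside the
    egonet (ego plus its 1-step neighbors), W_i = total weight of those edges.
    """
    adj = defaultdict(dict)
    for u, v, w in weighted_edges:
        adj[u][v] = w
        adj[v][u] = w

    features = {}
    for ego, nbrs in adj.items():
        members = set(nbrs) | {ego}
        directed_count = 0
        directed_weight = 0.0
        for u in members:
            for v, w in adj[u].items():
                if v in members:
                    # every undirected edge is seen twice (u->v and v->u)
                    directed_count += 1
                    directed_weight += w
        features[ego] = (len(nbrs), directed_count // 2, directed_weight / 2)
    return features


# Hypothetical toy graph: hub "c" with three leaves, plus one edge between two leaves.
toy_edges = [("c", "a", 1.0), ("c", "b", 1.0), ("c", "d", 1.0), ("a", "b", 2.0)]
toy_features = egonet_features(toy_edges)
```

Comparing each node's (Ni, Ei, Wi) against the rest of the population is then a separate outlier-scoring step, which this sketch does not attempt.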

8 Near-Clique/Star
SOME OLD RULES

9 Near-Clique/Star
SOME OLD RULES

10 Near-Clique/Star
SOME OLD RULES

11 Near-Clique/Star
Andrew Lewis (director) SOME OLD RULES

12 Outline
Part 1: anomaly detection (OddBall; Belief Propagation; Conclusions). Part 2: influence propagation.

13 E-bay Fraud detection
w/ Polo Chau & Shashank Pandit, CMU [www’07]

14 E-bay Fraud detection

15 E-bay Fraud detection

16 E-bay Fraud detection - NetProbe

17 Popular press
And less desirable attention: from ‘Belgium police’ (‘copy of your code?’)

18 Outline
OddBall (anomaly detection). Belief Propagation: Ebay fraud; Symantec malware detection; Unification results. Conclusions.

19 Polonium: Tera-Scale Graph Mining and Inference for Malware Detection
PATENT PENDING. SDM 2011, Mesa, Arizona. Polo Chau (Machine Learning Dept), Carey Nachenberg (Vice President & Fellow), Jeffrey Wilhelm (Principal Software Engineer), Adam Wright (Software Engineer), Prof. Christos Faloutsos (Computer Science Dept).

20 Polonium: The Data
60+ terabytes of data anonymously contributed by participants of the worldwide Norton Community Watch program: 50+ million machines, 900+ million executable files. Constructed a machine-file bipartite graph (0.2 TB+): 1 billion nodes (machines and files), 37 billion edges. As of today, it has grown to more than three times that size.

21 Polonium: Key Ideas
Use Belief Propagation to propagate domain knowledge in the machine-file graph to detect malware. Use “guilt-by-association” (i.e., homophily), e.g., files that appear on machines with many bad files are more likely to be bad. Scalability: handles a 37-billion-edge graph.

22 Polonium: One-Interaction Results
[Figure: results plot with the ideal point marked.] 84.9% True Positive Rate at 1% False Positive Rate, for files reported by four or more machines. True Positive Rate: % of malware correctly identified. False Positive Rate: % of non-malware wrongly labeled as malware.

23 Outline
Part 1: anomaly detection (OddBall; Belief Propagation: Ebay fraud, Symantec malware detection, Unification results; Conclusions). Part 2: influence propagation.

24 Unifying Guilt-by-Association Approaches: Theorems and Fast Algorithms
Danai Koutra Tai-You Ke U Kang Duen Horng (Polo) Chau Hsing-Kuo Kenneth Pao Christos Faloutsos Work in collaboration with National Taiwan University ECML PKDD, 5-9 September 2011, Athens, Greece

25 Problem Definition: GBA techniques
Given: a graph and a few labeled nodes. Find: the labels of the rest (assuming network effects). This is a classification problem where we assume that neighboring nodes are related. Network effects: birds of a feather flock together.

26 Homophily and Heterophily
All methods handle homophily. NOT all methods handle heterophily, BUT the proposed method does!
Homophily: connected nodes are similar (birds of a feather flock together). In Step 1 the neighbors of the green node become green and the neighbors of the red node become red. The same happens in Step 2, so the color diffuses. The intensity of green is related to the score of the node.
Heterophily: connected nodes are dissimilar. In Step 1 the nodes connected to the green node become red, and vice versa. Example of a network with heterophily: bad guys (fraudsters).

27 Are they related?
RWR (Random Walk with Restarts): Google’s PageRank (‘if my friends are important, I’m important, too’). SSL (Semi-supervised learning): minimize the differences among neighbors. BP (Belief propagation): send messages to neighbors, on what you believe about them.

28 Are they related?
YES! RWR (Random Walk with Restarts): Google’s PageRank (‘if my friends are important, I’m important, too’). SSL (Semi-supervised learning): minimize the differences among neighbors. BP (Belief propagation): send messages to neighbors, on what you believe about them.

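29 Correspondence of Methods
Each method solves a linear system of the form (matrix) × (unknown) = (known):
RWR: [I − c A D^{-1}] × x = (1 − c) y
SSL: [I + a (D − A)] × x = y
FABP: [I + a D − c′ A] × bh = φh
The unknown vector holds the final labels/beliefs, the known vector holds the prior labels/beliefs, A is the adjacency matrix, and D is the diagonal matrix of node degrees (d1, d2, d3, …).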
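
As a concrete instance of the first row, the RWR system [I − c A D^{-1}] x = (1 − c) y can be solved by simple fixed-point iteration. The sketch below is only illustrative: the function name, the dense list-of-lists matrix format, the default c = 0.85, and the 4-node path graph are assumptions, and it is not the FABP algorithm from the talk.

```python
def random_walk_with_restart(adj, prior, c=0.85, iters=200):
    """Iterate x <- c * A * D^{-1} * x + (1 - c) * y.

    For 0 <= c < 1 this converges to the solution of
    [I - c A D^{-1}] x = (1 - c) y.

    adj:   symmetric adjacency matrix (list of lists); every node needs degree > 0
    prior: the known vector y (prior labels/beliefs)
    """
    n = len(adj)
    degree = [sum(row) for row in adj]
    x = list(prior)
    for _ in range(iters):
        # one application of D^{-1}: divide each entry by its node's degree
        scaled = [x[j] / degree[j] for j in range(n)]
        x = [c * sum(adj[i][j] * scaled[j] for j in range(n)) + (1 - c) * prior[i]
             for i in range(n)]
    return x


# Hypothetical example: path graph 0-1-2-3 with all prior mass on node 0.
path_adj = [[0, 1, 0, 0],
            [1, 0, 1, 0],
            [0, 1, 0, 1],
            [0, 0, 1, 0]]
path_prior = [1.0, 0.0, 0.0, 0.0]
path_scores = random_walk_with_restart(path_adj, path_prior)
```

The scores decay with distance from the labeled node, which is the homophily ("guilt-by-association") behavior the slides describe; the node farthest from the seed gets the smallest score.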

30 Results: Scalability
[Figure: runtime (min) vs. # of edges (Kronecker graphs).] FABP is linear in the number of edges.

31 Results (5): Parallelism
[Figure: % accuracy vs. runtime (min).] FABP is ~2x faster & wins/ties on accuracy.

32 Conclusions
Anomaly detection: hand-in-hand with pattern discovery (‘anomalies’ == ‘rare patterns’). ‘OddBall’ for large graphs. ‘NetProbe’ and belief propagation: exploit network effects. FaBP: fast & accurate.

33 Outline
Part 1: anomaly detection (OddBall; Belief Propagation; Conclusions). Part 2: influence propagation.

34 Influence propagation in large graphs - theorems and algorithms
B. Aditya Prakash Christos Faloutsos Carnegie Mellon University

35 Networks are everywhere!
Facebook Network [2010]; Gene Regulatory Network [Decourty 2008]; Human Disease Network [Barabasi 2007]; The Internet [2005].

36 Dynamical Processes over networks are also everywhere!

37 Why do we care?
Information Diffusion; Viral Marketing; Epidemiology and Public Health; Cyber Security; Human mobility; Games and Virtual Worlds; Ecology; Social Collaboration.

38 Why do we care? (1: Epidemiology)
Dynamical processes over networks: diseases over contact networks. [AJPH 2007] CDC data: visualization of the first 35 tuberculosis (TB) patients and their 1039 contacts.

39 Why do we care? (1: Epidemiology)
Dynamical processes over networks [US-MEDICARE NETWORK 2005]: each circle is a hospital; ~3000 hospitals; more than 30,000 patients transferred. Problem: given k units of disinfectant, whom to immunize?

40 Why do we care? (1: Epidemiology)
[US-MEDICARE NETWORK 2005] CURRENT PRACTICE vs. OUR METHOD: ~6x fewer! Hospital-acquired infections took 99K+ lives and cost $5B+ (all per year).

41 Why do we care? (2: Online Diffusion)
> 800m users, ~$1B revenue [WSJ 2010]; ~100m active users; > 50m users.

42 Why do we care? (2: Online Diffusion)
Dynamical processes over networks: social media marketing. [Figure: a celebrity tells followers “Buy Versace™!”]

43 High Impact – Multiple Settings
Q. How to squash rumors faster? Q. How do opinions spread? Q. How to market better? Settings: epidemic out-breaks; products/viruses; transmit s/w patches.

44 Research Theme
DATA: large real-world networks & processes. ANALYSIS: understanding. POLICY/ACTION: managing.

45 In this talk
ANALYSIS (Understanding). Given propagation models, Q1: Will an epidemic happen?

46 In this talk
POLICY/ACTION (Managing). Q2: How to immunize and control out-breaks better?

47 Outline
Part 1: anomaly detection. Part 2: influence propagation: Motivation; Epidemics: what happens? (Theory); Action: Who to immunize? (Algorithms).

48 A fundamental question
Strong virus: epidemic?

49 example (static graph)
Weak virus: epidemic?

50 Problem Statement
[Figure: # infected vs. time, with one regime above (epidemic) and one below (extinction).] Find a condition under which the virus will die out exponentially quickly, regardless of the initial infection condition. Separate the regimes?

51 Threshold (static version)
Problem Statement. Given: graph G and virus specs (attack prob. etc.). Find: a condition for virus extinction/invasion.

52 Threshold: Why important?
Accelerating simulations. Forecasting (‘What-if’ scenarios). Design of contagion and/or topology. A great handle to manipulate the spreading: immunization; maximize collaboration; …

53 Outline
Motivation. Epidemics: what happens? (Theory): Background; Result (Static Graphs); Proof Ideas (Static Graphs); Bonus 1: Dynamic Graphs; Bonus 2: Competing Viruses. Action: Who to immunize? (Algorithms).

54 “SIR” model: life immunity (mumps)
Background. Each node in the graph is in one of three states: Susceptible (i.e. healthy), Infected, Removed (i.e. can’t get infected again). [Figure: example graph at t = 1, t = 2, t = 3, with transitions labeled Prob. β and Prob. δ.]
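
A discrete-time simulation of this model might look like the sketch below. It assumes the usual reading of the two probabilities (β: each infected neighbor infects a susceptible node per tick; δ: an infected node is removed per tick) and an update order in which a node may attack before it is removed. The function name and the 5-node path graph are hypothetical.

```python
import random


def simulate_sir(neighbors, initially_infected, beta, delta, steps, seed=0):
    """Discrete-time SIR on a contact network.

    neighbors: dict node -> list of neighbor nodes ('who-can-infect-whom')
    Returns a list of (num_susceptible, num_infected, num_removed),
    one entry per time tick, starting with the initial state.
    """
    rng = random.Random(seed)
    state = {v: "S" for v in neighbors}
    for v in initially_infected:
        state[v] = "I"

    def counts():
        vals = list(state.values())
        return (vals.count("S"), vals.count("I"), vals.count("R"))

    history = [counts()]
    for _ in range(steps):
        infected = [v for v, s in state.items() if s == "I"]
        newly_infected = set()
        for v in infected:
            for u in neighbors[v]:
                # each infected neighbor attacks independently with probability beta
                if state[u] == "S" and rng.random() < beta:
                    newly_infected.add(u)
        for v in infected:
            # assumption: removal is decided after the node had its chance to attack
            if rng.random() < delta:
                state[v] = "R"
        for u in newly_infected:
            state[u] = "I"
        history.append(counts())
    return history


# Hypothetical example: a 5-node path, infection starting at one end.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
sure_spread = simulate_sir(path, [0], beta=1.0, delta=1.0, steps=5)
no_spread = simulate_sir(path, [0], beta=0.0, delta=1.0, steps=3)
```

With β = δ = 1 the infection moves one hop per tick and every node ends up removed; with β = 0 only the seed node is ever affected.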

55 Terminology: continued
Background. Other virus propagation models (“VPM”): SIS: susceptible-infected-susceptible, flu-like. SIRS: temporary immunity, like pertussis. SEIR: mumps-like, with virus incubation (E = Exposed). … Underlying contact network: ‘who-can-infect-whom’.

56 Related Work
Background. All are about either: structured topologies (cliques, block-diagonals, hierarchies, random); specific virus propagation models; or static graphs.
R. M. Anderson and R. M. May. Infectious Diseases of Humans. Oxford University Press, 1991.
A. Barrat, M. Barthélemy, and A. Vespignani. Dynamical Processes on Complex Networks. Cambridge University Press, 2010.
F. M. Bass. A new product growth for model consumer durables. Management Science, 15(5):215–227, 1969.
D. Chakrabarti, Y. Wang, C. Wang, J. Leskovec, and C. Faloutsos. Epidemic thresholds in real networks. ACM TISSEC, 10(4), 2008.
D. Easley and J. Kleinberg. Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge University Press, 2010.
A. Ganesh, L. Massoulie, and D. Towsley. The effect of network topology in spread of epidemics. IEEE INFOCOM, 2005.
Y. Hayashi, M. Minoura, and J. Matsukubo. Recoverable prevalence in growing scale-free networks and the effective immunization. arXiv:cond-at/ v2, Aug
H. W. Hethcote. The mathematics of infectious diseases. SIAM Review, 42, 2000.
H. W. Hethcote and J. A. Yorke. Gonorrhea transmission dynamics and control. Springer Lecture Notes in Biomathematics, 46, 1984.
J. O. Kephart and S. R. White. Directed-graph epidemiological models of computer viruses. IEEE Computer Society Symposium on Research in Security and Privacy, 1991.
J. O. Kephart and S. R. White. Measuring and modeling computer virus prevalence. IEEE Computer Society Symposium on Research in Security and Privacy, 1993.
R. Pastor-Satorras and A. Vespignani. Epidemic spreading in scale-free networks. Physical Review Letters 86, 14, 2001.
…

57 Outline
Motivation. Epidemics: what happens? (Theory): Background; Result (Static Graphs); Proof Ideas (Static Graphs); Bonus 1: Dynamic Graphs; Bonus 2: Competing Viruses. Action: Who to immunize? (Algorithms).

58 What should the answer look like?
The answer should depend on the graph and on the Virus Propagation Model (VPM). But how? Graph: average degree? max. degree? diameter? VPM: which parameters? How to combine: linear? quadratic? exponential? …

59 Static Graphs: Our Main Result
Informally (w/ Deepayan Chakrabarti): for any arbitrary topology (adjacency matrix A) and any virus propagation model (VPM) in the standard literature, the epidemic threshold depends only on λ, the first eigenvalue of A, and on some constant C_VPM determined by the virus propagation model. No epidemic if λ · C_VPM < 1. In Prakash+ ICDM 2011 (selected among best papers).

60 Our thresholds for some models
s = effective strength; s < 1: below threshold.
Models: SIS, SIR, SIRS, SEIR. Effective strength: s = λ · β/δ. Threshold (tipping point): s = 1.
Models: SIV, SEIV (H.I.V.). [Effective strength not preserved in the transcript.]

61 Our result: Intuition for λ
“Official” definition: let A be the adjacency matrix. Then λ is the root with the largest magnitude of the characteristic polynomial of A [det(A − xI)]. Doesn’t give much intuition!
“Un-official” intuition: λ ~ # paths in the graph. A^k ≈ λ^k · u u^T (u: the eigenvector that goes with λ), and A^k(i, j) = # of paths i → j of length k.

62 Largest Eigenvalue (λ)
Better connectivity → higher λ. [Figure: three example graphs on N nodes, in order of increasing connectivity: λ ≈ 2, λ = √N, λ = N − 1. For N = 1000: λ ≈ 2, λ = 31.67, λ = 999.]
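
To see how λ tracks connectivity, the sketch below computes the largest eigenvalue of three small graphs by power iteration and plugs it into an effective strength s = λ · β/δ (the form the talk uses for flu-like models). The choice of chain, star, and clique as examples, the helper names, and the values of β and δ are all assumptions for illustration. The iteration runs on A + I rather than A, because chains and stars are bipartite and plain power iteration would oscillate between +λ and −λ.

```python
import math


def largest_eigenvalue(adj, iters=2000):
    """Largest eigenvalue of a symmetric, non-negative adjacency matrix.

    Uses power iteration on (A + I), then a Rayleigh quotient with A.
    """
    n = len(adj)
    x = [1.0] * n
    for _ in range(iters):
        y = [x[i] + sum(adj[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    # x has unit length, so x^T A x is the Rayleigh quotient
    return sum(x[i] * adj[i][j] * x[j] for i in range(n) for j in range(n))


def chain(n):
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)] for i in range(n)]


def star(n):
    # node 0 is the hub; an edge exists when exactly one endpoint is the hub
    return [[1.0 if (i == 0) != (j == 0) else 0.0 for j in range(n)] for i in range(n)]


def clique(n):
    return [[0.0 if i == j else 1.0 for j in range(n)] for i in range(n)]


def effective_strength(adj, beta, delta):
    """s = lambda * beta / delta; s < 1 means below threshold."""
    return largest_eigenvalue(adj) * beta / delta


# Hypothetical virus parameters on 10-node graphs.
for name, graph in [("chain", chain(10)), ("star", star(10)), ("clique", clique(10))]:
    s = effective_strength(graph, beta=0.1, delta=0.5)
    print(name, "below threshold" if s < 1 else "above threshold")
```

The same virus can be below threshold on a sparse topology and above it on a dense one, which is why the threshold depends on the graph only through λ.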

63 Examples: Simulations – SIR (mumps)
[Figure: (a) Infection profile: fraction of infections vs. time ticks. (b) “Take-off” plot: footprint vs. effective strength.] PORTLAND graph: synthetic population, 31 million links, 6 million nodes.

64 Examples: Simulations – SIRS (pertussis)
[Figure: (a) Infection profile: fraction of infections vs. time ticks. (b) “Take-off” plot: footprint vs. effective strength.] PORTLAND graph: synthetic population, 31 million links, 6 million nodes.

65 See paper for full proof
Model-based: general VPM structure. Graph-based: topology and stability. Together they give the condition λ · C < 1, where C is the constant determined by the VPM. Dimensional arguments…

66 Outline
Motivation. Epidemics: what happens? (Theory): Background; Result (Static Graphs); Proof Ideas (Static Graphs); Bonus 1: Dynamic Graphs; Bonus 2: Competing Viruses. Action: Who to immunize? (Algorithms).

67 See paper for full proof
Model-based: general VPM structure. Graph-based: topology and stability. Together they give the condition λ · C < 1, where C is the constant determined by the VPM.

68 Outline
Motivation. Epidemics: what happens? (Theory): Background; Result (Static Graphs); Proof Ideas (Static Graphs); Bonus 1: Dynamic Graphs; Bonus 2: Competing Viruses. Action: Who to immunize? (Algorithms).

69 Dynamic Graphs: Epidemic?
Alternating behaviors: DAY (e.g., work). [Figure: adjacency matrix, labeled 8.]

70 Dynamic Graphs: Epidemic?
Alternating behaviors: NIGHT (e.g., home). [Figure: adjacency matrix, labeled 8.]

71 Model Description
SIS model: recovery rate δ, infection rate β. [Figure: node X with neighbors N1, N2, N3, moving between Healthy and Infected with Prob. β and Prob. δ.] Set of T arbitrary graphs: day, night, weekend, …

72 Our result: Dynamic Graphs Threshold
Informally: NO epidemic if eig(S) < 1, where eig(S) is the largest eigenvalue of the system matrix S. Single number! Details: S = [formula not preserved in the transcript]. In Prakash+, ECML-PKDD 2010.
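
The formula for S does not survive in this transcript. The sketch below therefore ASSUMES a product form, S = ((1 − δ)I + βA_1) · … · ((1 − δ)I + βA_T), one factor per alternating graph; this assumption should be checked against the cited ECML-PKDD 2010 paper before relying on it. Function names and the example graphs are hypothetical.

```python
import math


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def system_matrix(adjs, beta, delta):
    """ASSUMED form: product over the T graphs of ((1 - delta) * I + beta * A_t)."""
    n = len(adjs[0])
    s = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for adj in adjs:
        step = [[(1.0 - delta) * (i == j) + beta * adj[i][j] for j in range(n)]
                for i in range(n)]
        s = mat_mul(s, step)
    return s


def spectral_radius(m, iters=1000):
    """Growth rate under repeated multiplication, for a non-negative matrix
    with positive diagonal (true here whenever delta < 1)."""
    n = len(m)
    x = [1.0] * n
    growth = 0.0
    for _ in range(iters):
        y = [sum(m[i][j] * x[j] for j in range(n)) for i in range(n)]
        growth = math.sqrt(sum(v * v for v in y))
        x = [v / growth for v in y]
    return growth


# Hypothetical 3-node example: "day" has only edge 0-1, "night" only edge 1-2.
day = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
night = [[0, 0, 0], [0, 0, 1], [0, 1, 0]]
eig_s = spectral_radius(system_matrix([day, night], beta=0.5, delta=0.1))
print("no epidemic" if eig_s < 1 else "epidemic possible")
```

Under this assumed form, a single number summarizes the whole alternating sequence of graphs, matching the "Single number!" remark on the slide.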

73 Infection-profile
[Figure: log(fraction infected) vs. time, for Synthetic and MIT Reality Mining; curves ABOVE, AT, and BELOW the threshold.]

74 “Take-off” plots
[Figure: footprint (# “steady state”, log scale) for Synthetic and MIT Reality; our threshold separates NO EPIDEMIC from EPIDEMIC.]

75 Outline
Motivation. Epidemics: what happens? (Theory): Background; Result (Static Graphs); Proof Ideas (Static Graphs); Bonus 1: Dynamic Graphs; Bonus 2: Competing Viruses. Action: Who to immunize? (Algorithms).

76 Competing Contagions
iPhone v Android; Blu-ray v HD-DVD. Biological: common flu/avian flu, pneumococcal inf. etc.

77 A simple model
Details. Modified flu-like model. Mutual immunity (“pick one of the two”). Susceptible-Infected1-Infected2-Susceptible. [Figure: Virus 1 and Virus 2.]

78 Question: What happens in the end?
[Figure: number of infections; green: virus 1, red: virus 2.] Steady state = ? ASSUME: Virus 1 is stronger than Virus 2.

79 Question: What happens in the end?
[Figure: number of infections at steady state; green: virus 1, red: virus 2; annotations: Strength ?? = 2 Strength.] ASSUME: Virus 1 is stronger than Virus 2.

80 Answer: Winner-Takes-All
[Figure: number of infections; green: virus 1, red: virus 2.] ASSUME: Virus 1 is stronger than Virus 2.

81 Our Result: Winner-Takes-All
Given our model, and any graph, the weaker virus always dies out completely. Details: the stronger survives only if it is above threshold. Virus 1 is stronger than Virus 2 if strength(Virus 1) > strength(Virus 2), where strength(Virus) = λ β / δ (same as before!). In Prakash, Beutel, + WWW 2012.
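
The stated result reduces to a short decision rule. The sketch below is a hypothetical helper, not code from the paper: it takes λ as a given number, treats "above threshold" as strength > 1, and refuses to decide exact ties, which the slide does not address.

```python
def winner_takes_all(lam, beta1, delta1, beta2, delta2):
    """Predict the survivor under the stated winner-takes-all result.

    strength = lam * beta / delta for each virus.
    Returns 1 or 2 for the surviving virus, or None if both die out.
    """
    s1 = lam * beta1 / delta1
    s2 = lam * beta2 / delta2
    if s1 == s2:
        # equal strengths are outside the result as stated on the slide
        raise ValueError("equal strengths: outcome not covered by the stated result")
    winner, strength = (1, s1) if s1 > s2 else (2, s2)
    # the stronger virus survives only if it is above threshold
    return winner if strength > 1 else None


# Hypothetical numbers: lambda = 10 for the graph.
print(winner_takes_all(10.0, beta1=0.3, delta1=0.5, beta2=0.2, delta2=0.5))
```

Note that the weaker virus dies out even when its own strength is above 1; only the comparison and the winner's threshold matter.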

82 Real Examples
[Google Search Trends data] Reddit v Digg; Blu-Ray v HD-DVD.

83 Outline
Motivation. Epidemics: what happens? (Theory). Action: Who to immunize? (Algorithms).

84 Full Static Immunization
Given: a graph A, a virus prop. model, and a budget k. Find: the k ‘best’ nodes for immunization (removal). [Figure: example graph with candidate nodes marked “?”, k = 2.]

85 Outline
Motivation. Epidemics: what happens? (Theory). Action: Who to immunize? (Algorithms): Full Immunization (Static Graphs); Fractional Immunization.

86 Challenges
Given a graph A and a budget k: Q1 (Metric): how to measure the ‘shield-value’ of a set of nodes (S)? Q2 (Algorithm): how to find a set of k nodes with the highest ‘shield-value’?

87 Proposed vulnerability measure λ
λ is the epidemic threshold. [Figure: graphs ranging from “Safe” through “Vulnerable” to “Deadly”: increasing λ, increasing vulnerability.]

88 A1: “Eigen-Drop”: an ideal shield value
Eigen-Drop(S): Δλ = λ − λs. [Figure: the original graph, and the same graph without nodes {2, 6}.]

89 (Q2) - Direct Algorithm too expensive!
Immunize the k nodes which maximize Δλ: S = argmax Δλ. Combinatorial! Complexity: [expression not preserved in the transcript]. Example: 1,000 nodes, with 10,000 edges. It takes 0.01 seconds to compute λ. It takes 2,615 years to find the 5 best nodes!
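
The order of magnitude in this example can be checked with a short calculation. The sketch assumes the simplest cost model, one λ computation per candidate 5-node set and nothing else, plus a 365-day year; both are assumptions, and the helper name is hypothetical.

```python
import math


def brute_force_cost(n, k, seconds_per_eval):
    """Number of k-node subsets of n nodes, and the years needed to
    evaluate each one once at the given per-evaluation cost."""
    subsets = math.comb(n, k)
    seconds = subsets * seconds_per_eval
    years = seconds / (365 * 24 * 3600)
    return subsets, years


subsets, years = brute_force_cost(1000, 5, 0.01)
print(f"{subsets:,} candidate sets, about {years:,.0f} years")
```

Under these assumptions the estimate comes out at roughly 2,600 years, in line with the figure quoted on the slide.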

90 A2: Our Solution
Part 1: Shield Value. Carefully approximate the Eigen-drop (Δλ) using matrix perturbation theory. Part 2: Algorithm. Greedily pick the best node at each step; near-optimal due to submodularity. NetShield (linear complexity): O(nk² + m), where n = # nodes and m = # edges. In Tong, Prakash+ ICDM 2010.
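
The greedy idea can be illustrated with a deliberately naive variant that recomputes the exact eigen-drop for every candidate node at every step. This is NOT NetShield: it skips the perturbation-theory approximation that gives NetShield its linear complexity, so it is only practical on tiny graphs. Function names and the example graph are hypothetical.

```python
import math


def largest_eigenvalue(adj, keep, iters=500):
    """Largest eigenvalue of the subgraph induced by the node list `keep`
    (power iteration on A + I, then a Rayleigh quotient with A)."""
    if not keep:
        return 0.0
    x = {v: 1.0 for v in keep}
    for _ in range(iters):
        y = {i: x[i] + sum(adj[i][j] * x[j] for j in keep) for i in keep}
        norm = math.sqrt(sum(v * v for v in y.values()))
        x = {i: v / norm for i, v in y.items()}
    return sum(x[i] * adj[i][j] * x[j] for i in keep for j in keep)


def greedy_eigen_drop(adj, k):
    """Greedily remove k nodes, each time the node whose removal leaves the
    smallest largest-eigenvalue. Returns (chosen nodes, total eigen-drop)."""
    remaining = list(range(len(adj)))
    lam_original = largest_eigenvalue(adj, remaining)
    chosen = []
    for _ in range(k):
        best = min(
            remaining,
            key=lambda v: largest_eigenvalue(adj, [u for u in remaining if u != v]),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen, lam_original - largest_eigenvalue(adj, remaining)


# Hypothetical example: two connected hubs (nodes 0 and 5) with leaves.
example_edges = [(0, 1), (0, 2), (0, 3), (0, 4), (5, 6), (5, 7), (5, 8), (0, 5)]
example_adj = [[0.0] * 9 for _ in range(9)]
for a, b in example_edges:
    example_adj[a][b] = example_adj[b][a] = 1.0
example_chosen, example_drop = greedy_eigen_drop(example_adj, k=2)
```

On this toy graph the greedy choice removes the two hubs, which disconnects everything and drives λ to zero.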

91 Experiment: Immunization quality
[Figure: log(fraction of infected nodes) vs. time; lower is better. Methods compared: PageRank, Betweenness (shortest path), Degree, Acquaintance, Eigs (=HITS), NetShield.]

92 Outline
Motivation. Epidemics: what happens? (Theory). Action: Who to immunize? (Algorithms): Full Immunization (Static Graphs); Fractional Immunization.

93 Fractional Immunization of Networks
B. Aditya Prakash, Lada Adamic, Theodore Iwashyna (M.D.), Hanghang Tong, Christos Faloutsos. Under review.

94 Fractional Asymmetric Immunization
Drug-resistant bacteria (like XDR-TB). [Figure: Hospital; Another Hospital.]

95 Fractional Asymmetric Immunization
Drug-resistant bacteria (like XDR-TB). [Figure: Hospital; Another Hospital.]

96 Fractional Asymmetric Immunization
Problem: given k units of disinfectant, how to distribute them to maximize hospitals saved? [Figure: Hospital; Another Hospital.]

97 Our Algorithm “SMART-ALLOC”
[US-MEDICARE NETWORK 2005] Each circle is a hospital; ~3000 hospitals; more than 30,000 patients transferred. CURRENT PRACTICE vs. SMART-ALLOC: ~6x fewer!

98 Running Time
[Figure: wall-clock time; lower is better.] Simulations: > 1 week. SMART-ALLOC: 14 secs. > 30,000x speed-up!

99 Experiments
[Figure: lower is better.] SECOND-LIFE: ~5x (K = 200). PENN-NETWORK: ~2.5x (K = 2000).

100 Acknowledgements
Funding

101 References
Threshold Conditions for Arbitrary Cascade Models on Arbitrary Networks (B. Aditya Prakash, Deepayan Chakrabarti, Michalis Faloutsos, Nicholas Valler, Christos Faloutsos). In IEEE ICDM 2011, Vancouver (Invited to KAIS Journal Best Papers of ICDM).
Virus Propagation on Time-Varying Networks: Theory and Immunization Algorithms (B. Aditya Prakash, Hanghang Tong, Nicholas Valler, Michalis Faloutsos and Christos Faloutsos). In ECML-PKDD 2010, Barcelona, Spain.
Epidemic Spreading on Mobile Ad Hoc Networks: Determining the Tipping Point (Nicholas Valler, B. Aditya Prakash, Hanghang Tong, Michalis Faloutsos and Christos Faloutsos). In IEEE NETWORKING 2011, Valencia, Spain.
Winner-takes-all: Competing Viruses or Ideas on fair-play networks (B. Aditya Prakash, Alex Beutel, Roni Rosenfeld, Christos Faloutsos). In WWW 2012, Lyon.
On the Vulnerability of Large Graphs (Hanghang Tong, B. Aditya Prakash, Tina Eliassi-Rad and Christos Faloutsos). In IEEE ICDM 2010, Sydney, Australia.
Fractional Immunization of Networks (B. Aditya Prakash, Lada Adamic, Theodore Iwashyna, Hanghang Tong, Christos Faloutsos). Under Submission.
Rise and Fall Patterns of Information Diffusion: Model and Implications (Yasuko Matsubara, Yasushi Sakurai, B. Aditya Prakash, Lei Li, Christos Faloutsos). Under Submission.

102 Propagation on Large Networks
B. Aditya Prakash, Christos Faloutsos. Data; Analysis; Policy/Action.

