Download presentation

Presentation is loading. Please wait.

Published byJohnathon Buffkin Modified about 1 year ago

1
Modeling Malware Spreading Dynamics Michele Garetto (Politecnico di Torino – Italy) Weibo Gong (University of Massachusetts – Amherst – MA) Don Towsley (University of Massachusetts – Amherst – MA) INFOCOM 2003

2
2 Outline Motivation Modeling framework Interactive Markov Chains Analysis and simulation of malware propagation dynamics Percolation problem Transient behavior Conclusions

3
3 Motivation The Internet is an easy and powerful mechanism for propagating malicious software programs : the “ malware ” (email viruses, worms, …) It is expected that future malware acitivity will be more prevalent and virulent, resulting in significant greater damage and economic losses

4
4 Motivation Dynamics of malware propagation are still not well understood We would like to: Predict the temporal evolution of an infection process that starts propagating on a network Design and evaluate effective countermeasures Assess the defensibility and vulnerability of different network architectures Need to develop mathematical methodologies that are able capture the spreading characteristics of malware

5
5 Our contribution A flexible modeling framework based on Interactive Markov Chains, able to capture the probabilistic nature of malware propagation Application of such framework to the case of email viruses: Identification of a “percolation problem” Investigation of the impact of the underlying network topology Analytical bounds and approximations validated through extensive simulations

6
6 The “Interactive Markov Chain” (IMC) Modeling Framework Global network structure... alert failednormal Local Structure (can vary from node to node) secureinsecureemergency Each node is represented by a Markov chain, whose state transitions are influenced by the status of its neighbors but locally a Markov chain Global Structure (the network)

7
7 Computational complexity issue The whole system evolves according to a global Markov chain G, whose state space dimension ( #G ) is equal to the product of the local chain dimensions ( #L ) G = L N The solution of the global Markov chain is feasible only for small systems example: - 20 nodes - binary status (0 = not infected, 1 = infected) 2 20 states !

8
8 How can we study very large systems (thousands of nodes) ? Discrete event simulations of the model how many runs ? how long ? could be computationally too expensive do not help to understand the system dynamics Analytical bounds and approximations quick prediction of the system behavior gross-level approximations can be sufficient provide insights into the inner dynamics

9
9 The virus propagates as attachment to e-mail messages Requires human assistance random time elapses before the recipient reads the message the “click” probability The virus makes use of the recipient’s address book to send copies of itself E-mail virus propagation

10
10 We consider the network graph induced by email address books Each node stands for an email address Edges represent social or business relationships The resulting graph is expected to have “small world” properties: small characteristic path length high clustering coefficient IMC model : global structure

11
11 IMC model : local structure 3 statuses for each node: S (Susceptible): the node can be infected by the virus I (Infected): the node has been infected by the virus M (Immune): the node can no more be infected by the virus Discrete time model probability that node “j” is susceptible at time k probability that node “j” is infected at time k probability that node “j” is immune at time k

12
12 IMC model i j w ij IM S cjcj 1-c j

13
13 Virus propagation model The numerical solution of the system requires to know the joint probabilities of neighboring nodes: ?

14
14 Fundamental questions about malware propagation dynamics: Virus propagation model What is the final size of the infection outbreak ? How many nodes (on average) will be reached by the virus at the end ? How fast is the malware propagation ? What is the (average) number of infected node as a function of time ?

15
15 The “small-world” model of Watts and Strogatz A ring lattice with additional random shortcuts Parameters: N = number of nodes S = number of shortcuts ( = shortcuts density) k = lattice connectivity (number of neighbors on each side of a node) N = 24 S = 4 k = 3

16
16 What is the final size of the infection outbreak ? Not all of the susceptible nodes necessarily receive a copy of the virus ! site percolation problem (node occupation probability = click probability)

17
17 Site percolation problem on the small world graph: exact asymptotic result [Moore, Newman 2000]

18
18 Transient analysis of an infection process : Bounds Probability that a node has been reached by the virus Lower bound property of joint probabilities Upper bound theory of Associated Random Variables

19
19 Analytical bounds 0 1000 2000 3000 4000 5000 6000 7000 8000 05001000150020002500 Average number of infected nodes time sim upper bound lower bound Infinite unidimentional lattice k = 1 k = 10 k = 100 (click probability = 1) 1

20
20 Transient analysis of an infection process : approximation Linear mixing of lower bound and upper bound k = local connectivity of the node s = self-influence probability

21
21 Fitting of mixing coefficient M(k,s) Infinite unidimentional lattice

22
22 Approximate analysis on the small world graph: the impact of topology 0 200 400 600 800 1000 1200 1400 1600 1800 2000 02004006008001000 Average number of infected nodes time model approx simulation k = 10 no shortcuts k = 10 2 shortcuts k ~ geom(10) no shortcuts k = 10 20 shortcuts Fully-connected graph (2000 nodes - click probability = 1)

23
23 Combining transient analysis and percolation on general topologies Probability not to be reached by the virus = initial immunization overestimate of the spreading rate of the virus Upper bound of the reaching probability on general topologies: Global upper bound for the infection process

24
24 0 1000 2000 3000 4000 5000 6000 05001000150020002500300035004000 Average number of infected nodes time simulation model approx + bound perc Transient analysis and percolation on power-law random graphs 10000 nodes GLP algorithm (Bu 2002) power-law node degree, small-world properties (click probability = 0.5) m = initial connectivity m = 1 m = 2 m = 4 One initially infected node with degree 10

25
25 Conclusions We have proposed an analytical framework to study the dynamics of malware propagation on a network We have obtained useful bounds and approximations to study an infection process on a general topology Approach suitable to analyze a wide range of “dynamic interactions on networks” (routing protocols, p2p,…)

26
26 The End Thanks…

27
27 Site percolation on a given small-world graph 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0100200300400500600700800900999 Reaching probability node index sim upper bound - h = 0 upper bound - h = 8 approximation lower bound - h = 8 lower bound - h = 0

Similar presentations

© 2017 SlidePlayer.com Inc.

All rights reserved.

Ads by Google