Published by Domenic Solomon Hawkins. Modified over 9 years ago.
1
Worm Origin Identification Using Random Moonwalks. Yinglian Xie, V. Sekar, D. A. Maltz, M. K. Reiter, Hui Zhang. 2005 IEEE Symposium on Security and Privacy. Presented by: Anup Goyal, Edward Merchant
2
2 Outline Motivation/Introduction Problem Formulation The Random Moonwalk Algorithm Evaluation Methodology Analytical Model Real Trace Study Simulation Study Deployment and Future Work
4
4 Motivation There is little automated support for identifying the location from which an attack is launched. Knowledge of the origin supports law enforcement. Knowledge of the causal flows that advance an attack supports diagnosis of how network defenses were breached.
5
5 Introduction We craft an algorithm that determines the origin of epidemic spreading attacks. Goals: identify the "patient zero" of the epidemic; reconstruct the sequence of spreading.
6
6 Introduction (cont'd) Random moonwalk algorithm: finds the origin and propagation paths of a worm attack. It performs post-mortem analysis on the traffic records logged by the network. It depends on the assumption that worm propagation occurs in a tree-like structure.
7
7 Outline Introduction Problem Formulation The Random Moonwalk Algorithm Evaluation Methodology Analytical Model Real Trace Study Simulation Study Deployment and Future Work
8
8 Problem Formulation
9
9 Problem Formulation (cont'd) A directed host contact graph G = (V, E), where V = H × T, H is the set of all hosts in the network, and T is time. Each directed edge e = (u, v, t_s, t_e) represents a network flow between two end hosts at a certain time: a flow from host u to host v that starts at time t_s and ends at time t_e. A flow has a finite duration and involves the transfer of one or more packets.
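The edge definition above can be written down as a small data structure. This is only a sketch: the class name `Flow`, the helper `index_by_destination`, and the sample hosts and times are illustrative assumptions, not taken from the slides.

```python
from dataclasses import dataclass
from collections import defaultdict


@dataclass(frozen=True)
class Flow:
    """One directed edge e = (u, v, t_s, t_e) of the host contact graph."""
    u: str      # source host
    v: str      # destination host
    t_s: float  # flow start time
    t_e: float  # flow end time


def index_by_destination(flows):
    """Group flows by destination host (hypothetical helper for backward lookups)."""
    incoming = defaultdict(list)
    for f in flows:
        incoming[f.v].append(f)
    return incoming


# Made-up sample records, for illustration only.
flows = [
    Flow("a", "b", 0.0, 1.0),
    Flow("b", "c", 2.0, 3.0),
    Flow("a", "c", 2.5, 2.6),
]
incoming = index_by_destination(flows)
```

Indexing by destination is a design choice that suits an algorithm walking backward in time, since each step asks which flows arrived at a given host.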
10
10 Problem Formulation (cont'd) Normal edge: the flow does not carry an infectious payload. Attack edge: the flow carries attack traffic, whether or not the flow is successful. Causal edge: the flow that actually infects its destination. Goal: identify a set of edges from the top level of the causal tree.
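To make the three edge types concrete, the sketch below labels flows when ground truth is available, as it would be in a synthetic trace. Everything here is an assumption for illustration: the tuple layout, the `carries_attack` flag, the `vulnerable` set, and the rule that the first attack flow to reach a vulnerable, not yet infected host is the one that infects it.

```python
def label_edges(flows, vulnerable, origin):
    """Label each flow as 'normal', 'attack', or 'causal'.

    flows: tuples (u, v, t_s, t_e, carries_attack); layout is hypothetical.
    vulnerable: set of hosts that an attack flow can infect.
    origin: the initially infected host.
    """
    infected = {origin}
    labels = {}
    # Process flows in order of completion time.
    for f in sorted(flows, key=lambda f: f[3]):
        u, v, t_s, t_e, carries_attack = f
        if not carries_attack:
            labels[f] = "normal"
        elif v in vulnerable and v not in infected:
            # Assumed rule: the first attack flow reaching a vulnerable host infects it.
            infected.add(v)
            labels[f] = "causal"
        else:
            # Attack traffic that did not cause a new infection.
            labels[f] = "attack"
    return labels


# Made-up example trace.
flows = [
    ("a", "b", 0, 1, True),   # infects b
    ("a", "x", 1, 2, True),   # x is not vulnerable
    ("a", "b", 2, 3, True),   # b is already infected
    ("b", "c", 4, 5, True),   # infects c
    ("c", "d", 6, 7, False),  # ordinary traffic
]
labels = label_edges(flows, vulnerable={"b", "c"}, origin="a")
```

A causal edge is also an attack edge by the definitions above; the sketch reports only the most specific label.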
11
11 Outline Introduction Problem Formulation The Random Moonwalk Algorithm Evaluation Methodology Analytical Model Real Trace Study Simulation Study Deployment and Future Work
12
12 Random Moonwalk Algo. Infers causal relationships between flows by exploiting the global structure of worm attacks. Makes no use of attack content, attack packet size, or port numbers. For an attack to progress, there has to be communication between the source of the attack and the compromised nodes. These infection-causing communication flows form a causal tree rooted at the source of the attack. Find the tree, and its root is the source of the attack. Find causal flows and attack flows.
13
13 Random Moonwalk Algo. Basic algorithm: Walk backward in time from a node for a certain distance. At each node, choose only among the flows that fall within a certain time limit. Repeat this for W walks. Find the Z edges with the highest frequency. Create a tree from these flows. Most probably this is the causal tree, and its root is the source of the attack.
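The last steps of the basic algorithm (take the most frequent edges, then look for the root) might be sketched as follows. It assumes the per-edge visit counts are already available as a `Counter`, treats Z as the number of top-frequency edges returned (as in the performance slides), and uses a simple heuristic that is an assumption of this sketch: a candidate origin is a host that sends among the selected edges but never receives among them.

```python
from collections import Counter


def top_edges(counts, Z):
    """Return the Z most frequently traversed edges."""
    return [edge for edge, _ in counts.most_common(Z)]


def candidate_origins(edges):
    """Hosts that appear as a source but never as a destination (heuristic)."""
    sources = {e[0] for e in edges}
    dests = {e[1] for e in edges}
    return sources - dests


# Hypothetical visit counts for edges (u, v, t_s, t_e).
counts = Counter({
    ("a", "b", 0, 1): 9,
    ("b", "c", 2, 3): 6,
    ("x", "y", 0, 1): 1,
})
selected = top_edges(counts, Z=2)
origins = candidate_origins(selected)
```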
14
14 Random Moonwalk Algo. (cont'd) The sampling process is controlled by three parameters. W: the number of walks (samples) performed. D: the maximum length of the path traversed. Δt: the sampling window size, i.e. the maximum time allowed between two consecutive edges.
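A minimal sketch of the sampling process, under stated assumptions, could look like this. The function name, the tuple layout `(u, v, t_s, t_e)`, the `seed` argument, and the choice to start each walk from a uniformly random flow are assumptions of this sketch; `W`, `d`, and `delta_t` correspond to the three parameters above.

```python
import random
from collections import Counter, defaultdict


def random_moonwalks(flows, W, d, delta_t, seed=0):
    """Count how often each flow is traversed by W backward random walks.

    flows: list of (u, v, t_s, t_e) tuples.
    d: maximum number of edges per walk.
    delta_t: maximum gap between a predecessor's end and the current flow's start.
    """
    rng = random.Random(seed)
    incoming = defaultdict(list)
    for f in flows:
        incoming[f[1]].append(f)  # index flows by destination host

    counts = Counter()
    for _ in range(W):
        edge = rng.choice(flows)  # assumed: start from a uniformly random flow
        for _ in range(d):
            counts[edge] += 1
            u, _v, t_s, _t_e = edge
            # Predecessors: flows that arrived at u within delta_t before this flow began.
            preds = [g for g in incoming[u] if t_s - delta_t <= g[3] < t_s]
            if not preds:
                break  # no earlier flow into u, so the walk stops here
            edge = rng.choice(preds)
    return counts


# Made-up trace: a causal chain a -> b -> c -> d plus one unrelated flow.
ab = ("a", "b", 0.0, 1.0)
bc = ("b", "c", 2.0, 3.0)
cd = ("c", "d", 4.0, 5.0)
xy = ("x", "y", 4.0, 5.0)
flows = [ab, bc, cd, xy]
counts = random_moonwalks(flows, W=1000, d=10, delta_t=5.0)
```

In this toy trace every walk that starts anywhere on the chain ends at the edge leaving host `a`, so that edge accumulates the most visits, which is the effect the algorithm relies on.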
15
15 Random Moonwalk Algo. (cont'd) Why does this algorithm work? To propagate, a worm creates new flows to other hosts some time after infection. This forms a chain of links from the source to the last victim. Traverse this chain backward to find the source. An infected host generally originates more flows than it receives. The originators of normal edges in the host contact graph are mostly clients, so normal edges mostly have no predecessor within Δt.
16
16 Outline Introduction Problem Formulation The Random Moonwalk Algorithm Evaluation Methodology Analytical Model Real Trace Study Simulation Study Deployment and Future Work
17
17 Outline Evaluation Methodology Analytical Model Assumptions Edge Probability Distribution False Positives and False Negatives Parameter Selection Real Trace Study Simulation Study
18
18 Analytical Model (Assumptions) The host contact graph is known. It has |E| edges and |H| hosts. Time is discretized into units. Every flow has a length of one unit and fits into one unit.
19
19 Analytical Model (Probability)
20
20 Analytical Model (FP & FN) (42 malicious edges at k = 1.) (Total 10^5 hosts.)
21
21 Outline Evaluation Methodology Analytical Model Real Trace Study Detect the Existence of an Attack Identify Causal Edges & Initial Infected Host Reconstruct the Top Level Causal Tree Parameter Selection Performance Simulation Study
22
22 Real Trace Study Background traffic: A traffic trace was collected over a 4-hour period at the backbone of a class-B university network. Only intra-campus flows were collected (1.4 million), involving 8040 hosts. Addition: Flow records were added to represent worm-like traffic with varying scanning rates; the vulnerable hosts were selected randomly.
23
23 Real Trace Study (Existence)
24
24 Real Trace Study (Identify) (800 causal edges from 1.5 × 10^6 flows) (The scanning rate of Trace-50 is less than that of Trace-10.)
25
25 Real Trace Study (Identify) Top frequently sampled edges vs. actual initial edges (800 causal edges in total; the initial 10% are the first 80 edges) (The scanning rate of Trace-50 is less than that of Trace-10.)
26
26 [Figure: top 60 edges, Trace-50, 10^4 walks. Labels: Blaster worm, scan, original attacker]
27
27 Real Trace Study (Parameter) d and Δt. d = infinite.
28
28 Real Trace Study (Performance) Random moonwalk: Z = 100, 10^4 walks. Heavy-hitter: find the 800 hosts with the largest number of flows in the trace, then randomly pick 100 flows. Super-spreader: find the 800 hosts that contacted the largest number of destinations, then randomly pick 100 flows. Oracle: with zero false positive rate, randomly select 100 flows between infected hosts.
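The two host-ranking baselines can be sketched in a few lines. This is an illustration under assumptions: the function names and tuple layout are hypothetical, "number of flows" is read here as flows originated by the host, and the random pick is assumed to draw from flows sent by the selected hosts.

```python
import random
from collections import Counter, defaultdict


def heavy_hitter_hosts(flows, n):
    """The n hosts that originate the most flows (assumed reading of 'heavy-hitter')."""
    counts = Counter(f[0] for f in flows)
    return {host for host, _ in counts.most_common(n)}


def super_spreader_hosts(flows, n):
    """The n hosts that contact the largest number of distinct destinations."""
    dests = defaultdict(set)
    for u, v, _t_s, _t_e in flows:
        dests[u].add(v)
    ranked = sorted(dests, key=lambda h: len(dests[h]), reverse=True)
    return set(ranked[:n])


def sample_flows_from(flows, hosts, z, seed=0):
    """Randomly pick up to z flows originated by the given hosts."""
    pool = [f for f in flows if f[0] in hosts]
    return random.Random(seed).sample(pool, min(z, len(pool)))


# Made-up trace: a sends many flows to one host, c sends fewer flows to more hosts.
flows = [
    ("a", "b", 0, 1), ("a", "b", 1, 2), ("a", "b", 2, 3),
    ("c", "d", 0, 1), ("c", "e", 1, 2),
]
hh = heavy_hitter_hosts(flows, 1)
ss = super_spreader_hosts(flows, 1)
picked = sample_flows_from(flows, hh, z=2)
```

On the slide the corresponding values are 800 hosts and 100 flows; the toy values here only show that the two rankings can disagree.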
29
29 Real Trace Study (Performance)
30
30 Real Trace Study (Performance) Scanning method. R↑ Smart worm (always scans valid hosts). R↑ Scan with random addresses. C: causal edge. A: attack edge. 100: Z = 100. 500: Z = 500.
31
31 Outline Evaluation Methodology Analytical Model Real Trace Study Simulation Study
32
32 Simulation Study Simulate different background traffic. Realistic host contact graphs tend to be much sparser, meaning the chance of communication between two arbitrary hosts is very low. (P.S. In the campus network, the accuracy is about 0.7.)
33
33 Outline Introduction Problem Formulation The Random Moonwalk Algorithm Evaluation Methodology Analytical Model Real Trace Study Simulation Study Deployment and Future Work
34
34 Deployment and Future Work This approach assumes the availability of complete data. Future work: the effect of missing data on performance; the deployment of the algorithm.
35
35 Questions ???? Thank You