PEER TO PEER BOTNET DETECTION FOR CYBER-SECURITY (DEFENSIVE OPERATION): A DATA MINING APPROACH Masud, M. M. 1, Gao, J. 2, Khan, L. 1, Han, J. 2, Thuraisingham,


1 PEER TO PEER BOTNET DETECTION FOR CYBER-SECURITY (DEFENSIVE OPERATION): A DATA MINING APPROACH
Masud, M. M. (1), Gao, J. (2), Khan, L. (1), Han, J. (2), Thuraisingham, B. (1)
(1) University of Texas at Dallas
(2) University of Illinois at Urbana-Champaign

2 Background
Botnet
◦ Network of compromised machines
◦ Under the control of a botmaster
Taxonomy:
◦ C&C: centralized, distributed, etc.
◦ Protocol: IRC, HTTP, P2P, etc.
◦ Rallying mechanism: hard-coded IP, dynamic DNS, etc.
Network traffic monitoring

3 What To Monitor?
Monitor payload or header?
Problems with payload monitoring
◦ Privacy
◦ Unavailability
◦ Encryption/obfuscation
Information extracted from the header (features)
◦ New connection rate
◦ Packet size
◦ Upload/download bandwidth
◦ ARP request and ICMP echo reply rate
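The header features listed above can be computed per host and per time window. The following is a minimal sketch, not the authors' feature extractor: the `PacketHeader` record, its field names, the `kind` labels, and the choice of counting TCP SYN packets as new connections are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PacketHeader:
    # Hypothetical header record; field names are illustrative assumptions.
    timestamp: float  # seconds since the start of the window
    src: str          # source address
    dst: str          # destination address
    size: int         # packet size in bytes
    kind: str         # e.g. "tcp_syn", "tcp", "arp_request", "icmp_echo_reply"


def header_features(packets, host, window_seconds):
    """Summarise one time window of header records for a monitored host."""
    sent = [p for p in packets if p.src == host]
    received = [p for p in packets if p.dst == host]
    both = sent + received
    return {
        # Assumption: an outgoing TCP SYN marks a new connection attempt.
        "new_connection_rate": sum(p.kind == "tcp_syn" for p in sent) / window_seconds,
        "mean_packet_size": sum(p.size for p in both) / len(both) if both else 0.0,
        "upload_bandwidth": sum(p.size for p in sent) / window_seconds,
        "download_bandwidth": sum(p.size for p in received) / window_seconds,
        "arp_request_rate": sum(p.kind == "arp_request" for p in sent) / window_seconds,
        "icmp_echo_reply_rate": sum(p.kind == "icmp_echo_reply" for p in received) / window_seconds,
    }


# Toy usage: a two-second window observed for host 10.0.0.5.
window = [
    PacketHeader(0.0, "10.0.0.5", "10.0.0.9", 60, "tcp_syn"),
    PacketHeader(0.5, "10.0.0.9", "10.0.0.5", 1500, "tcp"),
    PacketHeader(1.0, "10.0.0.5", "10.0.0.7", 60, "tcp_syn"),
    PacketHeader(1.5, "10.0.0.5", "broadcast", 42, "arp_request"),
]
features = header_features(window, host="10.0.0.5", window_seconds=2.0)
```

Each window then yields one feature vector, and a sequence of windows forms the stream that the classifier consumes. No payload bytes are read at any point.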

4 Mapping to Stream Data Mining
Stream data: any continuous flow of data
◦ For example: network traffic, sensor data
Properties of stream data: two important properties are infinite length and concept drift
Stream data classification: conventional classification algorithms do not apply directly, because they assume a finite training set and a concept that does not change over time
We propose a multi-chunk multi-level ensemble approach to solve these problems,
◦ which significantly reduces error over single-chunk single-level ensemble approaches.

5 The Single-Chunk Single-Level Ensemble (SCE) Approach
Divide the data stream into equal-sized chunks
◦ Train a classifier from each data chunk
◦ Keep the best K such classifiers as the ensemble
◦ Select the best K classifiers from {c_1, …, c_k} ∪ {c_{k+1}}
[Figure: stream data classification. Chunks D_1, D_2, D_3, …, D_k, D_{k+1} each yield one classifier c_1, c_2, c_3, …, c_k, c_{k+1}]
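The SCE update can be sketched as follows. This is an illustration under stated assumptions rather than the implementation behind the slides: the nearest-centroid base learner is a stand-in, "best" is taken to mean lowest error on the newest chunk, and prediction uses a simple majority vote. Note that the newest classifier is scored on its own training chunk here, which favours it.

```python
from collections import Counter


class CentroidClassifier:
    """Stand-in base learner: predicts the class whose mean vector is nearest."""

    def __init__(self, chunk):
        sums, counts = {}, Counter()
        for x, y in chunk:
            acc = sums.setdefault(y, [0.0] * len(x))
            for i, value in enumerate(x):
                acc[i] += value
            counts[y] += 1
        self.centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

    def predict(self, x):
        def sq_dist(label):
            return sum((a - b) ** 2 for a, b in zip(x, self.centroids[label]))
        return min(self.centroids, key=sq_dist)


def error_rate(classifier, chunk):
    return sum(classifier.predict(x) != y for x, y in chunk) / len(chunk)


def update_sce(ensemble, new_chunk, k):
    """Train a classifier on the new chunk, then keep the k lowest-error ones."""
    candidates = ensemble + [CentroidClassifier(new_chunk)]
    candidates.sort(key=lambda c: error_rate(c, new_chunk))
    return candidates[:k]


def predict_sce(ensemble, x):
    """Majority vote of the ensemble members (an assumed combination rule)."""
    return Counter(c.predict(x) for c in ensemble).most_common(1)[0][0]


def make_chunk(offset):
    # Two separated classes; `offset` shifts both to imitate concept drift.
    benign = [((offset + d, 0.0), "benign") for d in (0.0, 0.1, 0.2)]
    bot = [((offset + 5.0 + d, 0.0), "bot") for d in (0.0, 0.1, 0.2)]
    return benign + bot


ensemble = []
for offset in (0.0, 0.0, 0.0, 10.0, 10.0):
    ensemble = update_sce(ensemble, make_chunk(offset), k=3)
```

After the drift at the fourth chunk, classifiers trained on the old concept score poorly on the new chunks and are gradually displaced, which is how a chunk-based ensemble tracks concept drift while holding only K models in memory.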

6 Our Approach: Multi-Chunk Multi-Level Ensemble (MCE)
◦ Train v classifiers from r consecutive data chunks and combine them into an ensemble; keep the best K such ensembles
◦ Two-level ensemble hierarchy:
   ▪ Top level (A): ensemble of K middle-level ensembles A_i
   ▪ Middle level (A_i): ensemble of v bottom-level classifiers A_i(j)
[Figure: MCE approach. Top-level ensemble A = {A_1, …, A_K}; each middle-level ensemble A_i = {A_i(1), …, A_i(v)} of bottom-level classifiers]
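The two-level hierarchy can be pictured as a list of lists of classifiers. The sketch below assumes majority voting at both levels, which the slide does not state; the threshold rules are hypothetical stand-ins for trained bottom-level classifiers.

```python
from collections import Counter


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def predict_middle(middle_ensemble, x):
    """Middle level A_i: vote of its v bottom-level classifiers A_i(j)."""
    return majority(clf(x) for clf in middle_ensemble)


def predict_top(top_ensemble, x):
    """Top level A: vote of its K middle-level ensembles A_i."""
    return majority(predict_middle(a_i, x) for a_i in top_ensemble)


def threshold(t):
    # Hypothetical bottom-level classifier on a single scalar feature.
    return lambda x: "bot" if x > t else "benign"


# K = 3 middle-level ensembles, each holding v = 3 bottom-level classifiers.
A = [
    [threshold(1.0), threshold(2.0), threshold(3.0)],
    [threshold(2.0), threshold(2.5), threshold(4.0)],
    [threshold(5.0), threshold(6.0), threshold(7.0)],
]
label = predict_top(A, 3.5)
```

For the input 3.5, the first two middle-level ensembles vote "bot" and the third votes "benign", so the top level returns "bot".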

7 Middle-Level Ensemble Construction

8 Top-Level Ensemble Updating
Let D_n be the most recent labeled data chunk
Let A be the top-level ensemble
Construct a middle-level ensemble A′
◦ using r consecutive data chunks: D = {D_{n-r+1}, …, D_n}
Obtain the error of A′ on D by testing each classifier A′(j) on its corresponding test data d_j
Obtain the error of each middle-level ensemble A_1, …, A_K on the latest chunk D_n
A ← the K lowest-error middle-level ensembles in A ∪ {A′}
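The update steps above can be sketched end to end. Several details are assumptions, since the transcript does not give them: D is split into v interleaved partitions d_1, …, d_v, classifier A′(j) is trained on D minus d_j and tested on d_j, the base learner is a stand-in nearest-mean rule on one scalar feature, and votes are combined by simple majority.

```python
from collections import Counter


def train_nearest_mean(data):
    """Stand-in base learner on (x, label) pairs with scalar x."""
    groups = {}
    for x, y in data:
        groups.setdefault(y, []).append(x)
    means = {y: sum(xs) / len(xs) for y, xs in groups.items()}
    return lambda x: min(means, key=lambda y: abs(x - means[y]))


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def build_middle_ensemble(chunks, v):
    """Build A' from r consecutive chunks; return (classifiers, error on D)."""
    data = [example for chunk in chunks for example in chunk]
    parts = [data[j::v] for j in range(v)]  # assumed partitions d_1 .. d_v
    classifiers, wrong = [], 0
    for j in range(v):
        train = [ex for i, part in enumerate(parts) if i != j for ex in part]
        clf = train_nearest_mean(train)  # A'(j), trained on D minus d_j
        wrong += sum(clf(x) != y for x, y in parts[j])  # tested on d_j
        classifiers.append(clf)
    return classifiers, wrong / len(data)


def ensemble_error(classifiers, chunk):
    return sum(majority(c(x) for c in classifiers) != y for x, y in chunk) / len(chunk)


def update_top_level(top, recent_chunks, v, k):
    """`recent_chunks` holds D_{n-r+1} .. D_n; returns the new top level A."""
    new_ensemble, new_error = build_middle_ensemble(recent_chunks, v)
    latest = recent_chunks[-1]
    scored = [(ensemble_error(a_i, latest), a_i) for a_i in top]
    scored.append((new_error, new_ensemble))
    scored.sort(key=lambda pair: pair[0])  # lowest error first
    return [a_i for _, a_i in scored[:k]]


def predict_top(top, x):
    return majority(majority(c(x) for c in a_i) for a_i in top)


def make_chunk(offset):
    # `offset` shifts both classes to imitate concept drift.
    benign = [(offset + d, "benign") for d in (0.0, 0.2, 0.4, 0.6)]
    bot = [(offset + 5.0 + d, "bot") for d in (0.0, 0.2, 0.4, 0.6)]
    return benign + bot


stream = [make_chunk(o) for o in (0.0, 0.0, 0.0, 10.0, 10.0, 10.0)]
r, v, K = 2, 2, 2
top = []
for n in range(r, len(stream) + 1):
    top = update_top_level(top, stream[n - r:n], v, K)
```

In this toy stream the concept shifts at the fourth chunk. Once two post-drift windows have been seen, both surviving middle-level ensembles were trained on the new concept, so the top level classifies the drifted data correctly.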

9 Error Reduction Analysis
Proof:

10 Error Reduction Analysis (continued)
Proof:

11 Evaluation
Results on synthetic data
Results on botnet data

12 Offensive Operation
Masud, M. M., Mohan, V., Khan, L., Hamlen, K., and Thuraisingham, B.

13 Overview
Goal
◦ To attack another person's computer and steal sensitive information
◦ Without being detected
Idea
◦ Propagate malware (worm, spyware, etc.) through the network
◦ Apply obfuscation so that malware detectors fail to detect the malware
Assumption
◦ The attacker has access to the malware detector (a valid assumption because anti-virus software is publicly available)

14 Strategy
Steps:
◦ Extract the model from the malware detector
◦ Obfuscate the malware to evade the model
◦ There has been some work on automatic model extraction from malware detectors, such as: Christodorescu and Jha. Testing Malware Detectors. In Proc. 2004 ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA 2004).
[Figure labels: Malware detector, Model extraction, Model, Malware, Analysis, Obfuscation/refinement]

15 Example
Suppose the malware detector is a data-mining-based malicious code detector such as [a].
◦ Assume that the model is a decision tree as follows:
[Figure: decision tree over patterns pt_1 and pt_2, with edges labeled "has" and "does not have" and leaves labeled + and -]
◦ Given this model,
   ▪ If malware x has pattern pt_1, then it will be detected as benign
   ▪ So pt_1 must be inserted into the malware (insertion)
   ▪ If malware x has neither pt_1 nor pt_2, then it will be detected as benign
   ▪ So pt_2 must be removed from the malware, assuming it does not have pt_1 (removal)
[a] Masud, M. M., Khan, L. & Thuraisingham, B. A Scalable Multi-level Feature Extraction Technique to Detect Malicious Executables. Information Systems Frontiers, 10:33-35, 2008.

