BotCop: An Online Botnet Traffic Classifier 鍾錫山 Jan. 4, 2010.


1 BotCop: An Online Botnet Traffic Classifier 鍾錫山 Jan. 4, 2010

2 Reference Wei Lu, Mahbod Tavallaee, Goaletsa Rammidi, and Ali A. Ghorbani, "BotCop: An Online Botnet Traffic Classifier," 2009 Seventh Annual Communication Networks and Services Research Conference (CNSR), pp. 70-77, 2009. 2016/1/11

3 Outline
◦ Introduction
◦ Traffic classification
◦ Botnet detection
◦ Experimental evaluation
◦ Conclusions

4 Introduction Honeypots are used to capture malware, understand the basic behavior of botnets, and create bot binaries or botnet signatures. This approach is based on existing botnets and provides no solution for new botnets. Approaches to detecting botnets automatically:
◦ (1) passive anomaly analysis.
◦ (2) traffic classification.

5 Hierarchical Framework At the higher level, all unknown network traffic is labeled and classified into different network application communities:
◦ P2P, HTTP Web, Chat, DataTransfer, Online Games, Mail Communication, Multimedia (streaming and VoIP), and Remote Access.
At the lower level, focusing on each application community, we investigate and apply the temporal-frequent characteristics of network flows to differentiate malicious botnet behavior from normal application traffic.

6 Traffic Classification First, we model and generate signatures for more than 470 applications according to their port numbers and protocol specifications. Second, concentrating on unknown flows that cannot be identified by signatures, we investigate their temporal-frequent characteristics in order to assign them to the already labeled applications using a decision tree. Fred-eZone is a free WiFi network in Fredericton, Canada.

7 Signature-Based Classifier Most applications have distinct initial protocol handshake steps, which can therefore be used for classification.
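
The idea of matching the first bytes of a handshake can be sketched as follows. This is a minimal illustration, not the classifier from the talk: the table names `PAYLOAD_SIGNATURES` and `PORT_SIGNATURES` and the function `classify_flow` are hypothetical, and the handful of entries stands in for the 470+ application signatures, which the slides do not list.

```python
from typing import Optional

# Hypothetical signature table: byte prefixes seen at the start of a handshake.
# Only a few well-known prefixes are listed, purely for illustration.
PAYLOAD_SIGNATURES = [
    ("BitTorrent", b"\x13BitTorrent protocol"),  # length byte 19 + protocol name
    ("SSH", b"SSH-"),                            # SSH identification string
    ("HTTP", b"GET "),
    ("HTTP", b"POST "),
    ("HTTP", b"HTTP/1."),
]

# Hypothetical fallback on well-known server ports.
PORT_SIGNATURES = {80: "HTTP", 22: "SSH", 25: "SMTP"}


def classify_flow(first_payload: bytes, dst_port: int) -> Optional[str]:
    """Return an application label, or None when the flow stays unknown."""
    for app, prefix in PAYLOAD_SIGNATURES:
        if first_payload.startswith(prefix):
            return app
    # No payload signature matched: fall back to the port number.
    return PORT_SIGNATURES.get(dst_port)
```

Flows for which `classify_flow` returns `None` correspond to the "unknown" flows that the talk hands over to the decision tree classifier.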

8 Decision Tree Based Classifier A general result is that about 40% of flows cannot be classified by the current payload-signature-based classification method. We extend the n-gram frequency into the temporal domain and generate a set of 256-dimensional vectors representing the temporal-frequent characteristics of the 256 ASCII byte values in the payload over a predefined time interval. The n-gram (n = 1 in particular) is computed over a one-second time interval for both the source flow payload and the destination flow payload.
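
A minimal sketch of the 1-gram feature, assuming each flow direction is available as a list of `(timestamp, payload)` pairs. The function name `temporal_byte_frequencies` is hypothetical, and normalizing counts to relative frequencies is an assumption, since the slides do not say whether raw counts or ratios are used.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def temporal_byte_frequencies(
    packets: Iterable[Tuple[float, bytes]], window: float = 1.0
) -> Dict[int, List[float]]:
    """Map each time-window index to a 256-dimensional byte-frequency vector."""
    counts: Dict[int, List[int]] = defaultdict(lambda: [0] * 256)
    for timestamp, payload in packets:
        bucket = counts[int(timestamp // window)]  # one bucket per window
        for byte in payload:                       # 1-gram: single byte values
            bucket[byte] += 1
    vectors: Dict[int, List[float]] = {}
    for index, bucket in counts.items():
        total = sum(bucket)
        # Assumption: relative frequencies, so each non-empty vector sums to 1.
        vectors[index] = [c / total if total else 0.0 for c in bucket]
    return vectors
```

Calling it once on the source-direction packets and once on the destination-direction packets gives the two feature sets the slide mentions.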

9 Figures: temporal-frequent metric for source flow payload of the LimeWire application; temporal-frequent metric for source flow payload of the BitTorrent application.

10 Figures: temporal-frequent metric for source flow payload of the HTTPWeb application; temporal-frequent metric for source flow payload of the SecureWeb application.

11 Profiling Applications We denote the 256-dimensional n-gram byte distribution as a vector, each component of which is the frequency of one ASCII character on the flow payload over a time window. Given n historical known flows for each specific application, we define an n × 256 matrix for profiling applications, with one row per known flow.
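
One way to write the vector and matrix described above is shown below. The symbols \mathbf{f}, t, and A are chosen here for illustration and may differ from the paper's notation.

```latex
\mathbf{f}^{(t)} = \left( f^{(t)}_{0},\, f^{(t)}_{1},\, \dots,\, f^{(t)}_{255} \right),
\qquad
A =
\begin{pmatrix}
f_{1,0} & f_{1,1} & \cdots & f_{1,255} \\
f_{2,0} & f_{2,1} & \cdots & f_{2,255} \\
\vdots  & \vdots  & \ddots & \vdots    \\
f_{n,0} & f_{n,1} & \cdots & f_{n,255}
\end{pmatrix}
\in \mathbb{R}^{n \times 256}
```

Here f^{(t)}_{i} is the frequency of byte value i in the payload during time window t, and row j of A is the frequency vector of the j-th known flow of the application.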

12 A Typical Decision Tree
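
To make the decision tree step concrete, here is a small sketch of a tree learner over frequency vectors. It uses Gini impurity with threshold splits (CART-style), which is an assumption: the slides do not name the tree algorithm. The names `gini`, `build_tree`, and `predict` are hypothetical, and the exhaustive split search is meant for toy inputs rather than an 11000 × 256 training matrix.

```python
from collections import Counter
from typing import List, Sequence, Union

# A tree is either a leaf label or a (feature, threshold, left, right) tuple.
Tree = Union[str, tuple]


def gini(labels: Sequence[str]) -> float:
    """Gini impurity of a non-empty list of labels."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def build_tree(rows: List[Sequence[float]], labels: List[str],
               depth: int = 0, max_depth: int = 5) -> Tree:
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or len(set(labels)) == 1:
        return majority
    best = None  # (weighted impurity, feature index, threshold)
    for feature in range(len(rows[0])):
        # Skipping the largest value keeps both sides of every split non-empty.
        for threshold in sorted({row[feature] for row in rows})[:-1]:
            left = [l for r, l in zip(rows, labels) if r[feature] <= threshold]
            right = [l for r, l in zip(rows, labels) if r[feature] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, feature, threshold)
    if best is None:  # all rows identical, so no split is possible
        return majority
    _, feature, threshold = best
    left_idx = [i for i, r in enumerate(rows) if r[feature] <= threshold]
    right_idx = [i for i, r in enumerate(rows) if r[feature] > threshold]
    return (
        feature,
        threshold,
        build_tree([rows[i] for i in left_idx], [labels[i] for i in left_idx],
                   depth + 1, max_depth),
        build_tree([rows[i] for i in right_idx], [labels[i] for i in right_idx],
                   depth + 1, max_depth),
    )


def predict(tree: Tree, row: Sequence[float]) -> str:
    """Walk from the root to a leaf and return its label."""
    while not isinstance(tree, str):
        feature, threshold, left, right = tree
        tree = left if row[feature] <= threshold else right
    return tree
```

Each internal node tests one byte-frequency feature against a threshold, so a learned tree can be read as a chain of "frequency of byte i is at most x" rules ending in an application label.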

13 Botnet Detection Botnet behavior:
◦ Response time.
◦ Synchronized.

14 Botnet Detection Approach Given a set of N data objects:
◦ Initialization: each cluster contains only one data instance.
◦ Repeat: find the closest pair of clusters and merge them into a single cluster.
◦ Until: the number of clusters = 2.
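
The loop above is agglomerative hierarchical clustering stopped at two clusters. A minimal sketch follows, assuming Euclidean distance between cluster centroids as the "closest pair" criterion; the slides do not state the distance measure, and the names `centroid` and `agglomerate` are hypothetical.

```python
import math
from itertools import combinations
from typing import List, Sequence

Point = Sequence[float]


def centroid(cluster: List[Point]) -> List[float]:
    """Component-wise mean of the points in a cluster."""
    return [sum(values) / len(cluster) for values in zip(*cluster)]


def agglomerate(points: List[Point], target: int = 2) -> List[List[Point]]:
    # Initialization: each cluster contains only one data instance.
    clusters = [[p] for p in points]
    # Repeat until the requested number of clusters remains.
    while len(clusters) > target:
        # Find the closest pair of clusters (assumed: centroid distance).
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: math.dist(
                centroid(clusters[pair[0]]), centroid(clusters[pair[1]])
            ),
        )
        # Merge them into a single cluster; a < b, so deleting b is safe.
        clusters[a].extend(clusters[b])
        del clusters[b]
    return clusters
```

With `target=2`, the flows of one application community end up in two groups; how the botnet group is then told apart from the normal group is not covered by this sketch.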

15 Experimental evaluation The botnet traffic was collected on a honeypot deployed on a real network and aggregated into 243 flows. Traffic traces collected over 2 days are used for training, and the real-time traffic flows collected on the 3rd day are used for testing. The size of the input data for training the decision tree is 11000 × 256. The training data covers 11 typical applications belonging to 8 typical application groups.

16 Applications in training dataset

17 Distribution of "unknown" application flows More than 90,000 flows were collected over the testing day and identified as unknown.

18 Source Flow Based Decision Tree Classifier Total number of flows correctly identified: 82983 (89.4%)

19 Destination Flow Based Decision Tree Classifier Total number of flows correctly identified: 85995 (92.6%)

20 IRC Application Communities

21 Conclusions Unknown applications on the current network are first classified into different application communities. Then, focusing on each application community, a temporal-frequent characteristic of the flows is used to differentiate malicious botnet behavior from normal application traffic. Open question: how can the approach be evaluated on the P2P community, and how does it perform on P2P-based botnets?

