Byung-Joon Lee and Youngseok Lee

1 An Automatic Signature-generating Method for Web-based P2P Applications
Byung-Joon Lee and Youngseok Lee, Dept. of Computer Science and Engineering, Chungnam National University, 220 Gungdong, Yusonggu, Daejon, Korea. GlobeCom '06 (under review)

2 Outline: Introduction; Related Work; Defining Web-Based P2P Application; Architecture of GENESIS; Experiment; Conclusion. 2019/5/3

3 Introduction In enterprise or campus networks, the well-known TCP/UDP port numbers used by P2P applications are blocked to keep P2P traffic from consuming the whole bandwidth of the network's main link; the aim is to protect bandwidth, not to quarantine P2P traffic proactively. P2P applications are evolving, either by employing a firewall-avoiding method called port hopping or by changing into brand-new types of P2P applications.

4 Introduction (cont.) Finding signatures for new P2P applications is difficult and time-consuming. An automatic signature-generating algorithm is essential for effective signature-based traffic classification.

5 [Figure; only the label "suspected" is recoverable.]

6 Introduction (cont.) We propose an automatic signature-generating method, called GENESIS (System for GENErating SIgnatureS), for finding popular web-based P2P applications. The method captures P2P-suspected traffic flows from the raw traffic dump, categorizes them, and extracts signatures automatically from the categorized flows.

7 Related Work Most systems designed to find signatures automatically target Internet worms. They follow some common procedures: (1) collect flows suspected of being generated by Internet worms; (2) split payloads into multiple blocks of variable size; (3) evaluate the 'prevalence' of those blocks. Procedure (2) is needed because the signature may appear at variable locations in the payload if a worm is polymorphic [3], meaning that the worm can encode and re-encode itself into successive, different byte strings.

8 Related Work (cont.) (4) mark blocks with high prevalence as candidate signatures; (5) apply the address dispersion criteria. Most of these procedures can be applied in a similar way to P2P applications, but P2P applications behave differently from Internet worms.

9 Related Work (cont.) P2P applications follow different communication patterns and show no polymorphic behavior. Procedure (1) must therefore be substantially modified, and procedure (2) can be omitted.

10 Defining Web-Based P2P Application

11 Defining Web-Based P2P Application (cont.)
Four rules are defined to classify flows into four different categories (WC, WS, DC, DS).

12 Defining Web-Based P2P Application (cont.)

13 Architecture of GENESIS
GENESIS is part of the Wise<TrafView> system, which provides a content-aware traffic monitoring function. The Wise<TrafView> Capturing Agent collects packets from network interface cards, assembles them into flow records, and saves the records to files. GENESIS inspects the flow files, identifies and saves P2P-suspected flows, and extracts P2P signatures.

14 Architecture of GENESIS (cont.)
GENESIS consists of two parts. FlowFinder: finds P2P-suspected flows and saves them in different files according to the classification rules. SignatureFinder: carries out signature extraction from each saved file.

15 Architecture of GENESIS (cont.)
A. FlowFinder and flow classification process. Flows found using the rules listed in Table I are saved under a directory named after the IP address of the P2P portal. The saved files have the extension *.genesis, and their names are determined by the category of the saved flows (WC, WS, DC, DS).
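The directory layout described on this slide can be sketched as a small path helper. This is only an illustration: the function name `genesis_path` and the `base_dir` parameter are hypothetical, and the lower-case file names follow the `wc.genesis` style shown in the architecture figure.

```python
from pathlib import PurePosixPath

# The four flow categories named on the slides.
CATEGORIES = ("WC", "WS", "DC", "DS")


def genesis_path(base_dir: str, portal_ip: str, category: str) -> PurePosixPath:
    """Return the file a flow of `category` for `portal_ip` would be saved in.

    Hypothetical helper: one directory per P2P portal IP address,
    one <category>.genesis file per flow category.
    """
    if category not in CATEGORIES:
        raise ValueError(f"unknown flow category: {category!r}")
    return PurePosixPath(base_dir) / portal_ip / f"{category.lower()}.genesis"
```

For example, `genesis_path("out", "10.0.0.1", "WC")` yields the path `out/10.0.0.1/wc.genesis`.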

16 Architecture of GENESIS (cont.)
B. SignatureFinder and signature extraction process. SignatureFinder operates in two phases on each *.genesis file: (1) it records statistics about each byte of the payloads for all packets in the flow files; (2) it generates signatures from those statistics.

17 [Figure: GENESIS data flow. Wise<TrafView> flow files from the campus network are read by FlowFinder (rules), which writes one directory per P2P portal IP address (Directory1(IP1), Directory2(IP2), ..., DirectoryN(IPN)). Each directory holds ds.genesis, wc.genesis, ws.genesis, ...; each file holds flows 1 to n, each flow holds packets 1 to n, and SignatureFinder reads each packet's payload up to byte 63.]

18 Architecture of GENESIS (cont.)
Two assumptions are made: (1) inspecting the first B (default = 64) bytes of the payload is enough; (2) inspecting the first P (default = 10) packets of a flow is enough.

19 Architecture of GENESIS (cont.)
Verifying the first assumption (first B bytes): the frequency of each 1-byte integer value was examined in every packet payload, excluding TCP/IP header fields. A 2-dimensional array was built in which element (i,j) is the frequency of the 1-byte integer value j (0≤j≤255) at the i-th byte position of the payloads.

20 Architecture of GENESIS (cont.)
Using the frequency table, the maximum, minimum, and average frequencies of the 1-byte integer values were calculated for each payload position.
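The frequency table and the per-position statistics can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names are hypothetical, payloads are assumed to be `bytes` objects with TCP/IP headers already stripped, and the average is assumed to be taken over all 256 possible byte values.

```python
B = 64  # default number of inspected payload bytes, as stated on the slides


def build_frequency_table(payloads, num_bytes=B):
    """Return freq where freq[i][j] counts payloads whose i-th byte equals j."""
    freq = [[0] * 256 for _ in range(num_bytes)]
    for payload in payloads:
        # Iterating over bytes yields integers in the range 0..255.
        for i, value in enumerate(payload[:num_bytes]):
            freq[i][value] += 1
    return freq


def position_stats(freq):
    """Return a (max, min, avg) tuple of the 256 frequencies per byte position."""
    return [(max(row), min(row), sum(row) / len(row)) for row in freq]
```

With three payloads that all start with `GET`, position 0 has a maximum frequency of 3 (for the value of `G`) and a minimum of 0, which is the kind of skew a signature byte is expected to produce.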

21 Architecture of GENESIS (cont.)
The probability of signature existence at byte position i is examined with the following function, PSE(i): PSE(i) = (max(i) - avg(i)) / max(i) - avg(i) / max(i)   (1). The PSE(i) value lies within the range [-1,1]. Here max(i), min(i), and avg(i) are the maximum, minimum, and average of the frequencies of the 1-byte integer values at the i-th byte position of the payloads.
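A minimal sketch of equation (1) as transcribed on this slide is given below. The function name `pse` is hypothetical, the average is assumed to be taken over all 256 byte values, and returning 0.0 for a position with no data is an assumption added only to avoid dividing by zero.

```python
def pse(frequencies):
    """Evaluate equation (1) for the byte-value frequencies at one position."""
    peak = max(frequencies)
    if peak == 0:
        # Assumption: no payload reached this position, so report a neutral 0.0.
        return 0.0
    avg = sum(frequencies) / len(frequencies)
    return (peak - avg) / peak - avg / peak
```

When one byte value dominates a position, avg(i) is small relative to max(i) and the result approaches 1; when all values are equally frequent, avg(i) equals max(i) and the result is -1.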

22 Architecture of GENESIS (cont.)

23 Architecture of GENESIS (cont.)
Verifying the second assumption (first P packets): the number of packets per data flow was averaged.

24 Architecture of GENESIS (cont.)
Algorithm 1: Phase 1 of SignatureFinder. Annotations: the total number of packets whose i-th byte has the value j; the size of the flows whose i-th byte has the value j.
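The algorithm listing itself is not reproduced in this transcript, so the following is only one reading that is consistent with the two annotations and with the table names packet_counter, tmp_matrix, and flow_byte_counters used on the following slides. The function name `phase1`, the input shape (a flow size paired with a list of payloads), and the use of a per-flow set in place of tmp_matrix are all assumptions.

```python
def phase1(flows, max_packets=10, max_bytes=64):
    """Accumulate per-(position, value) statistics over all flows.

    flows: iterable of (flow_size_in_bytes, [payload_bytes, ...]) pairs.
    Returns (packet_counter, flow_byte_counters, total_bytes).
    """
    packet_counter = [[0] * 256 for _ in range(max_bytes)]
    flow_byte_counters = [[0] * 256 for _ in range(max_bytes)]
    total_bytes = 0
    for flow_size, payloads in flows:
        total_bytes += flow_size
        # Assumed role of tmp_matrix: remember which (i, j) cells this flow hit,
        # so the flow's size is added at most once per cell.
        seen = set()
        for payload in payloads[:max_packets]:
            for i, j in enumerate(payload[:max_bytes]):
                packet_counter[i][j] += 1
                seen.add((i, j))
        for i, j in seen:
            flow_byte_counters[i][j] += flow_size
    return packet_counter, flow_byte_counters, total_bytes
```

Counting a flow's size once per cell, rather than once per packet, keeps flow_byte_counters comparable with the total traffic volume.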

25 [Figure: processing packet 1. Two tables, packet_counter and tmp_matrix, are indexed by byte value (0x00 to 0xFF) and payload byte position (up to 63); the cells hit by the packet's bytes are updated.]

26 [Figure: processing packet 2. The packet_counter and tmp_matrix tables after the second packet.]

27 [Figure: processing packet 10. The packet_counter and tmp_matrix tables after the tenth packet.]

28 [Figure: tmp_matrix and flow_byte_counters, both indexed by byte value (0x00 to 0xFF) and payload byte position (up to 63); the example flow_byte_counters cells hold the value 1000.]

29 Architecture of GENESIS (cont.)
Gray area: the cells (i, j) for which the number of packets whose i-th byte value is j is greater than 10, and for which the flows containing such packets account for more than 90% of the total traffic volume in bytes.
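The gray-area criterion can be sketched as a thresholding step over the two counter tables. The function name and parameter names are hypothetical; the thresholds (more than 10 packets, more than 90% of the total volume) come from the slide, and the tables are assumed to be indexed as [byte position][byte value].

```python
def mark_signature_cells(packet_counter, flow_byte_counters, total_bytes,
                         min_packets=10, min_volume_ratio=0.9):
    """Return a 0/1 matrix marking cells that satisfy both thresholds."""
    num_positions = len(packet_counter)
    sigmatrix = [[0] * 256 for _ in range(num_positions)]
    for i in range(num_positions):
        for j in range(256):
            enough_packets = packet_counter[i][j] > min_packets
            enough_volume = flow_byte_counters[i][j] > min_volume_ratio * total_bytes
            if enough_packets and enough_volume:
                sigmatrix[i][j] = 1
    return sigmatrix
```

Both comparisons are strict, matching the slide's "greater than 10" and "more than 90%".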

30 Architecture of GENESIS (cont.)
SignatureFinder collects every j where sigmatrix[i][j] is nonzero (0≤i≤63, and j is the integer value of the i-th byte of the payload). The sequence of the collected j's is recorded as a signature. When a new signature is generated, it is added to the signature list.
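The collection step can be sketched as follows. The function names are hypothetical, and two details are assumptions: the collected values are returned as a `bytes` object in position order (byte offsets are not kept, following the slide's wording), and a signature is only added to the list if it is non-empty and not already present.

```python
def extract_signature(sigmatrix):
    """Collect every byte value j with a nonzero sigmatrix[i][j], ordered by i."""
    return bytes(
        j
        for row in sigmatrix          # rows are byte positions i
        for j, flag in enumerate(row)  # columns are byte values j
        if flag
    )


def add_signature(signature_list, signature):
    """Append a new, non-empty signature; return True if it was added."""
    if signature and signature not in signature_list:
        signature_list.append(signature)
        return True
    return False
```

If positions 0, 1, and 2 are marked for the values 0x47, 0x45, and 0x54, the extracted signature is the byte string `GET`.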

31 Experiment Experiments were run with the GENESIS system on the main link of the CNU campus network. This link is 1 Gigabit Ethernet, and its peak bandwidth usage is around 200 Mbps. The experiments used captured 12-hour traffic traces (11/Nov/ :00 ~ 12/Nov/ :00).

32 Experiment (cont.)

33 Experiment (cont.)

34 Experiment (cont.)

35 Experiment (cont.)

36 Conclusion Because P2P applications hide their ports, signature-based traffic monitoring has been useful and practical for assessing P2P applications. Keeping signatures up to date for many P2P applications is difficult and time-consuming. We propose a method that finds signatures for P2P application traffic automatically. In future work, this method could be extended to brand-new P2P applications.

