Classification of Applications in HTTP Tunnels. By Gajen Piraisoody, Changcheng Huang, Biswajit Nandy, Nabil Seddigh. Electrical and Computer Engineering.


1 Classification of Applications in HTTP Tunnels. By Gajen Piraisoody, Changcheng Huang, Biswajit Nandy, Nabil Seddigh. Electrical and Computer Engineering, Carleton University, Ottawa, ON, Canada. 12 November 2013

2 Outline: Overview; Motivation; Problem Statement; Contribution; Approach to Classification; Evaluation; Conclusion

3 Overview – HTTP Tunnel
What is HTTP tunnelled traffic?
- The HTTP port is used to carry web traffic
- Non-HTTP applications are wrapped in the HTTP protocol
- The HTTP port now tunnels chat, video, image, audio, file-transfer and peer-to-peer traffic
Why tunnel non-HTTP applications over HTTP?
- HTTP clients (browsers) are readily available and deployable
- Tunneling lets applications bypass restricted network connectivity in the form of firewalls, proxies and NAT

4 Motivation
HTTP traffic classification
- HTTP accounts for about 80% of the traffic in an entire network
- HTTP tunneled traffic is not identifiable by ports alone
- Tunneled traffic such as YouTube and Netflix is increasing in cloud networks
- Information on tunneled traffic helps cloud-centre management with planning, provisioning and ensuring quality of service
Why flow-based rather than DPI (deep packet inspection) classification?
- Provides a scalable software solution (less CPU consumption)
- Can classify encrypted data

5 Problem Statement
- Given network traffic measured with NetFlow
- Find a way to classify HTTP tunnelled traffic: audio (radio & music), video and file-transfer
- No training dataset is needed for the proposed algorithm
- Use only the information available from NetFlow

6 Contribution
- The proposed scheme classifies HTTP tunneled traffic: audio (radio & music), video and file-transfer
- The proposed scheme helps audio classification by using an occupancy feature
- The proposed scheme enhances classification performance by including a flow-group found using flows from content servers (subnet-masked IP of the long-flow)

7 Approach in detail
- Identify long-flow HTTP traffic. Parameter: BPF
- Classify radio traffic. Parameters: BPF, BPP, BPS, Occupancy
- Classify music traffic. Parameters: BPF, BPP, BPS, Occupancy
- Classify video traffic. Parameters: BPF, BPP, BPS, Flow-group
- Classify file-transfer traffic. Parameters: BPF, BPP, BPS, Flow-group
Abbreviations: bytes-per-second (BPS), bytes-per-flow (BPF), bytes-per-packet (BPP)
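
The three flow attributes named on this slide can be derived from a single flow record. The following is a minimal sketch, not the authors' code: the `Flow` record and its field names are hypothetical stand-ins for whatever a NetFlow collector exports.

```python
from dataclasses import dataclass


@dataclass
class Flow:
    """Hypothetical NetFlow-style record; the field names are assumptions."""
    start: float       # flow start time, in seconds
    end: float         # flow end time, in seconds
    packets: int       # packets counted in the flow
    total_bytes: int   # bytes counted in the flow


def bytes_per_flow(flow: Flow) -> int:
    """BPF: total bytes carried by the flow."""
    return flow.total_bytes


def bytes_per_packet(flow: Flow) -> float:
    """BPP: average packet size; 0.0 for a flow with no packets."""
    return flow.total_bytes / flow.packets if flow.packets > 0 else 0.0


def bytes_per_second(flow: Flow) -> float:
    """BPS: average byte rate; 0.0 for a flow with no measurable duration."""
    duration = flow.end - flow.start
    return flow.total_bytes / duration if duration > 0 else 0.0
```

Returning 0.0 for degenerate flows is a design choice of this sketch so that later threshold checks simply fail for them.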

8 Approach to Classification: Identify Long-flow HTTP Traffic; Classify Audio Traffic; Classify Video & File-transfer Traffic

9 Identify Long-flow HTTP Traffic
- Identifying HTTP traffic: HTTP_PORTS = 80, 443, 1935, 8008, 8080, 8088, 8090
- A long-flow has a byte size larger than a threshold
- Audio, video and file-transfer are generally long-flows
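
As an illustration of this step, the filter can be sketched as a port check plus a byte threshold. The port list comes from the slide; the threshold value and the choice to accept a match on either the source or the destination port are assumptions of this sketch, since the slide does not state them.

```python
# Port list taken from the slide.
HTTP_PORTS = {80, 443, 1935, 8008, 8080, 8088, 8090}

# Assumed threshold: the slide only says "larger than a threshold".
LONG_FLOW_MIN_BYTES = 500_000


def is_long_http_flow(src_port, dst_port, total_bytes,
                      min_bytes=LONG_FLOW_MIN_BYTES):
    """Return True when the flow uses an HTTP port and exceeds the byte threshold."""
    # Assumption: either endpoint may be the HTTP side of the flow.
    on_http_port = src_port in HTTP_PORTS or dst_port in HTTP_PORTS
    return on_http_port and total_bytes > min_bytes
```

For example, `is_long_http_flow(80, 51000, 600_000)` passes both checks, while the same byte count on port 22 does not.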

10 Approach: Identify Long-flow HTTP Traffic; Classify Audio Traffic; Classify Video & File-transfer Traffic

11 Classify Audio Traffic
- 99.4% of radio rates are between 20 and 320 Kbps (statistics from 3683 online radio web sites)
- 98% of online music rates are between 64 and 320 Kbps (statistics from >20 online music sites)
- The 95% confidence interval of radio bytes-per-packet is 900 to 1470 (Samruay et al. [1])
- The 95% confidence interval of music bytes-per-packet is 1260 to 1500 (Samruay et al. [1])
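
One way to read these ranges is as a pre-filter that marks a flow as a radio and/or music candidate. The sketch below makes two assumptions that the slide does not confirm: the ranges are applied directly to the observed average rate of the flow, and Kbps means 1000 bits per second. The function names are hypothetical.

```python
# Ranges quoted on the slide.
RADIO_KBPS = (20, 320)
MUSIC_KBPS = (64, 320)
RADIO_BPP = (900, 1470)
MUSIC_BPP = (1260, 1500)


def rate_kbps(total_bytes, duration_s):
    """Average flow rate in kilobits per second (assuming 1 Kbps = 1000 bit/s)."""
    return total_bytes * 8 / 1000 / duration_s


def in_range(value, bounds):
    low, high = bounds
    return low <= value <= high


def audio_candidates(total_bytes, packets, duration_s):
    """Return the audio labels whose rate and bytes-per-packet ranges the flow fits."""
    if packets <= 0 or duration_s <= 0:
        return set()
    kbps = rate_kbps(total_bytes, duration_s)
    bpp = total_bytes / packets
    labels = set()
    if in_range(kbps, RADIO_KBPS) and in_range(bpp, RADIO_BPP):
        labels.add("radio")
    if in_range(kbps, MUSIC_KBPS) and in_range(bpp, MUSIC_BPP):
        labels.add("music")
    return labels
```

Because the two bytes-per-packet intervals overlap between 1260 and 1470, a flow can be a candidate for both labels, which is why a further feature is needed to separate radio from music.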

12 Classify Audio Traffic
- Behavioral analysis: an online audio listener typically listens for more than 5 minutes
- There are two distinct audio types: radio and music (songs)
- New concept: occupancy helps classify audio. Occupancy is the ratio of the flow duration to the entire duration of a chunk of time.
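
The occupancy ratio can be sketched as the fraction of a time window covered by at least one flow. This is one possible interpretation of the definition on the slide, not the authors' implementation: merging overlapping flows so that no second is counted twice is an assumption, and the function name is hypothetical.

```python
def occupancy(intervals, window_start, window_end):
    """Fraction of [window_start, window_end] covered by the given (start, end) flows."""
    window = window_end - window_start
    if window <= 0:
        return 0.0

    # Clip every flow to the window and drop flows that fall outside it.
    clipped = sorted(
        (max(start, window_start), min(end, window_end))
        for start, end in intervals
        if end > window_start and start < window_end
    )

    covered = 0.0
    current_start = current_end = None
    for start, end in clipped:
        if current_end is None or start > current_end:
            # A gap: close the previous merged interval and open a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent flow: extend the merged interval.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start

    return covered / window
```

With back-to-back flows such as `[(0, 120), (120, 240), (240, 300)]` over a 300-second window the result is 1.0, while two short bursts in the same window give a much lower value.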

13 Classify Audio Traffic
Difference between radio and music
- Continuous: radio content appears to download during every second of the flow
- Dirac: songs in a playlist are downloaded and played one at a time
- The max/min size of a radio flow depends on the maximum flow-period configuration and the offered radio rates
- The max/min size of a music flow depends on the max/min song duration and the offered online music rates
- The 95% confidence interval of radio occupancy from DS-1, DS-2, SME-6, SME-7 and SME-8 is [82%, 100%]
- The 95% confidence interval of music occupancy from DS-1, DS-2, SME-6, SME-7 and SME-8 is [0%, 55%]
Assumptions
- The minimum number of radio-flows is two (at least 5 minutes)
- The minimum number of music-flows is two (at least 5 minutes)
- The maximum radio-phase timeout is based on a flow-period (120 seconds)
- The maximum music-phase timeout is based on the maximum song duration (382 seconds)
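
A minimal sketch of how the occupancy intervals and the two-flow minimum from this slide could separate radio from music. Using the confidence intervals as hard decision boundaries and returning "unknown" for anything in between are assumptions of this sketch; the function name is hypothetical.

```python
# 95% confidence intervals quoted on the slide, as fractions.
RADIO_OCCUPANCY = (0.82, 1.00)
MUSIC_OCCUPANCY = (0.00, 0.55)

# Both audio types are assumed to need at least two flows.
MIN_FLOWS = 2


def classify_audio_phase(occupancy_ratio, flow_count):
    """Label a group of audio flows as 'radio', 'music' or 'unknown'."""
    if flow_count < MIN_FLOWS:
        return "unknown"
    if RADIO_OCCUPANCY[0] <= occupancy_ratio <= RADIO_OCCUPANCY[1]:
        return "radio"
    if MUSIC_OCCUPANCY[0] <= occupancy_ratio <= MUSIC_OCCUPANCY[1]:
        return "music"
    # Occupancy between 55% and 82% falls in neither interval.
    return "unknown"
```

The gap between the two intervals is left unlabeled here rather than forced into one class.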

14 Approach: Identify Long-flow HTTP Traffic; Classify Audio Traffic; Classify Video & File-transfer Traffic

15 Background: Multimedia Distribution (3 types) – CDNs
[Figure: message flow between the Client (Web Browser, Media Player), the Server (Listening HTTP Server), the Authoritative DNS Server and CDN_1 ... CDN_n]
1) Client clicks on audio/video hyperlink
2) Metafile sent to client
3) Metafile
4) Request multimedia content
5) Responds with CDN site
6) From DNS lookup, request sent to CDN admin
7) Responds with address of all contents on all CDNs
8) Request multimedia content 1
9) Request multimedia content 2
10) Content 1
11) Content 2

16 Classify Video & File-transfer Traffic
- Video flow-attributes (bytes-per-packet, bytes-per-flow, download rates) and the flow-group (FG) technique are used to classify video and file-transfers
Flow-group (FG)
- A video flow is associated with meta-data, style sheets and advertisements
- Kei et al. [3] defined FG as the number of flows that occur within a few seconds of the video-flow with the same destination-IP address
- Our expanded flow-group also includes flows that occur within a longer duration and have the same subnet-masked source-IP address and the same destination-IP address

17 An Example

18 Example (cont'd)

19 Classify Video & File-transfer Traffic

20 Classify Video & File-transfer Traffic
Start
1. Gather potential V/F flows: flow > 0.5 MB & > 1260 bytes-per-pkt & > 128 Kbps; order by destination-IP and flow start time
2. Original flow-group (green): for every potential V/F flow, gather potential flow-group (FG) flows when:
   FG flow > V/F start-time – 4 & FG flow < V/F start-time + 1 & FG flow and V/F have the same dest-IP & FG flow between 1000 B and 0.5 MB & FG flow between 200 and 1500 BPP
   If FG == true: inc FG counter
3. Improvised flow-group (yellow): for the V/F-phase, gather potential FG flows with:
   same source IP address-subnet & same destination IP address & FG flow > V/F start-time – 60 & FG flow < V/F start-time + 10 & FG flow between 1000 B and 0.5 MB & FG flow between 200 and 1500 BPP
   If FG == true: inc FG counter
4. If FG > 0: label video; else: label file-transfer
End
Both flow-groups are run.
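
The flowchart on this slide can be sketched in a few functions. This is a hedged illustration under stated assumptions, not the authors' implementation: flows are plain dictionaries with hypothetical keys, the time offsets are taken to be seconds, 0.5 MB is taken as 500,000 bytes, the subnet mask is assumed to be /24, and a flow matching both flow-group rules is counted once (the label only depends on the counter being above zero).

```python
import ipaddress

HALF_MB = 500_000      # assumed decimal megabytes
SUBNET_PREFIX = 24     # assumed mask; the slide does not give one


def is_vf_candidate(flow):
    """Potential video/file-transfer flow: > 0.5 MB, > 1260 BPP, > 128 Kbps."""
    duration = flow["end"] - flow["start"]
    if duration <= 0 or flow["packets"] <= 0:
        return False
    kbps = flow["bytes"] * 8 / 1000 / duration
    bpp = flow["bytes"] / flow["packets"]
    return flow["bytes"] > HALF_MB and bpp > 1260 and kbps > 128


def is_fg_sized(flow):
    """Flow-group member size limits: 1000 B to 0.5 MB and 200 to 1500 BPP."""
    if flow["packets"] <= 0:
        return False
    bpp = flow["bytes"] / flow["packets"]
    return 1000 <= flow["bytes"] <= HALF_MB and 200 <= bpp <= 1500


def same_subnet(ip_a, ip_b, prefix=SUBNET_PREFIX):
    network = ipaddress.ip_network(f"{ip_a}/{prefix}", strict=False)
    return ipaddress.ip_address(ip_b) in network


def count_flow_group(vf, flows):
    """Count flows that satisfy the original or the expanded flow-group rule."""
    count = 0
    for flow in flows:
        if flow is vf or flow["dst_ip"] != vf["dst_ip"] or not is_fg_sized(flow):
            continue
        delta = flow["start"] - vf["start"]   # assumed to be in seconds
        original = -4 < delta < 1
        expanded = -60 < delta < 10 and same_subnet(vf["src_ip"], flow["src_ip"])
        if original or expanded:
            count += 1
    return count


def classify_vf(flows):
    """Map each candidate flow id to 'video' or 'file-transfer'."""
    candidates = sorted(
        (f for f in flows if is_vf_candidate(f)),
        key=lambda f: (f["dst_ip"], f["start"]),
    )
    return {
        vf["id"]: "video" if count_flow_group(vf, flows) > 0 else "file-transfer"
        for vf in candidates
    }
```

A large flow that arrives together with a small companion flow to the same client (a style sheet, for instance) is labeled video; a large flow with no companions is labeled file-transfer.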

21 Evaluation
- Datasets used to test the algorithms
- Accuracy measurement assessment
  - Precision is the system's correct predictions against all predicted values: precision = TP / (TP + FP)
  - Recall is the system's correct predictions against all actual correct values: recall = TP / (TP + FN)
  - F-measure is the harmonic mean of recall and precision: F-measure = 2 * Precision * Recall / (Precision + Recall)
  - Accuracy is the proportion of true results: accuracy = (TP + TN) / (TP + FP + FN + TN)
- Compare against other algorithms: Naive Bayes and SVM (Support Vector Machine)
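
The four measures on this slide translate directly into code. A small sketch follows; the function names are illustrative, and returning 0.0 when a denominator is zero is a convention chosen for this sketch rather than something stated in the slides.

```python
def precision(tp, fp):
    """Correct positive predictions over all positive predictions."""
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0


def recall(tp, fn):
    """Correct positive predictions over all actual positives."""
    return tp / (tp + fn) if (tp + fn) > 0 else 0.0


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    p = precision(tp, fp)
    r = recall(tp, fn)
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0


def accuracy(tp, tn, fp, fn):
    """True results (TP + TN) over all results."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total > 0 else 0.0
```

For instance, with 8 true positives, 2 false positives, 2 false negatives and 88 true negatives, precision and recall are both 0.8 and accuracy is 0.96.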

22 Evaluation – Datasets
                      SME-6       SME-7       SME-8
Date                  1/7/2013    1/22/2013   1/23/2013
Duration (s)
Start-time (GMT-5)    10:18:04    10:29:04    10:56:20
Flows
Packets
Bytes
HTTP Flows
HTTP Packets
HTTP Bytes

23 Evaluation – Results

24 Evaluation – Results

25 Conclusion
- The proposed algorithm uses a flow-based approach and classifies a high percentage of tunneled traffic: audio, video and file-transfer
- Proposed audio algorithm: uses a concept called occupancy to classify radio and music traffic
- Proposed video & file-transfer algorithm: uses an improvised flow-group method to help increase the classification accuracy of video and file-transfer traffic
- The proposed scheme's F-measure is at least 10% higher than that of Naive Bayes and SVM

26 References
[1] Samruay Kaoprakhon, Vasaka Visoottiviseth, "Classification of Audio and Video Traffic over HTTP Protocol," in Communications and Information Technology, ISCIT th International Symposium on, Sept 2009
[2] M. Twardos, "The Information Diet," [Online]. Available: http://theinformationdiet.blogspot.ca/2011/11/probability-distribution-of-song-length.html [Accessed 2013]
[3] K. Takeshita, T. Kurosawa, M. Tsujino and M. Iwashita, "Evaluation of HTTP Video Classification Method Using Flow Group Information," in Telecommunications Network Strategy and Planning Symposium (NETWORKS), th International, Sept
[4] H. Kim, K. Claffy, M. Fomenkov, D. Barman, M. Faloutsos, K. Lee, "Internet Traffic Classification Demystified: Myths, Caveats, and the Best Practices," in ACM, 2008
[5] D. M. W. Powers, "Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness & Correlation," in Journal of Machine Learning Technologies, Volume 2, Issue 1, 2011, pp. 37-63

