Revealing Skype Traffic: When Randomness Plays with You


1 Revealing Skype Traffic: When Randomness Plays with You. D. Bonfiglio (1), M. Mellia (1), M. Meo (1), D. Rossi (2), P. Tofanelli (3). (1) Dipartimento di Elettronica, Politecnico di Torino; (2) ENST Télécom Paris; (3) Motorola Inc. ACM SIGCOMM 2007. Presented by Te-Yuan Huang

2 Outline: Goal; Contribution; Know More about Skype; Classifiers; Experiments; Conclusions

3 Outline: Goal; Contribution; Know More about Skype; Classifiers; Experiments; Conclusions

4 Goal: Identify Skype traffic within aggregated traffic (direct sessions, carried over either UDP or TCP). The algorithm should work in real time, be reliable, and be able to detect short flows (lasting only a few seconds).

5 Outline: Goal; Contribution; Know More about Skype; Classifiers; Experiments; Conclusions

6 Importance of Skype Traffic Identification: Of interest to network operators for network design and provisioning, traffic and performance monitoring, tariff policies, and traffic differentiation.

7 Difference from Related Work: K.T. Chen et al., "Quantifying Skype USI": identifies only UDP traffic and needs the Skype login phase to be monitored, so it fails on backbone links and fails if the Skype login procedure is modified in any way. K. Suh et al., "Characterizing and Detecting Relayed Traffic: A Case Study Using Skype": identifies only relayed Skype traffic.

8 Outline: Goal; Contribution; Know More about Skype; Classifiers; Experiments; Conclusions

9 Let's get our hands dirty: know more about Skype traffic sources. Figure: a Skype message.

10 Skype Parameters: Rate: the codec rate. Delta T: the Skype message framing time, that is, the time between two subsequent Skype messages. RF (Redundancy Factor): the number of past blocks that Skype retransmits.

11 Parameters Change with Network Conditions

12 Skype Communication Modes: End-to-End (E2E): a Skype user calls another Skype user. End-to-Out (E2O): Skype-in/Skype-out, with the PSTN involved; voice data only (no video, file transfer, or IM).

13 Skype Codecs: Codecs are selected automatically. ISAC is the preferred codec for E2E; G.729 is the preferred codec for E2O.

14 More on Skype Messages: Skype encrypts its messages. TCP: reliable transport; packets are received in the correct sequence (from the application layer's point of view), so the whole content of the message is encrypted. UDP: unreliable and possibly out of order; an application-layer header is needed to restore the correct order, and that header can only be obfuscated, so only part of the message is encrypted.

15 TCP E2E Message: All ciphered. Byte layout: bytes 1, 2, 3, ... all belong to the ciphered Frame.

16 UDP E2E Message: Identified fields. ID: 16-bit identifier, randomly selected. Fun: 5-bit field selected by the mask 0x8f, used to state the payload type: 0x02, 0x03, 0x07, 0x0f are signaling messages; 0x0d is a data message (all 4 data types). Fun is not random, but obfuscated (mixed). Frame: ciphered information. Byte layout: bytes 1-2 ID, byte 3 Fun, bytes 4 onward Frame.
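
The field layout on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `parse_udp_e2e` and the labels it returns are hypothetical, and the byte order used for the ID is arbitrary (the slide does not specify one, and the ID is random anyway).

```python
from typing import Optional, Tuple

FUN_MASK = 0x8F                            # mask quoted on the slide
SIGNALING_FUN = {0x02, 0x03, 0x07, 0x0F}   # signaling payload types from the slide
DATA_FUN = 0x0D                            # data payload type from the slide


def parse_udp_e2e(payload: bytes) -> Optional[Tuple[int, int, str]]:
    """Split a UDP payload into (ID, Fun, kind) following the slide's layout.

    Returns None when the payload is too short to hold ID, Fun and a Frame byte.
    """
    if len(payload) < 4:
        return None
    # Bytes 1-2: random 16-bit ID (big-endian is an assumption).
    msg_id = int.from_bytes(payload[0:2], "big")
    # Byte 3: the Fun field sits behind the 0x8f mask.
    fun = payload[2] & FUN_MASK
    if fun in SIGNALING_FUN:
        kind = "signaling"
    elif fun == DATA_FUN:
        kind = "data"
    else:
        kind = "unknown"
    return msg_id, fun, kind
```

For example, a third byte of 0x7d masks down to 0x0d, so the payload would be labeled as a data message; the remaining bits of that byte are treated as noise.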

17 E2O Message: Identified field. CID: 4 bytes, the Connection Identifier of the PSTN gateway; deterministic after the initial signaling. Byte layout: bytes 1-4 CID, bytes 5 onward Frame.

18 Outline: Goal; Contribution; Know More about Skype; Classifiers; Experiments; Conclusions

19 How to Identify Skype Traffic? Chi-Square Classifier (CSC): uses knowledge of the ciphering mechanism. Naïve Bayes Classifier (NBC): uses the general characteristics of VoIP traffic. Payload-Based Classifier (PBC): looks into the non-ciphered SoM; used only for UDP traffic.

20 Chi-Square Classifier (CSC): Purpose: to determine whether a portion of the message is encrypted. Rationale: given a message, if only the third byte is not random, it is probably an E2E Skype flow over UDP; if the first four bytes are deterministic and the rest are ciphered, it is probably an E2O Skype flow over UDP; if the whole message is ciphered, it is probably a Skype flow carried over TCP.

21 Chi-Square Classifier (CSC) – Cont. Chi-square distribution: observe the object's output $n_{TOT}$ times. There are $n$ possible outputs. The $i$-th output is expected to occur $E_i$ times among the $n_{TOT}$ observations and is observed to occur $O_i$ times. Then $X^2 = \sum_{i=1}^{n} (O_i - E_i)^2 / E_i$ follows a chi-square distribution with $n-1$ degrees of freedom.

22 Chi-Square Classifier (CSC) – Cont. For each flow, take the first $G$ groups of $b$ bits of each message. For each group $g$, there are $2^b$ possible outputs. If the content of the flow is random, then $E_i$ for each group is $n_{TOT} / 2^b$. Layout: the start of the message is split into groups 1, 2, 3, ..., G of $b$ bits each.

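23 Chi-Square Classifier (CSC) – Cont. Evaluate the test statistic for each group $g$ as: $X_g^2 = \sum_{i=1}^{2^b} (O_i^g - E_i)^2 / E_i$, where $O_i^g$ is the number of times value $i$ is observed in group $g$. Define the thresholds by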
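
The per-group test statistic can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper names `split_groups` and `chi_square_per_group` are hypothetical, and the defaults of 16 groups of 4 bits are taken from the parameter choice quoted later in the deck.

```python
from typing import List, Sequence


def split_groups(payload: bytes, num_groups: int = 16, bits: int = 4) -> List[int]:
    """Return the values of the first num_groups groups of `bits` bits."""
    total_bits = num_groups * bits
    nbytes = (total_bits + 7) // 8
    if len(payload) < nbytes:
        raise ValueError("payload too short for the requested groups")
    # Read the leading bytes as one integer and drop any unused trailing bits.
    value = int.from_bytes(payload[:nbytes], "big") >> (nbytes * 8 - total_bits)
    mask = (1 << bits) - 1
    return [(value >> (total_bits - bits * (g + 1))) & mask
            for g in range(num_groups)]


def chi_square_per_group(payloads: Sequence[bytes],
                         num_groups: int = 16, bits: int = 4) -> List[float]:
    """Chi-square statistic of each group against the uniform expectation."""
    if not payloads:
        raise ValueError("need at least one message")
    n_tot = len(payloads)
    expected = n_tot / 2 ** bits          # E_i = n_TOT / 2^b for random content
    counts = [[0] * (2 ** bits) for _ in range(num_groups)]
    for payload in payloads:
        for g, v in enumerate(split_groups(payload, num_groups, bits)):
            counts[g][v] += 1             # O_i for group g
    return [sum((o - expected) ** 2 / expected for o in row) for row in counts]
```

Feeding the function the first messages of a flow yields one statistic per group; small values suggest random (ciphered) content, large values suggest fixed or partly fixed bits.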

24 Chi-Square Classifier (CSC) – Cont. $G = 16$ groups of $b = 4$ bits are used. Classification criteria for E2E over UDP: block g = 5 or 6 (the third byte) is mixed; the others are random.

25 Chi-Square Classifier (CSC) – Cont. Classification criteria (continued): E2O over UDP; E2E or E2O over TCP; otherwise, not Skype.
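
The exact inequalities behind these criteria are not reproduced in the text of the slides, so the following is only a rough sketch of the rationale stated earlier (third byte not random for UDP E2E, first four bytes deterministic for UDP E2O, everything random for TCP). It assumes a single hypothetical threshold separating "random" from "not random" groups; the function name, the labels, and the single-threshold rule are all assumptions.

```python
from typing import Sequence


def classify_by_randomness(chi2: Sequence[float], threshold: float) -> str:
    """Label a flow from its 16 per-group chi-square values (4-bit groups)."""
    if len(chi2) != 16:
        raise ValueError("expected one statistic per group, 16 in total")
    # Assumption: a group counts as random when its statistic is below threshold.
    looks_random = [x < threshold for x in chi2]
    third_byte = {4, 5}          # 0-based indices of groups 5 and 6
    first_four_bytes = range(8)  # groups 1..8 cover bytes 1-4

    if all(looks_random):
        return "tcp"             # whole message ciphered
    if (any(not looks_random[g] for g in third_byte)
            and all(looks_random[g] for g in range(16) if g not in third_byte)):
        return "udp_e2e"         # only the third byte is not random
    if (all(not looks_random[g] for g in first_four_bytes)
            and all(looks_random[g] for g in range(8, 16))):
        return "udp_e2o"         # first four bytes deterministic, rest ciphered
    return "not_skype"
```

A real classifier would likely use separate thresholds for the random, mixed, and deterministic cases; one threshold keeps the sketch short.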

26 Chi-Square Classifier (CSC) – Cont. Deterministic block: the test statistic grows linearly with $n_{TOT}$.

27 Chi-Square Classifier (CSC) – Cont. Mixed block: if one bit is fixed and the others are random, the test statistic increases linearly with $n_{TOT}$.
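
The linear growth for deterministic and mixed blocks can be checked with a short worked derivation. It is an idealized calculation: it assumes the random bits are exactly uniform (expected counts, ignoring sampling fluctuation), with $E = n_{TOT} / 2^b$ for every value of a $b$-bit group. The labels $X^2_{det}$ and $X^2_{mix}$ are introduced here for illustration only.

```latex
% Deterministic block: one value is observed n_{TOT} times, the other 2^b - 1 never.
\[
X^2_{det} = \frac{(n_{TOT} - E)^2}{E} + (2^b - 1)\,\frac{(0 - E)^2}{E}
          = n_{TOT}\,(2^b - 1)
\]
% Mixed block with one fixed bit: 2^{b-1} values occur 2E times each, the rest never.
\[
X^2_{mix} = 2^{b-1}\,\frac{(2E - E)^2}{E} + 2^{b-1}\,\frac{(0 - E)^2}{E}
          = 2^b E = n_{TOT}
\]
```

Under these assumptions, with $b = 4$ and $n_{TOT} = 100$ a deterministic block gives 1500 and a block with one fixed bit gives 100, both proportional to $n_{TOT}$.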

28 Chi-Square Classifier (CSC) – Cont.

29 The chi-square test works only if the number of observations is large enough, that is, $E_i = n_{TOT} / 2^b \geq 5$. With $b = 4$, this means $n_{TOT} \geq 5 \cdot 2^4 = 80$. Choose $n_{TOT} = 100$. Also, set

30 Naïve Bayes Classifier: Feature vector x = [x_i]. P{C|x}: the probability that the object belongs to class C, given that feature x is observed. P{x|C}: the probability that feature x is observed, given that the object belongs to class C. Bayes' rule: P{C|x} = P{x|C} P{C} / P{x}.

31 Naïve Bayes Classifier – Cont. Naïve: the features are assumed to be independent, so $P\{x|C\} = \prod_i P\{x_i|C\}$. P{x|C} is called the belief.

32 NBC – Feature Selection: VoIP traffic has small message sizes and is less bursty than data traffic. Features: the message size, observed over a window of w messages at a time, x = [s_1, s_2, ..., s_w]; and the average inter-packet gap (average IPG).

33 NBC – Feature Selection: Belief. How to determine P{s_i|C} &

34 NBC – Feature Characterization: For each codec, the message size is determined by the rate, the header length, the redundancy factor (RF), and the message framing time (Delta T). The message size can be modeled by a Gaussian distribution.

35 NBC – Feature Characterization: Map each codec to a Gaussian distribution. Model the average IPG as a Gaussian distribution as well, with separate parameters for constant-bit-rate codecs and for variable-bit-rate codecs.

36 NBC – Derive Beliefs

37 NBC – Make Decision: Let B be the resulting belief. Define a threshold B_min. If B > B_min, the flow is a valid Skype flow; otherwise, it is not a Skype flow.
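
A minimal sketch of such a decision for the message-size feature is shown below. The slides do not give the exact definition of B in text form, so this sketch assumes B is the average log-likelihood of a window of message sizes under a per-codec Gaussian, maximized over codecs. The function names, the codec table format, and any numeric values passed in are hypothetical.

```python
import math
from typing import Dict, Sequence, Tuple


def gaussian_log_pdf(x: float, mean: float, std: float) -> float:
    """Log of the Gaussian density N(mean, std^2) evaluated at x."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def window_belief(sizes: Sequence[float], mean: float, std: float) -> float:
    """Average log-likelihood of a window of sizes, treating them as independent."""
    return sum(gaussian_log_pdf(s, mean, std) for s in sizes) / len(sizes)


def is_skype_flow(sizes: Sequence[float],
                  codec_models: Dict[str, Tuple[float, float]],
                  b_min: float) -> bool:
    """Accept the flow when the best per-codec belief exceeds B_min."""
    # Assumption: the belief B is the maximum over the candidate codec models.
    best = max(window_belief(sizes, mean, std)
               for mean, std in codec_models.values())
    return best > b_min
```

Working with log-likelihoods turns the naive product of per-message probabilities into a sum, which avoids numerical underflow on long windows.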

38 Payload-Based Classifier (PBC): Used as a cross-check for the previous two classifiers. Only useful for UDP traffic. Two parts: per-flow identification and per-host identification.

39 PBC - Per-flow Identification: Uses the knowledge about the UDP E2E message. Fun: 5-bit field selected by the mask 0x8f, used to state the payload type: 0x02, 0x03, 0x07, 0x0f are signaling messages; 0x0d is a data message (all 4 data types). Byte layout: bytes 1-2 ID, byte 3 Fun, bytes 4 onward Frame.

40 PBC - Per-flow Identification: Terminology. n_TOT: the total number of packets in the flow. n_sig: the number of Skype signaling messages. n_E2E: the number of Skype E2E data/video/chat/voice messages. n_E2O: the number of Skype E2O voice messages.
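
The counters for a UDP flow could be collected along the following lines. This is a sketch under assumptions: the function name and dictionary keys are hypothetical, the Fun values and the 0x8f mask come from the UDP E2E message description, and n_E2O is left out because counting it would require tracking the gateway's CID, which the slides do not detail.

```python
from typing import Dict, Iterable

FUN_MASK = 0x8F
SIGNALING_FUN = {0x02, 0x03, 0x07, 0x0F}
DATA_FUN = 0x0D


def count_udp_messages(payloads: Iterable[bytes]) -> Dict[str, int]:
    """Count total, signaling-like and E2E-data-like packets in one flow."""
    counts = {"n_tot": 0, "n_sig": 0, "n_e2e": 0}
    for payload in payloads:
        counts["n_tot"] += 1
        if len(payload) < 3:
            continue                      # too short to carry a Fun byte
        fun = payload[2] & FUN_MASK       # third byte holds the Fun field
        if fun in SIGNALING_FUN:
            counts["n_sig"] += 1
        elif fun == DATA_FUN:
            counts["n_e2e"] += 1
    return counts
```

The ratios of these counters to n_tot are the kind of quantity a per-flow criterion could then test.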

41 PBC - Per-flow Identification  Criteria

42 PBC - Per-host Identification: Known: a Skype client always uses the same UDP port to send and receive traffic. Before a conversation starts, signaling messages are exchanged between the two clients. This makes it possible to identify a Skype client running at a specific IP address and port.

43 PBC - Per-host Identification  Criteria to identify the Skype client IP/port

44 Experiment: Two data sets. Campus: 95 hours, taken on 2006/5/29; no P2P traffic is allowed; most traffic consists of TCP data flows. ISP: one day, taken on 2006/5/15; all traffic is allowed; more heterogeneous; little Skype traffic is expected.

45 Measurement Result

46 Measurement Result – UDP, Campus

47 Measurement Result – UDP, ISP

48 Measurement Result - TCP

49 Parameter Tuning - B min

50 Parameter Tuning – $X^2$(Thr)

51 Parameter Tuning – B min & $X^2$(Thr)

52

53 Conclusion: Skype traffic is revealed within aggregate streams of packets. Two approaches: statistical properties of randomness, and stochastic characteristics of voice traffic. Negligible false positives. Few false negatives remain.

