#16 Application Measurement Presentation by Bobin John.

1 #16 Application Measurement Presentation by Bobin John

2 1st paper: Measurement, Modeling & Analysis of a Peer-to-Peer File-Sharing Workload (KaZaa paper)

3 KaZaa paper  P2P file sharing accounts for a dominant share of Internet traffic  This paper deals with KaZaa  A 200-day trace is taken  A model is developed  Locality-awareness can improve KaZaa performance

4 KaZaa paper  Trace Methodology  KaZaa trace summary statistics  KaZaa “usernames” used  KaZaaLite … IPs used  Easy to distinguish KaZaa-specific HTTP headers  Auto-update transactions filtered out

5 KaZaa paper  User Characteristics  KaZaa users are patient

6 KaZaa paper  User Characteristics  Users slow down as they age  2 reasons: attrition & slowing down over time

7 KaZaa paper  Client Activity

8 KaZaa paper  Object Characteristics  Diverse workload

9 KaZaa paper  Object Characteristics  Object Dynamics  Clients fetch objects at most once  Popularity of objects is often short-lived  Most popular objects tend to be recently born objects  Most requests are for old objects
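
The "fetch at most once" behavior on this slide can be illustrated with a small simulation. This is a minimal sketch, not the paper's model: the popularity weights (1/rank), the client and object counts, and the function name are all assumptions chosen for illustration.

```python
import random

def simulate_fetch_at_most_once(num_clients, num_objects, requests_per_client, seed=0):
    """Count downloads per object when a client never re-fetches an object."""
    rng = random.Random(seed)
    # Assumed skewed popularity: object at rank r is requested with weight 1/r.
    weights = [1.0 / rank for rank in range(1, num_objects + 1)]
    downloads = [0] * num_objects
    for _ in range(num_clients):
        already_fetched = set()
        for _ in range(requests_per_client):
            obj = rng.choices(range(num_objects), weights=weights, k=1)[0]
            if obj in already_fetched:
                continue  # the client already has it, so no new download
            already_fetched.add(obj)
            downloads[obj] += 1
    return downloads

downloads = simulate_fetch_at_most_once(num_clients=50, num_objects=200,
                                        requests_per_client=100)
```

Because each client can download a given object only once, no object can collect more downloads than there are clients, which caps the counts of the most popular objects.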

10 KaZaa paper  Object Characteristics  NOT Zipf-like  Web access patterns follow the Zipf property
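
For reference, a Zipf-like workload is one where the request frequency of the object at popularity rank i is proportional to 1/i^alpha. The sketch below computes such a distribution; the value of alpha and the number of objects are illustrative assumptions, not figures from the paper.

```python
def zipf_probabilities(num_objects, alpha=1.0):
    """Return request probabilities for ranks 1..num_objects under Zipf(alpha)."""
    weights = [1.0 / (rank ** alpha) for rank in range(1, num_objects + 1)]
    total = sum(weights)
    return [w / total for w in weights]

probs = zipf_probabilities(1000, alpha=1.0)
# With alpha = 1, the rank-1 object is requested about 10 times as often
# as the rank-10 object.
ratio = probs[0] / probs[9]
```

On a log-log plot of frequency against rank, a Zipf workload appears as a straight line with slope -alpha; the slide's claim is that the KaZaa trace does not follow that line.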

11 KaZaa paper  Model

12 KaZaa paper  Model for P2P file-sharing workloads  Model Description

13 KaZaa paper  Model for P2P  File-Sharing effectiveness diminishes with client age

14 KaZaa paper  Model for P2P  New Object Arrivals improve performance

15 KaZaa paper  Model for P2P  New clients cannot stabilize performance

16 KaZaa paper  Model for P2P  Model validation

17 KaZaa paper  New idea!  How to reduce bandwidth cost?  Use a proxy cache: legal & political problems  Locality-aware request routing: centralized request redirection (redirector) or decentralized request redirection (supernodes)
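
A centralized redirector of the kind named on this slide could be sketched as follows. This is a hedged illustration only: the campus address range, the `index` structure, and the function names are hypothetical and do not come from the paper.

```python
import ipaddress

# Hypothetical campus address range; an assumption for illustration only.
CAMPUS_NETWORK = ipaddress.ip_network("10.0.0.0/8")

def is_local(ip):
    """True if the peer address lies inside the (assumed) campus network."""
    return ipaddress.ip_address(ip) in CAMPUS_NETWORK

def redirect(requester_ip, object_id, index):
    """Pick a peer holding object_id, preferring peers inside the campus."""
    holders = [p for p in index.get(object_id, []) if p != requester_ip]
    local = [p for p in holders if is_local(p)]
    if local:
        return local[0], "local"      # served without crossing the border link
    if holders:
        return holders[0], "external"  # falls back to an outside peer
    return None, "miss"

# index maps an object to the peers known to hold a copy.
index = {"song.mp3": ["198.51.100.7", "10.1.2.3"]}
choice = redirect("10.9.9.9", "song.mp3", index)  # ("10.1.2.3", "local")
```

In the decentralized variant the same lookup would be answered by supernodes rather than by a single redirector.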

18 KaZaa paper  Locality awareness  Methodology  Benefits

19 KaZaa paper  Locality awareness  Accounting for Hits & Misses

20 KaZaa paper  Locality awareness  Availability

21 KaZaa paper  Conclusion  KaZaa workload is different  Does not follow Zipf  Can be improved with locality awareness  Drawbacks  A trace from a university ought not to be generalized to all KaZaa/P2P applications  Further implementation details of locality-awareness?  Scope of use for such a locality-awareness tool?  I don’t think universities would like this

22 2nd paper: An Analysis of Internet Chat Systems

23 Chat paper  Why is chat a worthwhile target for traffic characterization?  Chat offers computer-mediated communication  Used by a large number of people … potential of being habit-forming

24 Chat paper  Different types of chat systems:  Internet Relay Chat [IRC]  Web-based chat systems  ICQ & AIM  Gale

25 Chat paper  Problem in analyzing chat traffic  Multitude & diversity of systems & protocols  Chat protocol realized on top of HTTP protocol … difficult to separate chat traffic  Resource limitations due to filtering demands

26 Chat paper  IRC  Set of connected servers  Client connection requests on port 6667  Unique nicknames  Discussion channels  Channel operators  Medium to share data  IRC operator

27 Chat paper  Web-chat  Not tty-based … Web browser interface  A single server to connect to  3 classes of chat systems:  HTML-Web-Chat  Applet-Web-Chat  Applet-IRC-Chat  Difference between IRC & Web-chat is only “social”

28 Chat paper  Identifying IRC chat traffic  Packet monitor that captures all TCP traffic involving port 6667  Can only capture text & control messages  Data/file transfers cannot be captured as they run on other TCP connections  IRC’s packet size distribution is mainly dominated by small packets  IRC session should last more than a few minutes  IRC sends keep-alive messages
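
The IRC heuristics on this slide (port 6667, mostly small packets, sessions longer than a few minutes) can be sketched as below. The packet record layout and every threshold value are assumptions for illustration; the slide gives no concrete numbers.

```python
IRC_PORT = 6667

def is_irc_packet(pkt):
    """True for TCP packets where either endpoint uses port 6667."""
    return pkt["proto"] == "tcp" and IRC_PORT in (pkt["src_port"], pkt["dst_port"])

def looks_like_irc_session(packets, min_duration_s=300,
                           small_packet_bytes=200, min_small_fraction=0.8):
    """Apply the duration and packet-size heuristics (thresholds are assumed)."""
    if not packets:
        return False
    times = [p["ts"] for p in packets]
    duration = max(times) - min(times)
    small = sum(1 for p in packets if p["size"] <= small_packet_bytes)
    return duration >= min_duration_s and small / len(packets) >= min_small_fraction
```

File transfers are not covered by such a filter, since they run on separate TCP connections.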

29 Chat paper  Identifying Web-chat traffic  HTML-Web-chat:  Appropriate cache-control-headers  Adding state information  Cache-Control: Must-revalidate & Cache-Control: Private indicate non-chat traffic  Use of scripting languages e.g., JavaScript  Use of applet windows e.g., Java

30 Chat paper  Identifying Web-chat traffic  Applet-Web-chat:  User would have accessed a Java file or a script or even a page like “xxxchatyyy” … “chat” could occur even in the path
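
The URL hint on this slide, that "chat" may appear in the host name or in the path, could be checked as follows. This is a minimal sketch; the function name is hypothetical and the example URLs in the usage note are made up.

```python
from urllib.parse import urlsplit

def url_mentions_chat(url):
    """True if 'chat' occurs in the host name or the path of the URL."""
    parts = urlsplit(url.lower())
    return "chat" in parts.netloc or "chat" in parts.path
```

For example, `url_mentions_chat("http://example.com/rooms/webchat2/index.html")` is true because the path contains "chat", while a URL whose only mention of "chat" is in the query string is not matched by this sketch.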

31 Chat paper  Overall strategy for extracting chat traffic

32 Chat paper  Overall strategy for extracting chat traffic  Repeat this process  Identify traffic that cannot be chat traffic  Remove it  Steps that filter out more non-chat traffic have to be implemented earlier  Other steps that need more processing or pre-processing should be implemented later

33 Chat paper  Overall strategy for extracting chat traffic  Eliminate traces from ports < 1024 except port 80  Also eliminate traces from well-known application ports (e.g., Gnutella - 6346)  Group packets into flows  Mark & filter them according to the previous table
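
The port-based elimination and flow grouping on this slide can be sketched as below. The packet record layout is an assumption, and the set of well-known application ports contains only the Gnutella example from the slide; a real filter would need a longer list.

```python
from collections import defaultdict

# Only the example from the slide; other entries would be assumptions.
WELL_KNOWN_NON_CHAT_PORTS = {6346}  # Gnutella

def could_be_chat(pkt):
    """Cheap first-pass filter: drop ports < 1024 (except 80) and known apps."""
    for port in (pkt["src_port"], pkt["dst_port"]):
        if port < 1024 and port != 80:
            return False
        if port in WELL_KNOWN_NON_CHAT_PORTS:
            return False
    return True

def flow_key(pkt):
    """Direction-independent key so both halves of a connection share a flow."""
    a = (pkt["src_ip"], pkt["src_port"])
    b = (pkt["dst_ip"], pkt["dst_port"])
    return (min(a, b), max(a, b))

def group_into_flows(packets):
    """Group the packets that survive the port filter into flows."""
    flows = defaultdict(list)
    for pkt in packets:
        if could_be_chat(pkt):
            flows[flow_key(pkt)].append(pkt)
    return dict(flows)
```

Running the cheap port check before grouping follows the ordering rule from the strategy: steps that discard the most traffic come first, costlier per-flow steps later.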

34 Chat paper  Experiment  At University of Saarland  Resource partitioning  Traces were generated after filtering  950GB > 1.2GB > 238MB (WEBCHAT1)  192MB (IRC1)  350MB (WEBCHAT2)

35 Chat paper  Validation  2 aspects:  Recall – ability of a system to present all relevant items  Precision – ability of a system to present only relevant items
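
The two measures can be computed from the set of connections the filter identifies as chat and the set that really are chat. This is a minimal sketch using the standard definitions; the function name and the sample sets are illustrative assumptions, not data from the paper.

```python
def precision_recall(identified, actual):
    """Return (precision, recall) for identified vs. truly relevant items."""
    identified, actual = set(identified), set(actual)
    true_positives = len(identified & actual)
    # Precision: fraction of identified items that are relevant.
    precision = true_positives / len(identified) if identified else 0.0
    # Recall: fraction of relevant items that were identified.
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall

# Hypothetical connection IDs: 3 of the 4 identified are real chat connections,
# and 3 of the 5 real chat connections were found.
precision, recall = precision_recall({1, 2, 3, 4}, {2, 3, 4, 5, 6})
```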

36 Chat paper  Validation  Lots of calculations: “we can expect to locate about 91.7% of all real chat connections and that we expect that at least 93.1% of all connections we identify are indeed chat connections.”

37 Chat paper  Results  Session durations

38 Chat paper  Results  Interarrival times of sessions

39 Chat paper  Results  Packet sizes

40 Chat paper  Results  Sent & Received bytes

41 Chat paper  Conclusion  Chat traffic was successfully filtered out  Accuracy was above 90%  Drawbacks  Use of this work?

