Amir Rasti Reza Rejaie Dept. of Computer Science University of Oregon.


1 Amir Rasti, Reza Rejaie. Dept. of Computer Science, University of Oregon

2
• Peer-to-peer systems have become increasingly popular
  ◦ Millions of simultaneous users
  ◦ Significant percentage of Internet traffic
• BitTorrent is one of the most popular p2p applications
  ◦ Responsible for 35% of all Internet traffic [Parker05]
• BitTorrent is important because of
  ◦ Its popularity
  ◦ Its impact on the network

3 Introduction
• Scalable one-to-many peer-to-peer file distribution
• Overlay: unstructured, random, high degree
• Swarming
  ◦ File is divided into segments
  ◦ Segments are randomly distributed among peers
    – Get the rarest segment first
• Contribution
  ◦ Peers exchange segments and contribute their outgoing bandwidth
  ◦ Incentive: tit-for-tat
• Tracker
  ◦ Torrent coordinator
  ◦ Periodic peer status updates
• Performance: intuitively depends on
  ◦ Peer properties (bandwidth, contribution, etc.)
  ◦ Group properties (population, content availability, churn)
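
The "rarest segment first" rule from the swarming bullet can be sketched as follows. This is a minimal illustration, not the BitTorrent client's actual code: the function name `pick_rarest_segment`, the dictionary of neighbor holdings, and the random tie-breaking are all assumptions made for the example.

```python
import random
from collections import Counter


def pick_rarest_segment(my_segments, neighbor_segments, rng=random):
    """Return the index of a rarest segment we still need, or None.

    my_segments: set of segment indices this peer already holds.
    neighbor_segments: dict mapping a (hypothetical) peer id to the set
    of segment indices that neighbor holds.
    """
    # Count how many neighbors hold each segment.
    counts = Counter()
    for held in neighbor_segments.values():
        counts.update(held)
    # Keep only segments we lack and that some neighbor can supply.
    candidates = [seg for seg in counts if seg not in my_segments]
    if not candidates:
        return None
    rarest = min(counts[seg] for seg in candidates)
    # Break ties randomly so peers do not all chase the same segment.
    return rng.choice([seg for seg in candidates if counts[seg] == rarest])


# Illustrative data: segment 0 is common, segments 2 and 3 are rare.
neighbors = {
    "peer_a": {0, 1, 2},
    "peer_b": {0, 1},
    "peer_c": {0, 3},
}
choice = pick_rarest_segment({0}, neighbors)
```

With the sample data, segments 2 and 3 are each held by a single neighbor, so either may be chosen.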

4 Related work
1. Modeling and analytical studies
2. Simulation studies
3. Empirical studies
  ◦ Capture BitTorrent system properties in operation through measurement (instrumented clients) [Legout06]
  ◦ Group properties [Izal04]: population, average content availability, ...
  ◦ No explicit notion of performance
  ◦ No study of the underlying factors that affect peer performance
• Characterization:
  ◦ Understanding group-level and peer-level properties in a torrent
• Analysis:
  ◦ What are the main factors that affect the performance observed by individual peers?

5 Methodology/Approach
• Common approach: instrumented clients
  ◦ Detailed and flexible
  ◦ Representative?
• Our approach: tracker logs
  ◦ Coarse granularity (30 min)
  ◦ Global view
• Data sets
[Figure labels: Tracker, Logfile]

Tracker log sets
Tracker source   #Torrents   Start date   End date   #Reports   #Sessions
RedHat           1           3/03         8/03       2M         170k
Debian           1599        2/05         3/05       32M        1268k
Games            2585        8/03         12/04      38M        4416k

Selected torrents
Torrent   File size   #Sessions, rank   Duration
RedHat    1.8GB       170k, 3rd         146d
Debian    677MB       139k, 6th         51d
Games     363MB       195k, 2nd         66d

6 Methodology
• Session:
  ◦ Set of all updates from a particular peer from its arrival until its departure
• Peer-level properties:
  ◦ Represent the peer's status during a session
  ◦ Average download rate
  ◦ Average upload rate
[Figure: session timeline from "Session Start" to "Download Complete", with the studied zone (leeching) marked; labels: slope = upload rate, download rates, avg download rate]
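
A rough sketch of how per-session average rates could be derived from tracker updates. It assumes each update carries cumulative `downloaded` and `uploaded` byte counts plus the bytes `left`, as in a standard tracker announce; the `Update` class and `leeching_rates` function are hypothetical names, and the slides do not say how the authors' own tooling was implemented.

```python
from dataclasses import dataclass


@dataclass
class Update:
    time: float       # seconds since an arbitrary epoch
    downloaded: int   # cumulative bytes downloaded in this session
    uploaded: int     # cumulative bytes uploaded in this session
    left: int         # bytes still missing


def leeching_rates(updates):
    """Average (download, upload) rate in bytes/s over the leeching phase.

    The leeching phase runs from the first update of the session to the
    first update reporting left == 0, or to the last update if the
    download never completes. Returns None if no rate can be computed.
    """
    if not updates:
        return None
    updates = sorted(updates, key=lambda u: u.time)
    start = updates[0]
    end = next((u for u in updates if u.left == 0), updates[-1])
    duration = end.time - start.time
    if duration <= 0:
        return None
    return ((end.downloaded - start.downloaded) / duration,
            (end.uploaded - start.uploaded) / duration)


# Illustrative session: the download completes at t = 200 s.
session = [
    Update(0, 0, 0, 1000),
    Update(100, 400, 100, 600),
    Update(200, 1000, 300, 0),
    Update(300, 1000, 600, 0),   # seeding, outside the studied zone
]
rates = leeching_rates(session)
```

Only the leeching portion of the session contributes, which mirrors the "studied zone" in the figure.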

7 Measurement methodology
• Population, average content availability, churn
• Sampling approach:
  ◦ Once every τ minutes
  ◦ Last update before and first update after each sample
  ◦ Interpolation
  ◦ Averaging across peers
• τ determines the sampling resolution
• τ > average update interval
• Peer view:
  ◦ Average of the samples during the peer's download time
[Figure: update timeline with sampling interval τ; axis: time]
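
The sampling approach above can be sketched as follows, assuming linear interpolation between the last update before and the first update after each sample time. The slides only say "interpolation", so the linear form is an assumption, as are the function names and the use of "fraction of the file held" as the per-peer value.

```python
from bisect import bisect_right


def interpolate(updates, t):
    """Linearly interpolate one peer's value at time t, or None if absent.

    updates: time-sorted list of (time, value) pairs for one session.
    The peer counts as present only between its first and last update.
    """
    times = [u[0] for u in updates]
    if not times or t < times[0] or t > times[-1]:
        return None
    i = bisect_right(times, t)
    if i == len(times):
        return updates[-1][1]          # t falls exactly on the last update
    (t0, v0), (t1, v1) = updates[i - 1], updates[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def sample_group(sessions, tau, start, end):
    """Sample (time, population, average value) every tau time units."""
    samples = []
    t = start
    while t <= end:
        values = [v for v in (interpolate(s, t) for s in sessions)
                  if v is not None]
        avg = sum(values) / len(values) if values else None
        samples.append((t, len(values), avg))
        t += tau
    return samples


# Two illustrative sessions; values are the fraction of the file held.
sessions = [
    [(0, 0.0), (60, 1.0)],
    [(30, 0.5), (90, 1.0)],
]
samples = sample_group(sessions, tau=30, start=0, end=90)
```

A peer's view of a group property would then be the average of the samples that fall within that peer's download time.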

8 Methodology
• Is download rate a good performance metric?
  ◦ A reference is needed to evaluate a peer's download rate
  ◦ Ideally, peer performance is: [formula missing from transcript]
  ◦ Accurate measurement of utilization is difficult
• We use the maximum observed download rate as a (lower bound) estimate of incoming bandwidth
• The standard deviation of the download rate captures its stability
  ◦ Rates close to the average -> higher performance
  ◦ Normalization -> comparability
• Two performance metrics: [formulas missing from transcript]
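
The exact formulas did not survive in the transcript. One plausible reading of the bullets, offered only as an assumption, is utilization = average download rate / maximum observed download rate, and a normalized deviation = standard deviation of the download rate / average download rate. The sketch below computes both under that reading; `performance_metrics` is a hypothetical name.

```python
from statistics import mean, pstdev


def performance_metrics(rates):
    """Return (utilization, normalized_deviation) for one peer.

    rates: non-negative download-rate samples for the peer.
    Assumed definitions (not confirmed by the slides):
      utilization          = mean(rates) / max(rates)
      normalized_deviation = population std dev of rates / mean(rates)
    Returns None when the peer never downloaded anything.
    """
    if not rates:
        return None
    peak = max(rates)
    if peak <= 0:
        return None
    avg = mean(rates)
    return avg / peak, pstdev(rates) / avg


# Illustrative samples in KB/s.
utilization, deviation = performance_metrics([50, 100, 150])
```

Under these definitions a higher utilization and a lower normalized deviation both indicate better performance, which is consistent with the later remark that the two metrics are affected with opposite signs.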

9 Characterization Results/Peer-Properties
• Similar distribution across 3 different torrents
• Utilization has an almost uniform distribution
  ◦ Nearly fixed probability density
• 90% show closely uniform distribution
• Diverse performance
• No dominant modes

10 Characterization Results/Peer-view of group properties
• Content availability
  ◦ 75% of peers in RH observe an average content availability of 50%
  ◦ No content shortage
• Average population
  ◦ Very different
  ◦ Flash crowd in RH
[Figure label: Initial flash crowd]

11 Underlying factors
• Recall the second question
  ◦ What are the peer- or group-level properties that primarily determine the performance observed by individual peers in a torrent?
• Performance metrics:
  ◦ Utilization and stability
• Possible underlying factors:
  ◦ Group-level properties: population, churn, content availability
  ◦ Peer-level properties: upload rate, etc.
• Approach to identify underlying factors
  ◦ Scatter plot
  ◦ Linear regression (using S-Plus)
  ◦ Spearman's rank correlation (S-Plus)

12 Statistical Analysis/Scatter-plots
• Utilization vs. average group content availability
  ◦ No obvious correlation
• Utilization vs. average group population
  ◦ Vertical patterns
  ◦ No obvious correlation

13 Statistical Analysis/Linear Regression
• Several values to consider:
  ◦ R-squared determines goodness of fit, in the range [0, 1]
  ◦ P-value: the "probability of obtaining a result as impressive" just by chance
• Suggested techniques result in only marginal improvement (R-squared)
• No single parameter has a dominant effect
• Seed percentage was removed by step() -> suggests the number of seeds is sufficient
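
To make the R-squared value concrete, here is a minimal single-predictor least-squares fit in Python. The study itself used S-Plus with several predictors and step() for variable selection; this sketch covers only the one-variable case, and `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared). Raises ValueError if all x
    values (or all y values) are identical, since the fit or R-squared
    would then be undefined.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or ss_tot == 0:
        raise ValueError("x and y must each have some variation")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares: variation the line fails to explain.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, 1 - ss_res / ss_tot


# Illustrative data lying exactly on y = 1 + 2x, so R-squared is 1.
slope, intercept, r_squared = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
```

An R-squared near 0 means the predictor explains little of the variation in the performance metric; a value near 1 means it explains almost all of it.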

14 Statistical Analysis/Spearman's Rank correlation
• Highest correlation is with deviation of upload rate for all torrents -> tit-for-tat effect
• The two performance metrics are similarly affected, with opposite signs
• GA (Games): little correlation with utilization -> unreliable metric
• DE (Debian): slightly larger effect from content availability
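
Spearman's rank correlation is the Pearson correlation computed on ranks, so it captures monotonic relationships even when they are not linear. The sketch below is a plain-Python version with average ranks for ties. It is not the S-Plus routine used in the study, and the function names are assumptions.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the group of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho for paired samples, or None if either is constant."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return None
    return cov / (var_x * var_y) ** 0.5


# A monotonic but non-linear relationship still yields rho = 1.
rho = spearman([1, 2, 3, 4], [1, 4, 9, 16])
```

A coefficient near +1 or -1 indicates a strong monotonic relationship between a factor (for example, upload rate) and a performance metric; a value near 0 indicates little relationship.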

15
• Conclusions
  ◦ No single factor determines the performance observed by peers
  ◦ Outgoing bandwidth seems to have the largest effect
    – Tit-for-tat is working
  ◦ There often appears to be a sufficient number of seeds available (not a factor in performance)
  ◦ Capturing comparable performance is hard
  ◦ Performance of the peers in a torrent is rather diverse
    – Instrumented clients cannot reflect a representative picture
• Future work
  ◦ Active monitoring of BitTorrent
  ◦ BitTorrent overlay topology using the peer exchange feature
  ◦ Characterizing new features
    – DHT, super-seeding, peer exchange
