BitTorrent

BitTorrent network
On the itinerary:
- Introduction to BitTorrent
- Basics & properties
- 3 interesting analysis results

Publishing
- How to publish a (usually large) file?
  - Dedicated server: easy to manage, easy to find, persistent service. Nevertheless…
- BitTorrent:
  - Organizes multiple clients that share the same file
  - Leverages the upload bandwidth of the participants
  - Self-scaling and resilient; operates well during “flash crowd” periods
  - Accounts for 50%-60% of all p2p traffic

The Basics of BitTorrent
- A content provider (everyday people) wants to publish a file (the initial seed):
  - Creates a meta file (*.torrent)
  - Publishes this (light) *.torrent file on a web server
  - The file is broken into small, KB-sized blocks
  - Uploads blocks to other peers
- The goal: publish the file to many nodes with the help of other peers, with minimum load on the seeder
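The publishing step can be sketched in a few lines of Python. This is a simplified illustration, not the real *.torrent format: the actual file is bencoded and stores raw SHA-1 digests concatenated together. The function names, the tracker URL and the 256 KiB default block size are all assumptions made for the example.

```python
import hashlib


def split_into_pieces(data: bytes, piece_length: int) -> list:
    """Split the payload into fixed-size pieces (the last one may be shorter)."""
    if piece_length <= 0:
        raise ValueError("piece_length must be positive")
    return [data[i:i + piece_length] for i in range(0, len(data), piece_length)]


def build_metainfo(name: str, data: bytes, tracker_url: str,
                   piece_length: int = 256 * 1024) -> dict:
    """Build a dict resembling the content of a meta file.

    piece_length defaults to 256 KiB, which is an assumed value.
    """
    pieces = split_into_pieces(data, piece_length)
    return {
        "announce": tracker_url,  # where peers find the tracker
        "info": {
            "name": name,
            "length": len(data),
            "piece length": piece_length,
            # One hash per piece lets a downloader verify each piece independently.
            "pieces": [hashlib.sha1(p).hexdigest() for p in pieces],
        },
    }


if __name__ == "__main__":
    meta = build_metainfo("demo.bin", b"x" * 1000,
                          "http://tracker.example/announce", piece_length=300)
    print(len(meta["info"]["pieces"]), "pieces")
```

The meta file stays light because it holds only hashes and sizes, never the payload itself.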

The Basics of BitTorrent
- Third party (the tracker):
  - A tracker site keeps track of the active participants, plus extra statistics
  - Upon request from a node, supplies a random subset of the active nodes
  - Receives updates from the active nodes
  - Keeps track of new nodes joining the ‘torrent’ (or ‘swarm’) and of nodes that have left it
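A toy in-memory tracker makes the bookkeeping concrete. This is a minimal sketch under assumptions: the class and method names are hypothetical, there is no networking, and the default of 40 returned peers echoes the neighbor-list size quoted in these slides.

```python
import random


class Tracker:
    """Keeps one set of active peers per torrent and hands out random subsets."""

    def __init__(self, max_peers=40, rng=None):
        self.swarms = {}  # torrent id -> set of peer addresses
        self.max_peers = max_peers
        self.rng = rng or random.Random()

    def announce(self, torrent_id, peer, event="update"):
        """Register or refresh a peer, or drop it when event == "stopped".

        Returns a random subset of the other active peers in the swarm.
        """
        swarm = self.swarms.setdefault(torrent_id, set())
        if event == "stopped":
            swarm.discard(peer)
            return []
        swarm.add(peer)
        others = sorted(swarm - {peer})  # sorted so a seeded rng is reproducible
        return self.rng.sample(others, min(self.max_peers, len(others)))


if __name__ == "__main__":
    tracker = Tracker(max_peers=2)
    for p in ["a", "b", "c", "d"]:
        print(p, "->", tracker.announce("some-torrent", p))
```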

The Basics of BitTorrent
- A peer (leecher) who is interested:
  - Obtains the public *.torrent file
  - Is directed to the tracker
  - Obtains a list of random neighbors (~40)
  - Downloads blocks from, and uploads blocks to, its ‘best’ neighbors (choking and unchoking)
  - Upon download completion, becomes a seed

BitTorrent demo (Wikipedia)

BitTorrent basic schemes
- (Immediate) problems arise:
  - Last block problem: assuming nodes depart upon completion, how would leechers obtain the last block?
  - Free-riding problem: peers that are willing to download but unwilling to serve others
- Simple and effective solutions:
  - Last block problem: nodes employ a Local Rarest First policy
  - Free-riding problem: nodes employ a “Tit for Tat” policy, i.e. give more to those from whom you receive more
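Both policies fit in a few lines. The sketch below is a simplification under assumptions: function names are hypothetical, ties are broken deterministically (real clients randomize), the number of unchoke slots (4) is an assumed value, and optimistic unchoking is left out.

```python
from collections import Counter


def pick_rarest_piece(my_pieces, neighbor_pieces):
    """Local Rarest First: among pieces we lack, pick the one held by the
    fewest neighbors. neighbor_pieces maps neighbor -> set of piece indices.
    Returns a piece index, or None if no neighbor has anything we need."""
    counts = Counter()
    for pieces in neighbor_pieces.values():
        counts.update(pieces)
    candidates = [p for p in counts if p not in my_pieces]
    if not candidates:
        return None
    return min(candidates, key=lambda p: (counts[p], p))


def choose_unchoked(download_rate, slots=4):
    """Tit for Tat: unchoke the neighbors we currently download from fastest.
    download_rate maps neighbor -> observed rate."""
    ranked = sorted(download_rate, key=lambda n: (-download_rate[n], n))
    return ranked[:slots]


if __name__ == "__main__":
    neighbors = {"a": {0, 1, 2}, "b": {1, 2}, "c": {2, 3}}
    print(pick_rarest_piece({0}, neighbors))
    print(choose_unchoked({"a": 10, "b": 50, "c": 30}, slots=2))
```

Fetching rare pieces first spreads them through the swarm before their few holders leave, which is what eases the last block problem.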

On the itinerary
- One interesting limitation of BitTorrent networks
  - BitTorrent provides poor service availability, shown via analysis of tracker logs over a long period
- A proposed peer-selection approach
  - Lowers the costs and resource usage of ISPs
  - Does not require cooperation between ISPs and peers
  - Implemented as part of the Azureus client, codename ‘Ono’
- Tradeoffs
  - Performance (avg. download time) vs. fairness (avg. share ratio)

BitTorrent limitations
- Taken from the paper “Measurements, Analysis, and Modeling of BitTorrent-like Systems” [1]
  - Inspects overall performance over the lifetime of a torrent
  - The analysis is based on traces
  - A model is derived and used to draw conclusions
  - The derived conclusions are verified against the observed behavior
- Limitations found:
  - Poor service availability (coming next)
  - Fluctuating download performance
  - Unfair service to peers

Service availability - Analysis
- Analysis is based on traces
  - Tracker logs (~1500 torrents, sampled every 30 sec)
  - Traces from servers that publish *.torrent files
- Extracted data
  - Peer identities
  - Birth time and size of each torrent
  - For each peer in a torrent: arrival time, download and upload bandwidth, cumulative bytes downloaded and uploaded

Service availability - Analysis
- Y-axis at time t: the total number of requests for all torrents in the trace, minus the cumulative number of requests received within time t of each torrent's birth
- Similar observations are seen in the *.torrent metadata traces

Service availability - Analysis
- This suggests that the request rate decreases exponentially after a torrent is born
- Note that this is a cumulative measure
- Does a specific torrent behave the same way?
  - Use the least-squares method to measure how much a specific torrent deviates from this logarithmic fit
  - Analyze the distribution of the average relative deviation (which is mostly small: 6% on average)

Service availability – A model
- Based on the observations, define the torrent popularity at time t as the peer arrival rate, which is the derivative of the peer arrival time distribution for that torrent
- Arrival rate: $\lambda(t) = \lambda_0 e^{-t/\tau}$
  - where $\lambda_0$ is the initial arrival rate when the torrent starts
  - and $\tau$ is the attenuation parameter
  - Both are evaluated from the observations
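One way to evaluate the two parameters from a trace is a least-squares line on a log scale. The sketch below assumes an exponential arrival rate with initial rate lam0 and attenuation parameter tau (symbol names chosen here), so the number of requests still to come after time t is lam0 * tau * exp(-t / tau) and its logarithm is linear in t. The function name and the synthetic data are illustrative, not taken from the paper.

```python
import math


def fit_exponential_arrivals(times, remaining):
    """Least-squares line through (t, ln remaining); returns (lam0, tau).

    remaining[k] is the number of requests that arrive after times[k].
    Under the assumed model, ln remaining = ln(lam0 * tau) - t / tau.
    """
    ys = [math.log(r) for r in remaining]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("need at least two distinct time points")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, ys))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("data does not decay; the exponential model does not apply")
    intercept = mean_y - slope * mean_t
    tau = -1.0 / slope
    lam0 = math.exp(intercept) / tau
    return lam0, tau


if __name__ == "__main__":
    # Synthetic, noise-free data generated from lam0 = 5, tau = 2.
    ts = [0, 1, 2, 3, 4, 5]
    rem = [5 * 2 * math.exp(-t / 2) for t in ts]
    print(fit_exponential_arrivals(ts, rem))
```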

Service availability – A model
- Define the torrent lifespan: the duration from the torrent's birth to the time at which no complete copy remains (so new leechers can no longer complete the download)
- The inter-arrival time between two successive arriving peers can be approximated as $1/\lambda(t)$, where $\lambda(t)$ is the peer arrival rate
- Assume seeds leave the system at rate $\gamma$; then the average service time of a seed is approximately $1/\gamma$

Service availability – A model
- Look at consecutive peers that join the torrent
  - Peers n and n+1 join the torrent at times t(n) and t(n+1), respectively
  - The inter-arrival time between them is approximately $1/\lambda(t(n))$, where $\lambda(t)$ is the peer arrival rate
  - Peer n downloads the file with speed u(n) and stays in the torrent for a time of about $1/\gamma$, where $\gamma$ is the rate at which seeds leave
  - When the peer arrival rate is small enough (n is large), peer n+1, with speed u(n+1) <= u(n), can only be served by peer n

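Service availability – A model
- Thus, when $1/\lambda(t) > 1/\gamma$ (the gap between arrivals exceeds the time a seed stays), peer n+1 cannot complete the download and the torrent is dead
- Using the definition of the arrival rate, $\lambda(t) = \lambda_0 e^{-t/\tau}$, we get the torrent lifespan: $T_{life} = \tau \ln(\lambda_0 / \gamma)$
- $\lambda_0$ and $\tau$ are extracted from the trace (using linear regression), as is the seed departure rate $\gamma$
- Results from the model are compared with what the trace holds: they match very well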
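The lifespan follows from the death condition in two steps. The derivation below is a sketch; where the transcript lost the symbols, it assumes $\lambda_0$ for the initial arrival rate, $\tau$ for the attenuation parameter and $\gamma$ for the seed departure rate.

```latex
\begin{align*}
\lambda(t) &= \lambda_0 e^{-t/\tau} \\
\frac{1}{\lambda(t)} > \frac{1}{\gamma}
  &\iff \lambda_0 e^{-t/\tau} < \gamma
  \iff t > \tau \ln\frac{\lambda_0}{\gamma} \\
T_{\text{life}} &\approx \tau \ln\frac{\lambda_0}{\gamma}
\end{align*}
```

As a purely illustrative example with made-up numbers: $\lambda_0 = 100$ peers per day, $\gamma = 1$ per day and $\tau = 2$ days give $T_{\text{life}} \approx 2 \ln 100 \approx 9.2$ days. The lifespan grows only logarithmically with the initial popularity, while it grows linearly with $\tau$.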

Model vs. Observations
- Comparison of torrent lifespans: the average lifespan is 8.89 according to the trace and 8.34 according to the model

Service availability - Summary
- Conclusions were obtained from extensive trace analysis and modeling
  - Existing BitTorrent systems provide poor service availability
  - This is due to the exponentially decreasing peer arrival rate
  - This provides strong motivation for inter-torrent collaboration

Next on the itinerary
- One interesting limitation of BitTorrent networks
  - BitTorrent provides poor service availability, shown via analysis of tracker logs over a long period
- A proposed peer-selection approach
  - Lowers the costs and resource usage of ISPs
  - Does not require cooperation between ISPs and peers
  - Implemented as part of the Azureus client, codename ‘Ono’
- Tradeoffs
  - Performance (avg. download time) vs. fairness (avg. share ratio)

Reducing cross-ISP traffic
- Taken from the paper “Taming the Torrent” [2]
- Motivation: the overwhelming popularity of p2p (70% of Internet traffic worldwide) has yielded significant revenues for ISPs
- However, p2p traffic has also significantly increased ISPs’ costs, particularly in terms of cross-ISP traffic
- This has driven ISPs to try to forcefully reduce p2p traffic
  - Blocking specific ports, tricking clients into closing connections
  - Deep packet inspection
  - Caching

Reducing cross-ISP traffic
- One approach to alleviating this pain is an oracle that provides knowledge about which peers are in the same ISP
- This would benefit both ISPs and the p2p community
- But it requires p2p users and ISPs to collaborate and to trust each other
- Not likely to be adopted

Reducing cross-ISP traffic
- Another approach is to recycle data that is already being collected by Content Distribution Networks (CDNs)
- CDNs attempt to improve web performance by redirecting requests to replica servers
- The goal is to help content providers (e.g. CNN) distribute content by redirecting requests to replica servers that:
  - Are topologically proximate
  - Provide lower latency

CDNs as oracles
- Hypothesis: when peers exhibit similar redirection behavior, they are likely to be close to the same replica servers, and thus to each other
- Redirection behavior is represented using ratio-maps
- Each ratio represents the frequency of redirection to a specific replica
- The number of replicas is usually small (max 31)
- Ratios are kept over a time window (~ a day)

CDN redirections as ratio-maps
- The ratio map of a peer is a set of (replica server, ratio) pairs: $\mu_a = \langle (r_k, f_k), (r_l, f_l), \dots, (r_m, f_m) \rangle$ for peer a
- Specifically, if peer a is redirected toward replica server r1 for 75% of the time window, and toward replica server r2 for 25% of the time window, then the corresponding ratio-map is $\mu_a = \langle (r_1, 0.75), (r_2, 0.25) \rangle$
- The sum of all ratios $f_i$ in a given ratio map equals one
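Building a ratio map from observed redirections is a simple counting exercise. The sketch below assumes the redirections within the time window are already available as a list of replica names, one entry per lookup; the function name is hypothetical and no DNS queries are made.

```python
from collections import Counter


def ratio_map(redirections):
    """Map each replica server to the fraction of lookups redirected to it.

    redirections: iterable of replica identifiers seen in the time window.
    The returned ratios sum to one (an empty input yields an empty map).
    """
    counts = Counter(redirections)
    total = sum(counts.values())
    return {replica: c / total for replica, c in counts.items()}


if __name__ == "__main__":
    # Three lookups resolved to r1 and one to r2, as in the slide's example.
    print(ratio_map(["r1", "r1", "r1", "r2"]))
```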

Similarity via ratio-maps
- Define a metric that, given two peers, produces a value describing the similarity of their redirection behavior
- We look for overlap between the redirection-frequency maps of each pair of peers
- Use the cosine similarity between two peers a and b

Cosine-similarity
- Distance(a,b): $\cos\_sim(a,b) = \dfrac{\sum_{i \in I_a} f_{a,i}\, f_{b,i}}{\sqrt{\sum_{i \in I_a} f_{a,i}^2 \cdot \sum_{i \in I_b} f_{b,i}^2}}$
  - The sums are over $I_a$ ($I_b$), the set of replica servers to which peer a (b) has been redirected over the time window
  - $f_{a,i}$ is the fraction of time that peer a has been redirected to replica server i
- Cosine similarity is analogous to a (normalized) dot product
  - When the maps are identical, it equals 1
  - When the maps are orthogonal (no common replica), it equals 0
  - Values lie in [0,1]
  - Determine a threshold (0.15)
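The metric is short enough to write out. This sketch treats a ratio map as a dict from replica name to ratio; the function names are hypothetical, and the 0.15 threshold is the value quoted on the slide.

```python
import math

THRESHOLD = 0.15  # value quoted on the slide


def cosine_similarity(map_a, map_b):
    """Cosine similarity of two ratio maps (dicts: replica -> ratio)."""
    dot = sum(ratio * map_b.get(replica, 0.0) for replica, ratio in map_a.items())
    norm_a = math.sqrt(sum(r * r for r in map_a.values()))
    norm_b = math.sqrt(sum(r * r for r in map_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty map shares nothing with anyone
    return dot / (norm_a * norm_b)


def is_recommended(map_a, map_b, threshold=THRESHOLD):
    """Prefer a peer when its redirection behavior is similar enough to ours."""
    return cosine_similarity(map_a, map_b) > threshold


if __name__ == "__main__":
    a = {"r1": 0.75, "r2": 0.25}
    b = {"r2": 0.5, "r3": 0.5}
    print(cosine_similarity(a, b), is_recommended(a, b))
```

Because ratios are never negative, the result always lies in [0,1], which is why a single fixed threshold is enough.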

CDNs implementation
- Ono, an extension to the Azureus client
- Upon handshake, two peers exchange ratio-maps
- This enables Ono to perform biased peer selection
  - Performs DNS lookups for each CDN name to determine redirection behavior, and encodes it in ratio-maps
  - Periodically updates the ratio-maps
- Overhead is extremely small
  - 18KB upstream, 36KB downstream per day
  - Computing the cosine similarity is cheap

Ono-recommended empirical results
- Over 120,000 peers use Ono
- Ono collects extra network data
  - Ping and traceroute to replica servers and peers
- Obtain feedback on the biased peer selection
  - Cross-ISP hops are not easy to determine
  - IP hops are easy to count and give some measure
- Compare Ono-recommended peer selection to random peer selection

Ono-recommended empirical results
- Cumulative distribution function of the number of IP hops along paths between an Ono client and its peers
- Each value represents the average number of hops to all peers seen by a particular Ono client during a 6-hour interval
- Ono finds shorter paths
  - The median is less than half that of random selection
  - More than 20% are only one hop away, vs. less than 2% for random selection

Ono-recommended empirical results
- Each IP address was mapped to its corresponding Autonomous System (AS) ID
- Similar to the previous graph
- Over 33% of the paths found by Ono do not leave the origin AS
- Median AS hops is one, vs. less than 10% in the random case

CDNs as oracles - Summary
- Recycles network views collected by CDNs
- Good Internet citizenship in terms of reducing cross-ISP traffic
- Performance of peers is not affected
- Scalable (the more clients adopt it, the more accurate the bias becomes)
- Easily and freely available

Last on the itinerary
- One interesting limitation of BitTorrent networks
  - BitTorrent provides poor service availability, shown via analysis of tracker logs over a long period
- A proposed peer-selection approach
  - Lowers the costs and resource usage of ISPs
  - Does not require cooperation between ISPs and peers
  - Implemented as part of the Azureus client, codename ‘Ono’
- Tradeoffs
  - Performance (avg. download time) vs. fairness (avg. share ratio)

BitTorrent Tradeoffs
- Taken from the paper “The Delicate Tradeoffs in BitTorrent-like File Sharing Protocol Design” [3]
- Peers that participate in BT are heterogeneous with regard to download and upload capacities
- Taking a system approach
  - The system throughput depends critically on the “fat” peers
  - However, this might result in unfairness towards those who contribute more
  - This in turn would encourage peers to supply a low upload rate to others

BitTorrent Tradeoffs
- A user wants the download rate it gets to be proportional to the upload it supplies
- Assuming peers take the system overview
  - Long lasting, steady state, rational
- Two parameters and their interrelations are explored
  - Performance: the average download time (to be minimized)
  - Fairness: the ratio between give and take

Model
- W.L.O.G., the file is of size 1
- Assume an average peer arrival rate of $\lambda$
- Assume peers do not abort
- Upon completion, a peer leaves the torrent (BT provides no incentive for seeding)
- Assume n classes (types) of peers
- Each newly arriving peer belongs to type i with probability $p_i$
- Thus, the average arrival rate for class i is $\lambda_i = \lambda p_i$
- Class i has upload and download capacities $U_i$ and $D_i$
- Assume $U_1 > U_2 > \dots > U_n$ (type 1 are the “fat” ones…)

Model
- Visualization of the model for n=2 [figure]

Model – measuring performance
- Assume (quite naturally) that the bottleneck is the upload capacity
  - i.e. there is no network bottleneck such as server saturation
- The file-uploading capacity of the entire system is: [formula not preserved]
- Consider the steady state: [formula not preserved]
- Define the average number of type i peers: [symbol not preserved]
- Approximation: [formula not preserved]
- Substituting the latter into the former, in steady state: [formula not preserved]

Model – measuring performance
- In steady state, the capacity should equal the arrival rate (as the file size is 1)
- Obtain: [formula not preserved]
- Define the “share-ratio” and rewrite as: [formula not preserved]
  - In a steady and balanced system, the share-ratio should be 1
- Average system download time: [formula not preserved]
- These two equations define the solution space, as well as the resulting performance
- Feasible solutions are the set: [formula not preserved]
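The steady-state argument can be sketched as follows. The notation is assumed here rather than taken from the slides: $\lambda$ is the total arrival rate, $p_i$ the probability that an arriving peer is of type i, $U_i$ its upload capacity, $d_i$ its average download rate, $N_i$ the average number of type-i peers in the system, and $\theta_i$ its share-ratio. The second line uses Little's law with a download time of $1/d_i$ for a file of size 1; the paper's own argument may differ in detail.

```latex
\begin{align*}
\text{capacity} &= \sum_{i=1}^{n} N_i U_i \\
N_i &\approx \lambda p_i \cdot \frac{1}{d_i} \\
\sum_{i=1}^{n} \frac{\lambda p_i}{d_i}\, U_i = \lambda
  &\;\Longrightarrow\; \sum_{i=1}^{n} p_i \frac{U_i}{d_i} = 1 \\
\theta_i := \frac{U_i}{d_i}
  &\;\Longrightarrow\; \sum_{i=1}^{n} p_i\, \theta_i = 1 \\
T &= \sum_{i=1}^{n} \frac{p_i}{d_i}
\end{align*}
```

Read this way, the constraint says the share-ratios average to one across arriving peers: any class that receives more than it gives must be balanced by another class that gives more than it receives.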

Model – measuring fairness
- Consider the share-ratio a good and natural measure of fairness
- Define the fairness index: [formula not preserved]
  - Measures how equal the ratios are (if all are the same, it equals 1)
- (After some work) obtain: [formula not preserved]
  - Also expressed in terms of the upload and download of each class
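The exact fairness index is not preserved in this transcript. One index with the stated property (it equals 1 when all share-ratios are the same, and drops as they diverge) is a weighted Jain-style index. The sketch below uses that as a stand-in, which is an assumption and may differ from the paper's definition; the function and parameter names are hypothetical.

```python
def fairness_index(probabilities, share_ratios):
    """Weighted Jain-style index: (sum p_i * r_i)^2 / (sum p_i * r_i^2).

    probabilities: fraction of peers in each class (expected to sum to 1).
    share_ratios:  share-ratio of each class.
    Returns 1.0 when all share-ratios are equal, less otherwise.
    """
    if len(probabilities) != len(share_ratios):
        raise ValueError("one share-ratio is needed per class")
    first_moment = sum(p * r for p, r in zip(probabilities, share_ratios))
    second_moment = sum(p * r * r for p, r in zip(probabilities, share_ratios))
    if second_moment == 0:
        raise ValueError("share-ratios must not all be zero")
    return first_moment * first_moment / second_moment


if __name__ == "__main__":
    print(fairness_index([0.5, 0.5], [1.0, 1.0]))  # perfectly fair
    print(fairness_index([0.5, 0.5], [0.5, 1.5]))  # one class subsidizes the other
```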

Rate strategies
- General assumption: all peers maximize their upload capacity
  - Based on experiments
- We want the optimal average download time T
- Solve the constrained optimization problem
  - Use the Lagrange multiplier method

Rate strategies
- The optimal average download time T obtained is: [formula not preserved]
- We get an assignment for: [symbol not preserved]
- The system gives the “thin” peers (those other than type 1) maximum upload capacity
- “Thin” peers get more than they contribute
- Calculate the fairness index under this solution (shown to be quite low)

Rate strategies
- Now apply the same strategy to achieve optimal fairness
- Then check the resulting performance measure
- Optimal fairness is achieved when: [formula not preserved]
- We get different assignments for: [symbol not preserved]
- Compare the two:
  - In terms of system performance: [formula not preserved]
  - In terms of system fairness: [formula not preserved]

Entire design space
- In fact, these are only two of infinitely many solutions for assigning: [symbol not preserved]
- The space lies on a curve
- Example: a system of two types with specific capacities [figure]

Simulation results
- Experiments with a two-type system (with the same characteristics as the previous system)
- Average download time: [figure]
- Fairness: [figure]

Simulation results
- Fundamental tradeoffs: [figure]
- Taken for the extreme values of the two strategies
- Summary:
  - Cannot have the best of both worlds
  - Current BitTorrent implementations lie somewhere on the curve

Think about …
- With regard to the first paper (service availability), how was the ~8.3 average torrent lifespan deduced?
- Thank you (those who are still awake…)

Papers
- [1] Measurements, Analysis, and Modeling of BitTorrent-like Systems. Lei Guo, Songqing Chen, Zhen Xiao, Enhua Tan, Xiaoning Ding, and Xiaodong Zhang.
- [2] Taming the Torrent: A Practical Approach to Reducing Cross-ISP Traffic in Peer-to-Peer Systems. David R. Choffnes and Fabián E. Bustamante.
- [3] The Delicate Tradeoffs in BitTorrent-like File Sharing Protocol Design. Bin Fan, Dah-Ming Chiu, and John C.S. Lui.