1 Measurements, Analysis, and Modeling of BitTorrent-like Systems Lei Guo, Ph.D. Candidate Park Graduate Research Award Presentation.

Presentation transcript:

1 Measurements, Analysis, and Modeling of BitTorrent-like Systems Lei Guo, Ph.D. Candidate Park Graduate Research Award Presentation

2 Internet: the Galaxy of Information Systems Huge amount of information produced every day Numerous computer and communication systems on the Internet –Web servers and proxies –Streaming media systems –Peer-to-peer systems –Content delivery networks, … How do Internet traffic and systems interact and evolve? What are the Kepler's laws for the Internet?

3 My Research: Two “Solar Systems” in the Internet Galaxy Internet multimedia systems –QoS: jitter, startup latency, traffic reduction, … –Infrastructure: proxies, CDNs/MDNs, … –New delivery technologies: MBR encoding, rate adaptation, fast streaming … Peer-to-peer systems –Infrastructure and protocols: structured/unstructured –Content search and file sharing –P2P based VoIP –BitTorrent systems

4 BitTorrent Peer-to-Peer Systems Measurement Analysis Modeling

5 Basic Model of P2P File Sharing Peers sharing different files self-organize into a P2P network Exchange files they desire Limitations –Free riding –Large file downloading Examples: Gnutella, KaZaa, eDonkey/eMule/Overnet

6 BitTorrent: Fast Delivery with Incentive A large file is divided into chunks Peers interested in the same file self-organize into a torrent Peers exchange file chunks with each other Incentive is established by tit for tat Very simple and effective, scales fairly well during flash crowds Torrent of Bits...

7 BitTorrent Traffic Online users –6.8 million in August 2004, 9.6 million in August 2005 (BigChampagne) Traffic volume –53% of all P2P traffic on the Internet in June 2004 (CacheLogic) [Chart: P2P traffic: 60-80%; other traffic: 20-30%. Source: CacheLogic, 2004]

8 Limited Understanding of BitTorrent Existing studies on BitTorrent systems (INFOCOM04, SIGCOMM04) –Unrealistic assumptions in system model: no evolution considered –Single-torrent based: more than 85% of BT users join multiple torrents What is still unclear about BitTorrent systems –Service availability –Service stability –Service fairness Objectives of this work –Evolution of single-torrent system, and limitations of BT –Multi-torrent model for inter-torrent relation and collaboration during the entire lifetime

9 Outline BitTorrent mechanism and our methodology Modeling and characterization of single-torrent system Modeling and characterization of multi-torrent system Inter-torrent collaboration Conclusion

10 How BitTorrent Works: Publishing The publisher –Create a meta file –Publish on a Web site –Start the tracker site –Start a BT client as the initial seed [Diagram: foo.torrent is published on the Web site; the initial seed tells the tracker site "I am here!" and the tracker keeps the peer list] Contents of foo.torrent –announce: tracker URL for bootstrap –creation date: epoch time of file creation –length: file size –name: file name –piece length: chunk size –pieces: SHA1 hash key of each chunk
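The meta file fields listed on this slide can be sketched in Python. This is a minimal illustration, assuming the bencoding format and the nested `info` dictionary of the public BitTorrent specification; the helper names `bencode` and `make_metainfo`, the tracker URL, and the file name are hypothetical and do not come from the slides.

```python
import hashlib
import time


def bencode(value):
    """Encode ints, str/bytes, lists, and dicts using bencoding."""
    if isinstance(value, int):
        return b"i" + str(value).encode() + b"e"
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return str(len(value)).encode() + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings and are written in sorted order.
        items = sorted(
            (k.encode("utf-8") if isinstance(k, str) else k, v)
            for k, v in value.items()
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def make_metainfo(data, name, tracker_url, piece_length=256 * 1024, created=None):
    """Build the meta file dictionary for a single-file torrent (hypothetical helper)."""
    # "pieces" is the concatenation of the 20-byte SHA1 digest of every chunk.
    pieces = b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )
    return {
        "announce": tracker_url,
        "creation date": int(time.time()) if created is None else created,
        "info": {
            "length": len(data),
            "name": name,
            "piece length": piece_length,
            "pieces": pieces,
        },
    }


# Illustrative values only: a 1000-byte "file" split into 400-byte chunks.
meta = make_metainfo(b"x" * 1000, "foo.bin", "http://tracker.example/announce",
                     piece_length=400, created=0)
torrent_bytes = bencode(meta)
```

Writing `torrent_bytes` to a file named `foo.torrent` would give the publisher the meta file to place on the Web site; a real client adds further optional fields that this sketch leaves out.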

11 How BitTorrent Works: Downloading The downloader –Download the meta file –Start a BT client, connect to the tracker site –Get peer list from tracker –Get first chunk from other peers (seeds) [Diagram: the downloader fetches foo.torrent from the Web site, tells the tracker site "I am here!", receives the peer list, and downloads from the seed]

12 How BitTorrent Works: Downloading The downloader –Download the meta file –Start a BT client, connect to the tracker site –Get peer list from tracker –Get first chunk from other peers (seeds) –Exchange file chunks with other peers –Download complete: become a new seed [Diagram: tracker site, Web site with foo.torrent, peer list, initial seed, and downloaders]

13 How BitTorrent Works: Downloading The downloader –Download the meta file –Start a BT client, connect to the tracker site –Get peer list from tracker –Get first chunk from other peers (seeds) –Exchange file chunks with other peers –Download complete: become a new seed –Initial seed leaves [Diagram: tracker site, Web site with foo.torrent, peer list, remaining seeds] Future performance depends on the arrival and departure of new downloaders and seeds

14 Our Methodology of this Study Measurement –BitTorrent traffic pattern –Meta file downloading and tracker statistics Analysis –BitTorrent user behavior and performance limitations –Curve fitting, parameter estimation and validation of mathematical models Modeling –Torrent evolution and inter-torrent relation –Fluid model, probability model, and graph model

15 Meta File Downloading The first HTTP packets of .torrent file downloading –Cable network: 3,000+ users download 1,000+ different torrent meta files –Server farm: 50 tracker sites host hundreds of torrents What information does it contain? –Torrent birth time –Peer arrival time to the torrent (packet capture time of downloading) –About 10 days Contents of foo.torrent –announce: tracker URL –creation date: epoch time of file creation –length: file size –name: file name –piece length: chunk size –pieces: SHA1 hash key of each chunk

16 Torrent Statistics on Trackers Professional/dedicated tracker sites –Each may host thousands of torrents at the same time –[…] and […], collected by University of Massachusetts, Amherst http:// –Ex: alluvion: 1,500 torrents, 550 are fully traced What information does it contain? –Torrents: torrent birth time, file size, number of peers/seeds –Peers: request time, downloading/uploading bytes, downloading/uploading bandwidth –Sampled every 0.5 hour for 48 days

17 Outline BitTorrent mechanism and trace collection Modeling and characterization of single-torrent system –The evolution of torrent over time –Limitations of current BitTorrent systems Modeling and characterization of multi-torrent system Inter-torrent collaboration Conclusion

18 Peer Arrival over Time [Figure, two panels: server farm workload and cable network workload; # of .torrent file downloading (CCDF) vs. time after torrent birth (day); raw data and linear fit] The number of .torrent file downloads (CCDF) decreases exponentially with time Time after torrent birth: t = peer arrival time – torrent birth time

19 Torrent Popularity: Peer Arrival Rate [Figure, tracker statistics workload: # of peer requests (CCDF) vs. time after torrent birth (day), raw data and linear fit; relative deviation (%) of the fit for torrents ranked by population (non-ascending order)] Fitting deviation of individual torrents: 6% on average The peer arrival rate is the derivative of the CCDF
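The linear fit on the CCDF plots can be reproduced with a short least-squares sketch. It assumes the arrival rate has the exponential form lam0 * exp(-t / tau); the symbol names `lam0` and `tau`, the helper names, and the synthetic arrival times are assumptions for illustration, not values from the traces.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_arrival_model(arrival_times):
    """Estimate (lam0, tau) from peer arrival times measured since torrent birth."""
    times = sorted(arrival_times)
    n = len(times)
    # Empirical CCDF: fraction of peers that arrive later than each observed time.
    # The last point has CCDF 0 and is dropped so that the logarithm is defined.
    xs = times[:-1]
    ys = [math.log((n - 1 - i) / n) for i in range(n - 1)]
    slope, _ = fit_line(xs, ys)
    tau = -1.0 / slope          # log CCDF(t) = -t / tau under the assumed model
    lam0 = n / tau              # total arrivals = integral of lam0 * exp(-t / tau)
    return lam0, tau


# Synthetic arrivals placed at the quantiles of an exponential decay (tau = 2 days).
N, TRUE_TAU = 1000, 2.0
arrivals = [-TRUE_TAU * math.log(1 - (k - 0.5) / N) for k in range(1, N + 1)]
lam0, tau = fit_arrival_model(arrivals)
```

On real traces the tail of the CCDF is noisy, so a weighted fit or a cut-off on the last few points may be a better choice than this plain regression.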

20 Torrent Death Quantities defined on this slide [formulas not preserved]: peer arrival rate, seed leaving rate, inter-arrival time, seed service time, downloading rate, downloading time Peer n arrives at time t_n, peer n+1 at time t_{n+1} What happens when t_n → ∞? The torrent is dead once inter-arrival time > seed service time

21 Torrent Lifespan Extract [symbol lost] and t from trace Get the model parameters [symbols lost] using linear regression Lifespan model verified by measurement [Figure: torrent lifespan (hour) per torrent, trace vs. model]
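The formulas on the torrent death and lifespan slides did not survive in this transcript. The following is a hedged reconstruction of the argument, assuming an exponentially decreasing arrival rate with initial rate $\lambda_0$ and attenuation constant $\tau$, and a seed leaving rate $\gamma$ (so the mean seed service time is $1/\gamma$). The symbol names are assumptions, chosen to match the verbal statements on the slides.

```latex
% Assumed model of the peer arrival rate
\lambda(t) = \lambda_0 \, e^{-t/\tau}
\quad\Longrightarrow\quad
\ln \lambda(t) = \ln \lambda_0 - \frac{t}{\tau}
\qquad \text{(a straight line, hence the linear regression)}

% Torrent death: inter-arrival time exceeds seed service time
\frac{1}{\lambda(t)} > \frac{1}{\gamma}
\;\Longleftrightarrow\;
\lambda_0 \, e^{-t/\tau} < \gamma
\;\Longleftrightarrow\;
t > \tau \ln\frac{\lambda_0}{\gamma}

% Resulting estimate of the torrent lifespan
T_{\mathrm{life}} \approx \tau \ln\frac{\lambda_0}{\gamma}
```

Under these assumptions the lifespan grows only logarithmically when seeds stay longer: replacing $\gamma$ by $\gamma/10$ adds just $\tau \ln 10$ to $T_{\mathrm{life}}$.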

22 Torrent Population Model verified by measurement Observations: –The population of most torrents is small (102 on average) –Downloading failure ratio R_fail [formula not preserved] –Small population → large R_fail [Figure: total torrent population vs. rank of torrents (in non-ascending order of modeling results), trace vs. model]

23 Downloading Failure Ratio Average downloading failure ratio: about 10% Different evolution patterns –Small population → large R_fail Altruistic peers keep torrents alive longer [Figure: downloading failure ratio and torrent population, torrents ranked in non-ascending order of downloading failure ratio]

24 Torrent Evolution: Fluid Model Existing model (SIGCOMM 04) –Constant arrival rate –Torrent reaches equilibrium The correct model –Exponentially decreasing arrival rate –Torrent eventually dies –Verified by our measurements Two completely different pictures

25 Torrent Evolution: Modeling Results Flash crowd –Downloader #: exponentially ↑ –Seed #: exponentially ↑ Peak time –A very short duration –Constant arrival model: flat peak Attenuation: a long tail –Downloader #: exponentially ↓ –Seed #: exponentially ↓ –Constant arrival model is far from reality: no attenuation Torrent death [Figure: # of downloaders and # of seeds vs. time (hour); trace, model, and constant arrival model]
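The contrast between the two arrival assumptions can be illustrated with a small numerical sketch. It assumes the commonly cited fluid equations for BitTorrent-like systems (downloaders x, seeds y, download capacity c, upload capacity mu, sharing efficiency eta, seed leaving rate gamma) with the abort term omitted; all parameter values and the function name `simulate` are illustrative assumptions, not figures from the slides.

```python
import math


def simulate(arrival_rate, hours=200.0, dt=0.01,
             c=1.0, mu=0.2, eta=1.0, gamma=0.5):
    """Forward-Euler integration of a simplified fluid model.

    dx/dt = arrival_rate(t) - min(c*x, mu*(eta*x + y))   (downloaders)
    dy/dt = min(c*x, mu*(eta*x + y)) - gamma*y           (seeds)
    """
    x = y = 0.0
    trace = []
    for step in range(int(hours / dt)):
        t = step * dt
        finished = min(c * x, mu * (eta * x + y))  # download completion rate
        x += (arrival_rate(t) - finished) * dt
        y += (finished - gamma * y) * dt
        trace.append((t, x, y))
    return trace


# Constant arrivals: the torrent settles into an equilibrium.
constant = simulate(lambda t: 10.0)

# Exponentially decreasing arrivals: flash crowd, short peak, long tail, death.
decaying = simulate(lambda t: 10.0 * math.exp(-t / 20.0))

peak_downloaders = max(x for _, x, _ in decaying)
final_constant = constant[-1][1]
final_decaying = decaying[-1][1]
```

With these illustrative parameters the constant-arrival run levels off at a fixed population, while the decaying run peaks early and then drains toward zero, which is the qualitative difference the slide describes.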

26 Performance Variation: Evolution Only stable when population is large Fluctuates significantly after peak time Cannot be smoothed at the scale of typical downloading time [Figures: avg download speed (×10^4 byte/sec) vs. time (hour), model vs. trace; instant speed vs. avg speed (over typical download time)]

27 Performance Variation: Distribution Larger torrents have higher and more stable performance Average downloading speed over all torrents is much more stable [Figures: snapshot of torrents at time t (# of peers: downloaders and seeds; download speed per torrent); avg performance of all torrents over time (day), avg download speed (byte/sec)]

28 Service Unfairness Unfairness: download speed vs. uploading contribution Seeds serve high speed downloaders first –Peers not willing to serve after downloading –Not due to new file downloading: selfish Contribution ratio: uploaded bytes / downloaded bytes [Figures: peer contribution ratio and # of torrents for ranked peers; peer contribution ratio and download speed (bytes/sec) for ranked peers]
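The contribution ratio defined on this slide is simple to compute per peer. The sketch below uses made-up byte counts and a hypothetical helper name purely for illustration.

```python
def contribution_ratio(uploaded_bytes, downloaded_bytes):
    """Uploaded bytes divided by downloaded bytes for one peer."""
    if downloaded_bytes == 0:
        # A peer that only uploads (for example the initial seed) has no finite ratio.
        return float("inf") if uploaded_bytes > 0 else 0.0
    return uploaded_bytes / downloaded_bytes


# Hypothetical peers: name -> (uploaded MB, downloaded MB).
peers = {"fast": (50, 700), "medium": (400, 700), "slow": (900, 700)}

# Rank peers from the largest contributor to the smallest.
ranked = sorted(peers, key=lambda name: contribution_ratio(*peers[name]), reverse=True)
```

A ratio below 1 means the peer took more from the torrent than it gave back.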

29 Single-torrent Model: Summary Torrent evolution over time –Exponentially decreasing arrival rate –Flash crowd – short peak – long tailed attenuation BitTorrent Limitations –Content availability: torrent death –Performance stability –Service fairness

30 Outline BitTorrent mechanism and trace collection Modeling and characterization of single-torrent system Modeling and characterization of multi-torrent system –Traffic pattern and user behavior –Graph based model of inter-torrent relation Inter-torrent collaboration Conclusion

31 Modeling Multi-torrent System Considering peers and torrents on the Internet as an open system –Torrent birth and death –Peer birth and death –Participation in multiple torrents The lifecycle of a BitTorrent peer –Downloading, seeding, sleeping, … A huge number of torrents on the Internet –independent in current BT model, but –inherently related by peers joining multiple torrents Our model assumption –Assume each peer joins one torrent at a time, and at most once Current BT encourages a peer to exchange chunks with peers in the same torrent

32 Dynamics in Multi-torrent Environment [Figure: CDF of torrents, CDF of requests, and CDF of peers vs. torrent birth time, request arrival time, and peer birth time (hour); raw data with linear fit for torrent birth and request arrival, raw data with asymptotic fit for peer birth] Torrent birth rate: [value not preserved] per hour Torrent request rate (for all peers over all torrents): [value not preserved] per hour Peer birth rate: [value not preserved] per hour Implication: avg # of requested torrents by a peer = torrent request rate / peer birth rate = constant

33 Peer Request Pattern: Request Rate Peer request rate: requests by a peer to different torrents per unit time [annotation partly lost: … 77 years!] Assume –Peer request process: Poisson-like, with constant avg request rate –Request a new torrent with a probability p: participation probability [Figure: # of torrents and request rate (per day) for ranked peers]

34 Peer Request Pattern: Participation Probability [Figure: number of torrents (m) vs. peer rank (log i), where rank i counts the peers that request at least m torrents; timeline of the trace collection duration with peers born before it and requests arriving after it]

35 Peer Request Pattern: Participation Probability [Figure: number of torrents (m) vs. peer rank (log i), raw data and linear fit; peers that request at least m torrents; timeline of the trace collection duration with peers born before it and requests arriving after it] p = [value not preserved]

36 Peer Request Pattern: Participation Probability Another estimation of p [Figure: number of torrents (m) vs. peer rank (log i), raw data and linear fit; peers that request at least m torrents; timeline of the trace collection duration] p = [value not preserved] Probability model confirmed
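One way to read the linear fit of log rank against the number of torrents is as a geometric model: if a peer requests another torrent with probability p after each one, the number of peers that request at least m torrents is N * p^(m-1), so log i falls linearly in m with slope log p. The sketch below estimates p under that assumption; the function name and the synthetic data are hypothetical.

```python
import math


def estimate_participation(torrents_per_peer):
    """Estimate p from the slope of log(# peers with >= m torrents) versus m."""
    ms = list(range(1, max(torrents_per_peer) + 1))
    # Rank of a peer with m torrents = number of peers requesting at least m.
    log_rank = [
        math.log(sum(1 for k in torrents_per_peer if k >= m)) for m in ms
    ]
    n = len(ms)
    mean_m = sum(ms) / n
    mean_r = sum(log_rank) / n
    slope = (
        sum((m - mean_m) * (r - mean_r) for m, r in zip(ms, log_rank))
        / sum((m - mean_m) ** 2 for m in ms)
    )
    return math.exp(slope)  # slope = log p under the geometric assumption


# Synthetic population of 1024 peers that follows p = 0.5 exactly:
# 512 peers request 1 torrent, 256 request 2, ..., 1 requests 10, 1 requests 11.
counts = []
for m in range(1, 11):
    counts += [m] * (1024 >> m)
counts.append(11)

p_hat = estimate_participation(counts)
```

With real traces the estimate also depends on how peers born before the collection window are handled, which the slides address with a second estimation of p.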

37 Inter-torrent Relation Graph: How Can Torrents Help Each Other? [Diagram: directed edges between torrents i and j] 1. Some peers in torrent i have downloaded j 2. Some peers in torrent j have downloaded i

38 Inter-torrent Relation Graph: How Can Torrents Help Each Other? [Diagram: directed edges between torrents i and j] 1. Some peers in torrent i have downloaded j 2. Some peers in torrent j have downloaded i Edge weight W_{i,j}: number of such peers [Figures: weighted out-degree and weighted in-degree per torrent, trace vs. model, shown together with torrent size (# of online peers)]
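The relation graph can be built directly from peer histories. In this sketch the edge (i, j) counts peers currently in torrent i that have already downloaded j; that direction convention, the function names, and the sample peers are assumptions made for illustration.

```python
from collections import defaultdict


def build_relation_graph(current, history):
    """current: peer -> torrent it is in now; history: peer -> torrents already downloaded."""
    weights = defaultdict(int)
    for peer, i in current.items():
        for j in history.get(peer, ()):
            if j != i:
                weights[(i, j)] += 1  # one more peer in i that holds file j
    return dict(weights)


def weighted_out_degree(weights, torrent):
    return sum(w for (i, _), w in weights.items() if i == torrent)


def weighted_in_degree(weights, torrent):
    return sum(w for (_, j), w in weights.items() if j == torrent)


# Hypothetical snapshot: three peers, torrents "i", "j", "k".
current = {"p1": "i", "p2": "i", "p3": "j"}
history = {"p1": {"j"}, "p2": {"j", "k"}, "p3": {"i"}}
W = build_relation_graph(current, history)
```

Here torrent "i" holds two peers that could serve content of torrent "j", which is the kind of cross-torrent capacity the multi-torrent model exploits.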

39 Single-torrent vs. Multi-torrent Model Single-torrent model –↑ seed service time, ↓ download failure rate –Seed service time is limited, but inter-arrival time ↑ exponentially –Small improvement Multi-torrent model –Old peers come back multiple times –↑ peer arrival rate, ↓ peer inter-arrival time –Significant improvement

40 Single-torrent vs. Multi-torrent Model Inter-torrent collaboration is much more effective than stimulating seeds to serve longer [Figure: single-torrent model vs. multi-torrent model; annotations partly lost: seeds stay 10 times longer (seed leaving rate reduced to 1/10); torrent death]

41 Outline BitTorrent mechanism and trace collection Modeling and characterization of single-torrent system Modeling and characterization of multi-torrent system Inter-torrent collaboration –Tracker site overlay –Instant incentive for collaboration Conclusion

42 Tracker Site Overlay Self-organized P2P network (a logical structure) An instance of inter-torrent relation graph A built-in mechanism for content search, covers 99%+ of torrents [Diagram: torrents A, B, C, D; Neighbor-in table: torrents that can serve me; Neighbor-out table: torrents that I can serve (peer list)]

43 Tracker Site Overlay Table size Node degree distribution –Similar to unstructured P2P networks Many content search and msg routing algorithms –Flooding –Random walk –… Trackerless BitTorrent: uses DHT to store meta file
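Flooding, the first search option listed on this slide, can be sketched as a breadth-first traversal with a hop limit over the neighbor tables. The overlay layout, the `hosted` content map, and the function name are hypothetical; the slides do not specify a concrete search protocol.

```python
from collections import deque


def flood_search(neighbors, hosted, start, wanted, ttl):
    """Flood a query from `start`; return (node, hops) of the first hit or None."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if wanted in hosted.get(node, ()):
            return node, hops
        if hops == ttl:
            continue  # the query is not forwarded beyond the hop limit
        for nxt in neighbors.get(node, ()):
            if nxt not in seen:  # each tracker handles the query only once
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return None


# Hypothetical overlay of four tracker sites and the content they can reach.
neighbors = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}
hosted = {"D": {"foo"}}

hit = flood_search(neighbors, hosted, "A", "foo", ttl=2)
miss = flood_search(neighbors, hosted, "A", "foo", ttl=1)
```

A random walk would replace the loop over all neighbors with a single randomly chosen neighbor per hop, trading coverage for lower message cost.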

44 Incentive for Inter-Torrent Collaboration Instant incentive, similar to the "tit-for-tat" principle –Neighboring cycle detection –Bandwidth trading: neighboring cycle construction –Get one chunk, serve multiple peers [Diagram: peers Jack and Tom; torrents A, B, C, D; file A and file D; "Thanks Jack!"]

45 Simulation Experiments Inter-torrent collaboration can improve BitTorrent performance [Figures, without vs. with inter-torrent collaboration] –Performance stability (downloading speed): more stable –Content availability (downloading failure ratio): R_fail → 0 –Service fairness (contribution ratio): more balanced

46 Conclusion Extensive analysis and modeling to study the behaviors of BT-like systems –Tracker trace and .torrent downloading trace –Mathematical model BitTorrent system has its limitations due to exponentially decreasing peer arrival rate –Service availability, performance stability, and fairness Graph based multi-torrent model System design for inter-torrent collaboration

47 Acknowledgements Dr. Xiaodong Zhang, my advisor Mr. Joe Anselmo, College of William and Mary Dr. William Bynum, College of William and Mary Dr. Songqing Chen, George Mason University Mr. Xiaoning Ding, College of William and Mary Dr. Song Jiang, Los Alamos National Lab Dr. Teresa Long, College of William and Mary Mr. Shansi Ren, College of William and Mary Mr. Enhua Tan, College of William and Mary Dr. Li Xiao, Michigan State University Dr. Zhen Xiao, AT&T Labs – Research

48 Selected Publications Measurements, Analysis, and Modeling of BitTorrent-like Systems, IMC 2005 Analysis of Multimedia Workloads with Implications for Internet Streaming, WWW 2005 Fast and Low Cost Search Schemes by Exploiting Localities in P2P Networks, JPDC 2005 DISC: Dynamic Interleaved Segment Caching for Interactive Streaming, ICDCS 2005 Exploiting Content Localities for Efficient Search in P2P Systems, DISC 2004 PROP: a Scalable and Reliable P2P Assisted Proxy Streaming System, ICDCS 2004

49 Thank you!