RES 203 Internet applications


RES 203 Internet applications: Peer-to-peer. Dario Rossi, http://www.enst.fr/~drossi, RES203 V01/2017

Agenda
- Introduction to P2P: recap on client-server vs P2P paradigms; interest of the P2P paradigm; P2P networks and overlay graphs; a (vanilla) taxonomy of P2P applications
- Finding content (today): Napster, Gnutella, DHTs (Chord)
- Diffusing content: BitTorrent, P2P-TV
- Transmission strategies: Skype and BitTorrent congestion control

Client-server paradigm
Client:
- Runs on end-hosts; on/off behavior
- Service consumer: issues requests, receives services
- Clients do not communicate directly among themselves
- Needs to know the server address
Server:
- Runs on end-hosts; always on
- Service provider: satisfies requests from many clients
- Needs a fixed address (or DNS name)

Client-server paradigm: server S distributes content to clients 1..N

Client-server paradigm: the server has to provide all the necessary upload bandwidth

Peer-to-peer paradigm
Peers:
- Run on end-hosts; on/off behavior, must handle churn
- Need to join the overlay and discover other peers (servers are typically used for bootstrap)
- Are both service providers and service consumers
- Communicate directly among themselves, which requires defining communication rules: prevent free riding, incentivize participation and reciprocation

Peer-to-peer paradigm: peers 1..N may assist S using their upload bandwidth

Peer-to-peer paradigm. Notice that: servers are typically needed for bootstrap; servers aren't needed for resource sharing; a peer-to-peer network does not necessarily need a server.

Client-server vs peer-to-peer: interest of P2P. What is the minimum download time under either paradigm?

Client-server
- F-bit file; 1 server with upload rate Us; N clients
- Di: download rate of the i-th client; Dmin = min(Di), the slowest client
- Ti: download time of the i-th client; T = max(Ti), the system completion time
Assuming a simple fluid model where the server gives each of the N clients an equal share Us/N of its rate, what is T?
(1) T >= F / (Us/N) = NF/Us: the download cannot be faster than the share of the server upload capacity allows
(2) T >= F / Dmin: the download cannot be faster than the downlink capacity of the slowest client allows
Hence T >= F / min(Us/N, Dmin)

Peer-to-peer
- F-bit file; 1 source peer (having the content at time t=0) with upload rate Us
- N sink peers; Ui, Di: upload and download rate of the i-th peer; Dmin = min(Di), the slowest peer
- Ti: download time of the i-th peer; T = max(Ti), the system completion time
Assuming a simple fluid model where the source sends bits to the peers and each peer replicates the received data toward the other N-1 peers at a rate no higher than it receives:
(1) T >= F / Us: no peer can receive faster than the source can send
(2) T >= NF / (Us + ΣUi): the NF bits overall cannot be delivered faster than the aggregate system capacity (source plus peers) allows
(3) T >= F / Dmin: the download cannot be faster than the downlink capacity of the slowest peer allows
Hence T = F / min(Us, (Us + ΣUi)/N, Dmin), and this bound is achievable

Client-server vs peer-to-peer: interest of P2P (in theory)
- P2P protocols can offload server capacity, allowing sub-linear (e.g., logarithmic) scaling (in principle)
- Performance example with file diffusion; the conclusions hold for many services (today: DHT lookup, BitTorrent diffusion)
- Example: file size F = 10 MB, peer upload Ui = 500 kbps, source upload Us = 10 Mbps, peer download Di >> Ui
- Best-case performance: TCS = F / (Us/N) = NF/Us grows linearly with N, while TP2P = F / [(Us + ΣUi)/N] = NF / (Us + ΣUi) stays nearly flat
[Figure: file diffusion time Tmin (s) vs number of peers/clients N, for N = 10..100; client-server reaches 800 s at N = 100, peer-to-peer remains roughly constant]

Client-server vs peer-to-peer: performance of P2P (in practice)
- Affected by protocol decisions and implementations (e.g., ability to saturate capacity at L4, to make useful decisions at L7, etc.): protocol efficiency 0 < η < 1
- Affected by peer behavior (free riders: peers not contributing, or contributing less than their fair share): fraction of free riders 0 < α < 1
- More realistic performance: TP2P = F / [(Us + η(1-α)ΣUi)/N] = NF / (Us + η(1-α)ΣUi)
[Figure: same setting as the previous slide; the P2P curve degrades toward the client-server one as η → 0 or α → 1]
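To make the bounds above concrete, here is a minimal Python sketch (our illustration, not part of the original deck) evaluating both formulas with the slide's example numbers; the names t_cs and t_p2p are ours.

```python
# Fluid-model bounds; units: file size in Mbit, rates in Mbps -> times in seconds.

def t_cs(F, Us, N, Dmin=float('inf')):
    """Client-server completion time: T = F / min(Us/N, Dmin)."""
    return F / min(Us / N, Dmin)

def t_p2p(F, Us, Ui, N, Dmin=float('inf'), eta=1.0, alpha=0.0):
    """P2P completion time: T = F / min(Us, (Us + eta*(1-alpha)*N*Ui)/N, Dmin).
    eta models protocol efficiency, alpha the fraction of free riders."""
    per_peer_aggregate = (Us + eta * (1 - alpha) * N * Ui) / N
    return F / min(Us, per_peer_aggregate, Dmin)

F, Us, Ui = 80.0, 10.0, 0.5      # 10 MB = 80 Mbit; Us = 10 Mbps; Ui = 500 kbps
for N in (10, 50, 100):
    print(N, round(t_cs(F, Us, N)), round(t_p2p(F, Us, Ui, N), 1))
```

At N = 100 this reproduces the plot above: 800 s for client-server versus about 133 s for ideal P2P.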

P2P Overlays
- A P2P network is a graph whose edges represent logical connections between peers
- A logical connection means: ongoing communication via TCP or UDP sockets; reachability via application-layer routing tables

P2P Overlays
- P2P networks are commonly called overlays, as they are laid over the IP infrastructure
- Notice that: not all logical links are also physical, and not all physical links are used
[Figure: application overlay A-B-C laid over the IP network IPa, IPb, IPc]

P2P Overlays: overlay graphs differ widely, depending on implementation choices
[Figure: two sample overlay topologies, App1 vs App2]

P2P Overlays
Peers:
- Come and go, by choice or due to failures (churn rate λ)
- May or may not have the resources you are looking for (η)
- May or may not be willing to give the resource (α)
Challenges:
- Be resilient in the face of churn and failures (λ)
- Effectively locate and share resources (η)
- Scale to possibly several million users (η)
- Incentivize peer participation in the system (α)

P2P Taxonomy
P2P software spans a fair range of services:
- File sharing and content distribution (e.g., BitTorrent, eDonkey)
- Voice/video calls and chat (e.g., Skype, Gtalk)
- Live TV / VoD (e.g., SopCast, TVAnts, PPLive, Joost)
- Software updates (e.g., RapidUpdate, DebTorrent, apt-p2p)
- Indexing and search (e.g., Kademlia)
- CPU sharing
Structure (structured, unstructured, hierarchical) is not tied to a particular service.

P2P Taxonomy
- Unstructured: arbitrarily connected (akin to random graphs); e.g., Gnutella (up to v0.4), BitTorrent, P2P-TV
- Hierarchical: one level of hierarchy (super-peers); e.g., Gnutella (v0.6 onward), Skype and eDonkey (initially)
- Structured: routing exploits the regularity of the topology; Distributed Hash Tables, used by Skype, BitTorrent, eDonkey, ...

P2P in this course
- Lookup and routing: find the peers having the resource of interest (e.g., a file, or a contact for messaging/calls)
- Diffusion and scheduling: ensure timely and effective diffusion of a resource (e.g., a file, or a TV stream)
- Transport and congestion control: interactive, near-real-time applications (Skype); background applications (e.g., BitTorrent)

P2P Lookup (DHTs)

P2P Lookup
Lookup strategies:
- Centralized indexing (Napster)
- Query via random walk
- Query via flooding (Gnutella)
- Distributed Hash Tables (Chord)
Note: for the time being, we do not deal with what to do after the resource is located on the overlay.

Lookup via Centralized Indexing
- A centralized server keeps a list L of peers and their resources
- Peers keep the resources R, and notify them to the server at startup and anytime a resource is added or deleted

Lookup via Centralized Indexing: example (1/2)
Peer A searches for resource R, stored at peer B:
1. A asks server S for the location of R
2. S replies to A with the address of B
3. A contacts B and requests R
4. B sends R to A
5. A notifies S that it now owns R; S updates the resource list

Lookup via Centralized Indexing: example (2/2)
Peer C searches for resource R, now stored at peers A and B:
1. C asks server S for the location of R
2. S replies to C with the addresses of A and B
3. C selects the best peer among A and B (probes delay, bandwidth, ...)
4. C gets R, then notifies S, which updates list L

Lookup via Centralized Indexing
Pros: simple architecture; only limited traffic to the server (queries); peer-centric selection of the best candidates.
Cons: the central database is not scalable; single point of failure; single legal entity (Napster was actually shut down).

Flooding on Unstructured Overlay
- Peers: each peer only stores its own content
- Lookup: queries are forwarded to all neighbors
- Flooding can be dangerous: it can generate a traffic storm on the overlay
- Need to limit flooding: drop duplicated queries for the same element; constrain the flooding depth (application-level Time-To-Live field), as in the sketch below
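A minimal Python sketch of TTL-limited flooding with duplicate suppression (our illustration; the overlay, holds and flood names are hypothetical, and real Gnutella routes hits back along the query path rather than collecting them centrally):

```python
# Sketch of TTL-limited query flooding with duplicate suppression.
# `overlay` maps each peer to its neighbor list; `holds` maps each peer
# to the set of resources it stores (both toy structures).

def flood(overlay, holds, start, resource, ttl):
    hits, seen = [], {start}          # `seen` drops duplicated queries
    frontier = [start]
    while frontier and ttl > 0:
        nxt = []
        for peer in frontier:
            for nb in overlay[peer]:
                if nb in seen:
                    continue          # duplicate query for the same element
                seen.add(nb)
                if resource in holds[nb]:
                    hits.append(nb)   # a hit; Gnutella routes it back
                nxt.append(nb)
        frontier, ttl = nxt, ttl - 1  # constrain flooding depth (TTL)
    return hits

overlay = {'A': ['B', 'C'], 'B': ['A', 'D'], 'C': ['A', 'E'],
           'D': ['B'], 'E': ['C', 'F'], 'F': ['E']}
holds = {'A': set(), 'B': set(), 'C': set(),
         'D': {'R'}, 'E': set(), 'F': {'R'}}
print(flood(overlay, holds, 'A', 'R', ttl=2))   # ['D']; F is out of TTL reach
```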

Flooding on Unstructured Overlay: example
A looks for R, which is stored at D and F:
- A sets TTL=4 and sends the query to B and C
- B and C don't have resource R and forward the query to D and E, respectively
- D has the resource: the hit is routed back to A on the overlay
- E doesn't have the resource and forwards to G, H; the query is then dropped (TTL exceeded) and never reaches F

Flooding on Unstructured Overlay: example (continued)
- D answers ("Voilà R!") and A fetches R from D ("R, please"); F, which also has R, is never discovered
- Hint toward a heuristic solution: expanding TTL rings (retry the query with progressively larger TTL)

Flooding on Unstructured Overlay
Pros: query lookup is greedy; simple maintenance; resilient to faults thanks to the redundancy of flooding.
Cons: a scalability/accuracy tradeoff: either flooding sends (too) many messages, or the lookup may fail (due to the TTL).

Flooding on Hierarchical Overlay
- Peers: each peer only stores its own content
- Super-peers: index the content of the peers attached to them
- Lookup: peers contact their super-peer; flooding is restricted to super-peers

Flooding on Hierarchical Overlay
Pros: lookup is still greedy and simple; efficient resource consumption; the hierarchy increases scalability.
Cons: increased application complexity; less resilient to super-peer churn; need to carefully select super-peers.

Lookup on Structured Overlay
- Peers: arranged on very specific topologies; they implement a topology-dependent lookup that exploits the overlay's regularity
- Indexing: peers and resources are indexed; the structured overlay implements a hash-table semantic, offering insertion and retrieval of keys
- For this reason, such overlays are called Distributed Hash Tables (DHTs)

Lookup on Structured Overlay: indexing idea
- Take an arbitrary key space, e.g., the reals in [0,1], and an arbitrary map function, e.g., colorof(x)
- Map peers to colors: ℓpeer = colorof(peer); map resources to colors: ℓresource = colorof(resource)
- Assign each resource to the closest peer (in terms of ℓ)
[Figure: resources and peers mapped onto the same space, each resource assigned to its nearest peer]

Lookup on Structured Overlay: indexing in practice (e.g., the Chord DHT)
- Uses consistent hashing with a secure hash, SHA1 (B bits): ID = SHA1(x) in [0, 2^B - 1]
- Peers are assigned peerID = SHA1(peer); resources are assigned fileID = SHA1(file)
- A file is inserted at the first peer whose peerID is equal to or follows its fileID on the ring (its successor)
- Each peer is thus responsible for a portion of the ring
[Figure: identifier ring [0, 2^m - 1] with peers P1..P56 and files F10, F15, F38, F50 stored at their successors]
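A minimal sketch of the placement rule, assuming a toy 16-bit ID space (real Chord uses the full 160-bit SHA1 output); chord_id and successor are illustrative names:

```python
import hashlib

# Chord-style consistent hashing: peers and files share one B-bit ID space,
# and each file lands on the successor of its ID.

B = 16                                     # toy ID space for readability

def chord_id(name: str) -> int:
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % 2**B

def successor(peer_ids, key):
    """First peer ID equal to or following `key` on the ring."""
    candidates = [p for p in peer_ids if p >= key]
    return min(candidates) if candidates else min(peer_ids)   # wrap around

peers = sorted(chord_id(f"peer-{i}") for i in range(8))
fid = chord_id("some-file.iso")
print(f"file {fid} is stored at peer {successor(peers, fid)}")
```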

Lookup on Structured Overlay: lookup for a fileID
- Simplest lookup possible: every node forwards the query to its successor
- The successor pointer is needed for lookup correctness
- Highly inefficient: a linear scan of the ring
- Longer-reach pointers can be used to improve lookup efficiency
[Figure: P8.Lookup(F56) walking the ring hop by hop via successor pointers]

Lookup on Structured Overlay: fingers
- Keep a list of log(N) "finger" pointers to peers at exponentially growing ID-space distances
- The k-th finger points at peerID + 2^(k-1): the 1st finger at peerID + 1, the 2nd at peerID + 2, the 3rd at peerID + 4, ...
- If no peer sits exactly at peerID + 2^(k-1), take the closest following peer (its successor)

Lookup on Structured Overlay: fingers (example)
Finger table of P8 in the example ring:
- k=1: P8 + 1 = 9 -> P14
- k=2: P8 + 2 = 10 -> P14
- k=3: P8 + 4 = 12 -> P14
- k=4: P8 + 8 = 16 -> P21
- k=5: P8 + 16 = 24 -> P32
- k=6: P8 + 32 = 40 -> P42

Lookup on Structured Overlay: lookup for a fileID
- Idea: make as much progress as possible at each hop (a greedy algorithm): choose the finger whose ID is closest to (but strictly before) fileID; the next hops do the same
- Intuitively, the distance to fileID halves at each step (dichotomic search); consequently, the mean lookup length is logarithmic in the number of nodes
- The lookup can be recursive (as in the figure) or iterative (recall DNS)
[Figure: P8.Lookup(F56) reaching the target in a few finger hops]
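A sketch of the greedy routing step under the same toy ID space as above; the succ and fingers tables are hand-built, and the code assumes at least two peers on the ring:

```python
RING = 2**16          # same toy ID space as in the previous sketch

def in_interval(x, a, b):
    """x in the ring interval (a, b], wrap-around aware (assumes a != b)."""
    return (x - a) % RING <= (b - a) % RING and x != a

def lookup(node, key, succ, fingers, hops=0):
    """Greedy recursive lookup over toy routing tables."""
    if in_interval(key, node, succ[node]):
        return succ[node], hops + 1              # the successor stores the key
    best = succ[node]                            # fallback: always makes progress
    for f in fingers[node]:
        # closest preceding finger: in (node, key), as far from node as possible
        if in_interval(f, node, key) and f != key and \
           (f - node) % RING > (best - node) % RING:
            best = f
    return lookup(best, key, succ, fingers, hops + 1)

# 4-peer ring {8, 14, 21, 32}: key 56 wraps around and is owned by peer 8
succ = {8: 14, 14: 21, 21: 32, 32: 8}
fingers = {8: [14, 21, 32], 14: [21, 32], 21: [32, 8], 32: [8, 14]}
print(lookup(8, 56, succ, fingers))              # -> (8, 2)
```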

Lookup on Structured Overlay: pros and cons
Pros:
- "Flat" key semantic; highly scalable
- Tunable performance: a routing-state vs lookup-length tradeoff
Cons:
- Much more complex to implement; the structure needs maintenance (churn, join, refresh)
- Difficult to support complex queries (e.g., wildcards, or substrings: sub/ubs/str/tri/rin/ing/ngs)
[Figure: protocol design space, routing state vs lookup length; an optimal line bounds the unfeasible region, with bad protocols lying above it]

References
Optional readings:
[1] Lua, E.K., Crowcroft, J., Pias, M., Sharma, R. and Lim, S., "A survey and comparison of peer-to-peer overlay network schemes". IEEE Communications Surveys and Tutorials, 7(2):72-93, 2005.
[2] Stoica, I., Morris, R., Karger, D., Kaashoek, M.F. and Balakrishnan, H., "Chord: A scalable peer-to-peer lookup service for internet applications". ACM SIGCOMM, 31(4):149-160, 2001.
For your thirst of knowledge:
[3] Zave, P., "Using lightweight modeling to understand Chord". ACM SIGCOMM Comput. Commun. Rev., 42(2):49-57, 2012.
[4] Rowstron, A. and Druschel, P., "Pastry: Scalable, decentralized object location, and routing for large-scale peer-to-peer systems". Middleware 2001, Springer, 2001.
[5] Ratnasamy, S., Francis, P., Handley, M., Karp, R. and Shenker, S., "A scalable content-addressable network". ACM SIGCOMM, 31(4):161-172, 2001.
[6] Malkhi, D., Naor, M. and Ratajczak, D., "Viceroy: A scalable and dynamic emulation of the butterfly". ACM PODC, 2002.
[7] Maymounkov, P. and Mazieres, D., "Kademlia: A peer-to-peer information system based on the XOR metric". USENIX IPTPS, pp. 53-65, 2002.
[8] Loguinov, D., Kumar, A., Rai, V. and Ganesh, S., "Graph-theoretic analysis of structured peer-to-peer systems: routing distances and fault resilience". ACM SIGCOMM, 2003.
[9] Kaashoek, F. and Karger, D.R., "Koorde: A simple degree-optimal distributed hash table". USENIX IPTPS, Feb. 2003.

P2P Diffusion (BitTorrent)

Plan
- BitTorrent ecosystem overview
- Swarming algorithm(s): interest of multipart download; piece selection (local rarest first & co.); peer selection (tit-for-tat & co.)
- Performance overview
- References

BitTorrent: Overview
[Figure: a tracker coordinates the swarm; seeds and leechers exchange file pieces, with piece-transmission messages carried over TCP (old clients) or UDP/LEDBAT (new clients)]

Ecosystem: Overview
[Figure: overview of the BitTorrent ecosystem; source: [3]]

Ecosystem: Overview
- Resource localisation: torrent files, torrent discovery sites
- Peer localisation: trackers, DHTs, gossiping
- Resource diffusion: the BitTorrent swarming protocol (our focus in what follows)

Swarming
Multipart download: benefits of parallel downloads, and benefits of splitting content into chunks

Swarming: Chunk-based transfer
- Service capacity: the number of peers that can serve a content; for client-server it is 1, constant over time
- Flash crowd: n clients/peers request the content simultaneously (e.g., a soccer match, a patch, etc.)
- Piece (a.k.a. chunk, block): the content is split into m pieces, each of which can be downloaded independently

Swarming: Chunk-based transfer
- Content-based transfer, P2P: service capacity grows exponentially with time; the mean P2P download time is O(log(n)) vs O(n) for client-server (see the previous model, or [4]); a toy illustration follows below
- Chunk-based transfer: the content is divided into m chunks, with k parallel downloads; the mean download time decreases by 1/m, to O(log(n)/m) (see reading [4])
- Negligible overhead: efficiency is reduced only by O((log(m)/m)^k) (see reading [5])
[Figure: efficiency vs number of chunks m, for k=2 and k=4 parallel downloads]
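A toy round-based model (our illustration, not the model of [4]) of why service capacity grows exponentially: if every holder uploads one full copy per round, the number of servers doubles each round.

```python
# Toy round model: every peer holding the full content serves one new peer
# per round, so the number of copies doubles -> ~log2(n) rounds for n peers,
# versus n rounds for a single server handing out one copy per round.

def p2p_rounds(n):
    have, rounds = 1, 0                  # one initial seed
    while have < n + 1:                  # until the seed plus all n peers have it
        have = min(2 * have, n + 1)
        rounds += 1
    return rounds

for n in (10, 100, 1000):
    print(f"{n} peers: {p2p_rounds(n)} P2P rounds vs {n} client-server rounds")
```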

Swarming: Chunk-based transfer: limits of the models [4,5]
Ideal scenario:
- Homogeneous capacities, no transport-layer effects, no network bottleneck, global knowledge (η = 1)
- No free riding (α = 0); no churn (λ = 0)
- Ideal peer selection: each peer always knows to which peer P to send the content at a given time (η = 1)
- Ideal piece selection: a peer is never blocked because it does not have the right piece to upload to others (η = 1)
In practice:
- These are simple (but very telling) models of ideal swarming performance: P2P is very efficient when there is always a peer to send useful pieces to
- They are optimistic benchmarks: significant algorithmic tuning is needed to approach the bound (η → 1)
- Careful interpretation is needed: real-world P2P performance can easily fall far from the ideal one (η << 1)

Swarming: Parallel downloads
- Dynamic parallel download: introduced to speed up Web caches in the early 2000s [6]
- Idea: download different pieces from k servers in parallel
- Start by requesting pieces {P1, ..., Pk} from all available servers {S1, ..., Sk}
- Each time a piece Pj is received from Si, request from Si a new piece (one not previously requested from any other server)
- Two performance issues: inter-block idle time, solved by pipelining; and termination idle time, solved by the "endgame" mode (BitTorrent terminology)
[Figure: timeline of a client C downloading pieces P1..P5 in parallel from servers S1, S2]

Swarming: Parallel downloads
Inter-block idle time: the time to receive a new request after sending the last byte of a piece
- Problem: 1 RTT of idle time, causing server underutilization (η < 1)
- Solution: pipelining (recall HTTP/1.1), in two flavors: bufferless and buffered
[Figure: timeline showing the idle RTT at the server between consecutive piece requests]

Swarming: Parallel downloads
(Bufferless) pipelining: ensure the next request hits the server before the current request is completely served; wait for the start of chunk reception before sending the new request
- Advantages: avoids buffering of user requests at the server; complexity lies at the requester (request scheduling); more flexible scheduling at the client (requests are scheduled just in time)
- Disadvantages: assumes knowledge of the RTT and a chunk transmission time > RTT; underutilization is still possible in case of misestimation
[Figure: timeline where each new request reaches the server just before the previous piece is fully sent]

Swarming: Parallel downloads
(Buffered) pipelining: ensure the server always has a buffer of requests; send a stack of requests first, then one new request at each chunk reception
- Advantages: efficient use of resources; no need for a precise RTT estimate, provided the target number of outstanding requests is large enough
- Disadvantages: complexity lies at the sender (buffering of requests); lack of flexibility* (e.g., when an already-requested resource becomes available elsewhere)
[Figure: timeline with the number of pipelined requests outstanding at S1 over time]
* Flexibility comes at the price of complexity (e.g., cancelling pending requests) and potential problems (e.g., imagine bidirectional exchanges in band with request signaling).
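A minimal event-style sketch of buffered pipelining over k servers (our illustration; schedule, depth and the piece names are made up, and round-robin completions stand in for real network timing):

```python
from collections import deque

# Dynamic parallel download with buffered pipelining: keep a target number
# of outstanding requests at each server, and on every piece arrival request
# the next piece not yet asked of any server.

def schedule(pieces, servers, depth=2):
    pending = deque(pieces)                     # pieces not yet requested anywhere
    outstanding = {s: deque() for s in servers}
    log = []
    for s in servers:                           # initial stack of requests
        while pending and len(outstanding[s]) < depth:
            outstanding[s].append(pending.popleft())
    while any(outstanding.values()):            # simulated completions
        for s in servers:
            if outstanding[s]:
                log.append((s, outstanding[s].popleft()))
                if pending:                     # refill: one request per arrival
                    outstanding[s].append(pending.popleft())
    return log

print(schedule([f"P{i}" for i in range(1, 9)], ["S1", "S2"]))
```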

Parallel download in P2P
- Parallel download in client-server: clients and servers are decoupled; small number of parallel downloads
- Parallel download in P2P: every peer is both client and server (coupling); large peer set (high degree of parallelism)

Parallel download in P2P: naive solution
- Every peer connects to every other peer
- Unfeasible: not scalable, too large a number of connections
- Inefficient: per-peer throughput scales as Ui/N
- Impractical: some peers are not altruistic (and a global Ui/N share incentivizes free riding)
Practical solution: articulated as piece/peer selection, and needs to deal with free riding

Parallel download in P2P: ideal solution
- Forget free riding for the time being (assume a nice, ideal, altruistic world)
- Every peer connects to a limited number d of peers; generally studied in terms of parallel upload (mirror view, equivalent problem)
- Tradeoff: number of peers served (d) versus per-peer service rate (1/d)
- Increasing d adversely impacts the system for static neighborhoods [4]; even in dynamic swarms the marginal gain for d > 4 is small, suggesting d < 10 [4]
- BitTorrent mainline: if the upload rate exceeds 42 kB/s, uploads = int(math.sqrt(rate * .6)), as evaluated below
Practical solution: articulated as piece/peer selection, and needs to deal with free riding
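A quick look at what the quoted mainline rule yields; note that the value returned below the 42 kB/s threshold is our placeholder (the actual mainline client uses a small table of fixed values there):

```python
import math

def upload_slots(rate_kBps):
    """Mainline-style upload-slot count, as quoted on the slide above."""
    if rate_kBps <= 42:
        return 4                       # placeholder assumption below the threshold
    return int(math.sqrt(rate_kBps * 0.6))

for rate in (30, 50, 100, 500, 1000):  # upload capacity in kB/s
    print(rate, "kB/s ->", upload_slots(rate), "upload slots")
```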

Parallel download in P2P
- Swarm: 10s-10,000s of peers (stats in [3])
- Peer set: ~40-80 peers (L7 routing-table information)
- Active peer set: ~4-10 peers (L4 ongoing transfers)

Parallel download in P2P: existing systems compared
- Gnutella: aim: content lookup; lookup: flooding on an unstructured (v0.4) or hierarchical (v0.6) overlay; piece selection: none (file splitting only added after v0.6); peer selection: order of answer; diffusion: poor performance; free riding: not dealt with
- eMule/eDonkey: aim: content diffusion; lookup: flooding on a hierarchical network, or DHTs; piece selection: rarest-first piece (+ other criteria); peer selection: age in priority queue * credit modifier (U/D ratio); diffusion: not thoroughly studied, slow reactivity; free riding: still possible
- BitTorrent: aim: content diffusion; lookup: out-of-band (DHTs, trackers, peer exchange, ...); piece selection: local rarest first; peer selection: choke algorithm on short-term bandwidth; diffusion: good performance, thoroughly studied; free riding: mostly unlikely
In what follows we focus on BitTorrent.

Importance of peer/piece selection
The scalability results [4,5] assume that a peer always finds a peer to which to upload a useful piece (never idle, 100% efficient use of uplink capacity), and that peers never refuse to upload a piece (no free riders).
If any of these assumptions is relaxed, the system models no longer apply: what is the impact (worsening) on BitTorrent performance? And with selfish peers, parallel download itself no longer applies!
- Peer selection goals: always find peers to upload to; prevent free riding. BitTorrent's peer selection aims at converging to the best upload-download match.
- Piece selection goals: always have useful data to send to others; piece diversity is called entropy. BitTorrent's piece selection aims at maximizing the piece entropy of the swarm.
- Selection order: peer selection is performed first. Idea: capacity is more important than piece availability for diffusion performance.

Piece selection policies
- Random piece selection: naive but simple to implement; poor entropy [1]
- Global rarest first: select the globally rarest piece; each peer needs to know the number of copies of every piece in the swarm; good entropy [1]
- Local rarest first: approximates global rarest first by selecting the piece that is rarest among the immediate neighbors; trades accuracy for communication cost and complexity; good entropy on a random graph (BitTorrent swarm) with a large degree (peer set) [1]
- BitTorrent adds policies for corner cases: strict priority, random first piece, endgame mode, super-seeding
- Note: peer selection is performed first, so the selected piece is the rarest within the active peer set

Piece selection: Local Rarest First
- Aim: minimize the overlap between buffermaps
- Interest: equalizes resource availability (maximizes entropy); avoids all peers being interested in the same content (no piece hotspot); lets peer selection efficiently exploit the available capacity, since all peers always have something of interest to exchange (no idle time)
[Figure: good (low-overlap) vs bad (high-overlap) buffermaps]
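A minimal sketch of local rarest first over neighbor buffermaps (our illustration; the data structures are made up), with random tie-breaking so equally rare pieces spread evenly:

```python
from collections import Counter
import random

# Local rarest first: count each piece's copies across the neighbors'
# buffermaps and pick the rarest piece we are still missing.

def rarest_first(my_pieces, neighbor_buffermaps):
    counts = Counter()
    for bm in neighbor_buffermaps:
        counts.update(bm)
    candidates = [(n, p) for p, n in counts.items() if p not in my_pieces]
    if not candidates:
        return None                       # nothing interesting in the peer set
    rarest = min(n for n, _ in candidates)
    # break ties at random so all equally rare pieces spread evenly
    return random.choice([p for n, p in candidates if n == rarest])

mine = {0, 1}
neighbors = [{0, 1, 2}, {1, 2, 3}, {2, 3}, {3, 4}]
print(rarest_first(mine, neighbors))      # piece 4 (a single copy) is the rarest
```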

Piece selection: Strict priority
- Pieces (256 kB-10 MB) are divided into sub-pieces (16 kB)
- Aim: rapidly obtain complete pieces
- Policy: once request(s) for sub-pieces of a piece have been sent, requests for the remaining sub-pieces are issued before any new piece request is started
- Interest: sub-pieces are useful for pipelining (efficient transfers), but only complete pieces have a signature for data verification, and only complete pieces can be shared; sub-pieces cannot (more details later)

Piece selection: Random First Piece
- Aim: help new leechers obtain their first piece
- Policy: while no complete piece has been obtained yet, send requests for random pieces (an exception to Local Rarest First)
- Interest: as long as new leechers have no piece, they cannot upload to other peers and must wait for an optimistic unchoke (OU); a quick first piece lets them exploit both OU and regular unchokes (RU)

Piece selection: Endgame mode
- Aim: help leechers become seeds
- Policy: toward the end of the download, allow an exception to Rarest First (the needed chunks may not be rare): send requests for all missing pieces to all active neighbors
- Interest: more seeds in the swarm; helps ensure the longevity of the swarm

Peer selection policies: BitTorrent Tit-for-Tat (TFT)
Terminology:
- Choke/unchoke: A chokes B if it stops uploading to B; A unchokes B if it uploads to B
- Interested/not interested: A is interested in B if B has ≥1 piece that A does not have; A is not interested in B if B's piece set is a subset of A's
TFT algorithm family:
- Choke/unchoke, with distinct leecher-state and seed-state variants
- Additional algorithms: anti-snubbing, optimistic unchoking
Choke metrics:
- Capacity is the primary metric for a diffusion service
- Could potentially be based on several peer properties: capacity, delay, AS, ... or a combination of the above

Peer selection: Choke algorithm, leecher state (LS) vs seed state (SS)
Leecher state (leecher: partial copy):
- Regular unchoke (RU): every 10 seconds, sort peers according to their upload rate to the local peer and keep the 4 fastest uploaders
- Optimistic unchoke (OU): every 30 seconds, choke 1 active peer and unchoke another one at random
- For the local peer, OU may discover new fast peers with whom to reciprocate; for the remote peer, OU may provide a first piece to peers that just joined the swarm; for the swarm, OU prevents peers from exchanging only within small groups
- No more than 4 peers are unchoked at the same time
Seed state (seed: complete copy):
- Favor Upload (FU): same as in LS, but the sorting criterion is the fastest downloaders; best for the swarm, since the seed already has a complete copy
- Round Robin (RR): every 10 seconds the interested peers are ordered according to the time they were last unchoked; for two consecutive 10 s periods, the 3 oldest peers are unchoked plus an OU as a 4th peer; for the third 10 s period, the 4 oldest peers are unchoked
- FU may favor fast free riders; RR favors all peers equally
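A sketch of the leecher-state logic (our simplification: three rate-based slots plus one optimistic slot rotated every third 10 s round, respecting the slide's constraint of at most 4 unchoked peers; real clients differ in details such as how rates are measured):

```python
import random

def leecher_round(rates, round_no, ou=None):
    """One 10 s slot of the leecher-state choke algorithm.
    `rates` maps each interested peer to its measured upload rate to us."""
    if round_no % 3 == 0 or ou is None:          # every 30 s, rotate the OU
        ou = random.choice(list(rates))          # any interested peer, for simplicity
    fastest = [p for p in sorted(rates, key=rates.get, reverse=True) if p != ou]
    return fastest[:3] + [ou], ou                # at most 4 peers unchoked

rates = {'A': 120, 'B': 80, 'C': 300, 'D': 10, 'E': 55, 'F': 5}
ou = None
for slot in range(6):
    unchoked, ou = leecher_round(rates, slot, ou)
    print(f"10s slot {slot}: unchoked = {unchoked} (OU = {ou})")
```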

Peer selection: Choke LS vs SS
[Figure: unchoke patterns over 10 s time slots; in leecher state, RU slots track the fastest uploaders while the OU rotates every 30 s; in seed state, peers ordered oldest-first are unchoked round-robin over 60 s, plus an OU]

Peer selection: Anti-snubbing
- Aim: avoid snubbing (the inverse of free riding); snub = uploading without downloading, which may happen when our upload rate toward A is lower than that of A's 4 best uploaders
- Policy: if we upload to a peer A that has choked us for >60 s, choke it; wait for an OU from A before unchoking A again (resuming the upload)
- Interest: stopping the upload forces the conditions for an OU from A

BitTorrent performance
- Very popular and empirically successful; but what exactly makes it perform so well, and which parameter is crucial?
- Performance depends on many parameters that are hard to model analytically all at once [4,5]; answers come via simulation [1] and experiments [2]
Some performance issues:
- Are download rates optimal, and under which conditions? [1]
- Is the Local Rarest First policy good enough? [1] (η)
- Does rate-based Tit-for-Tat (TFT) work? [2]
- How effective are the sharing incentives? [2]
- Protocol packetization and signaling overhead (η)
- Resilience to free riding [7,8,9,10] (α)

Experiments [2]: scenarios
- BitTorrent deployed on PlanetLab with an instrumented client (mainline 4.0.2, the 2nd most downloaded BitTorrent client on SourceForge: 51M downloads in 2009)
- The instrumented client logs control messages, state-machine transitions, algorithm internals, bandwidth estimates, ...
Scenario:
- 1 single initial seed, always connected for the duration of the experiment
- 40 leechers join at the same time (flash crowd); new leechers leave as soon as they become seeds
- Content: a ~100 MB file, 400 pieces of 250 kB each
- Three classes of peers (slow/medium/fast)
Aim: TFT performance with a slow vs a fast seed; main metric: clustering of exchanges with peers of the same class (TFT reciprocation)

Experiments [2]: clustering index
Let the world class be W = Cfast ∪ Cmedium ∪ Cslow. The clustering index (CI) of peer P in class C is the ratio of Regular Unchoke (RU) durations:

CI(P,C) = Σ_{X∈C} RU(P,X) / Σ_{Y∈W} RU(P,Y)

Interpretation:
- CI(P,C) = 1 if P unchoked only peers of class C
- CI(P,C) = 0 if P unchoked only peers outside C
- CI(P,C) ≈ 0.3 if P unchoked peers uniformly at random (over the 3 classes)
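A direct transcription of the CI formula into Python (the ru dictionary of regular-unchoke durations is illustrative):

```python
# Clustering index: fraction of a peer's regular-unchoke time spent on
# peers of one class, out of all its regular unchokes.

def clustering_index(ru_durations, classes, peer, cls):
    """ru_durations[(p, x)] = total seconds p regular-unchoked x."""
    in_class = sum(d for (p, x), d in ru_durations.items()
                   if p == peer and classes[x] == cls)
    total = sum(d for (p, x), d in ru_durations.items() if p == peer)
    return in_class / total if total else 0.0

classes = {'A': 'fast', 'B': 'fast', 'C': 'slow'}
ru = {('P', 'A'): 50.0, ('P', 'B'): 30.0, ('P', 'C'): 20.0}
print(clustering_index(ru, classes, 'P', 'fast'))   # 0.8 -> P clusters with fast peers
```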

TFT, fast seed [2]
- We see clusters per class: peers mostly regular-unchoke peers of their own class
- Two artifacts: the slow-class squares are darker since their downloads take longer; peer 27 is slower than the other peers in its class (a problem with its PlanetLab node)
[Figure: pairwise unchoke matrix of the 40 peers plus the seed, ordered slow/medium/fast]

Sharing incentive, fast seed [2]
- Fast peers complete close to the optimal completion time (vertical black line in the figure)
- To be unchoked by fast peers, peers need to reciprocate: the more you contribute, the faster you complete; the sharing incentive is effective
- With a well-provisioned initial seed: cluster formation, effective sharing incentives, and good upload utilization

TFT, slow seed [2]
- We no longer see clusters
- Reason: with a slow source, the capacity bottleneck becomes a content bottleneck; fast peers reciprocate with slower peers when they are not interested in any other fast peer
[Figure: pairwise unchoke matrix with the slow seed; no per-class clusters appear]

Sharing incentive, slow seed [2]
- The earliest completion time is longer than with a fast seed (vertical black line)
- Most peers complete close to the earliest completion time: the choking algorithm does not provide an effective sharing incentive when the seed is underprovisioned
- With an underprovisioned initial seed: no clustering and ineffective sharing incentives
- Seed provisioning is critical: the seed should be at least as fast as the fastest peer, for TFT efficiency as well

Free riding (α)
- Tit-for-tat / choke enforce reciprocation and prevent free riding, but are still open to some exploits [7,8,9,10]
- Free-riding definitions: in economics, "free riders" are those who consume more than their fair share of a resource, or contribute less than their fair share of the costs of its production; in BitTorrent: upload < download [7,8,10], or no upload at all [9]
Single-peer perspective: free riding is possible [9,10]
- BitThief [9]: opportunistically connects to a very large number of peers; exploits seeds' unchokes and leechers' OU; contributes 0 upload
- BitTyrant [10]: similar techniques, and also exploits leecher RU; provides > 0 upload
Whole-system perspective:
- A few free riders have limited impact on the whole system [7]
- However, performance is hurt when all peers use BitTyrant [10]

Protocol overhead
- Useful payload (P): bytes received/transmitted in PIECE messages (without L7 headers)
- Overhead: the ratio of non-payload bytes over the total amount of bytes (overhead + payload); it is due to signaling (control messages) and packetization (40 B of TCP/IP headers per packet)
- BitTorrent protocol overhead: ~3% bytewise in most of the experiments
- Signaling: many packets but very few bytes; HAVE, REQUEST and BITFIELD account for most of the signaling overhead
- Packetization: 40 B of headers per 1500 B MSS ≈ 2.6%
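A back-of-the-envelope check of the packetization figure, with the slide's overhead definition written as a function (all names are ours):

```python
HEADERS, PAYLOAD = 40, 1500        # TCP/IP headers per MSS-sized PIECE packet

def overhead(non_payload_bytes, payload_bytes):
    """Overhead as defined above: non-payload bytes over total bytes."""
    return non_payload_bytes / (non_payload_bytes + payload_bytes)

print(f"packetization alone: {overhead(HEADERS, PAYLOAD):.1%}")   # ~2.6%
```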

References
Optional readings:
[1] A.R. Bharambe, C. Herley and V.N. Padmanabhan, "Analyzing and Improving a BitTorrent Network's Performance Mechanisms". IEEE INFOCOM'06, Barcelona, Spain, April 2006.
[2] A. Legout, N. Liogkas, E. Kohler and L. Zhang, "Clustering and Sharing Incentives in BitTorrent Systems". ACM SIGMETRICS'07.
For your thirst of knowledge:
[3] C. Zhang, P. Dhungel, D. Wu and K.W. Ross, "Unraveling the BitTorrent Ecosystem". IEEE Transactions on Parallel and Distributed Systems, 2011.
[4] X. Yang and G. de Veciana, "Service Capacity of Peer to Peer Networks". IEEE INFOCOM'04.
[5] D. Qiu and R. Srikant, "Modeling and Performance Analysis of BitTorrent-Like Peer-to-Peer Networks". ACM SIGCOMM'04.
[6] P. Rodriguez and E.W. Biersack, "Dynamic Parallel Access to Replicated Content in the Internet". IEEE INFOCOM'00.
[7] N. Liogkas, R. Nelson, E. Kohler and L. Zhang, "Exploiting BitTorrent For Fun (But Not Profit)". USENIX IPTPS'06, Santa Barbara, CA, February 2006.
[8] S. Jun and M. Ahamad, "Incentives in BitTorrent Induce Free Riding". Workshop on Economics of Peer-to-Peer Systems (P2PEcon'05), Philadelphia, PA, August 2005.
[9] T. Locher, P. Moor, S. Schmid and R. Wattenhofer, "Free Riding in BitTorrent is Cheap". ACM HotNets-V, Irvine, CA, November 2006.
[10] M. Piatek, T. Isdal, T. Anderson, A. Krishnamurthy and A. Venkataramani, "Do Incentives Build Robustness in BitTorrent?". USENIX NSDI 2007.
