The BitTorrent content distribution system CS217 Advanced Topics in Internet Research Guest Lecture Nikitas Liogkas, 5/11/2006.


1 The BitTorrent content distribution system CS217 Advanced Topics in Internet Research Guest Lecture Nikitas Liogkas, 5/11/2006

2 Motivation
● flash crowd (aka slashdot) effect
  ○ many clients, few servers
● Problem: servers cannot handle load
● Solution: swarming
  ○ clients download pieces of the file from each other
  ○ has been proven to have good scaling and performance properties

3 Presentation outline
● Joining the system
● Encoding / metadata file
● Tracker protocol
● Peer wire protocol
● Piece selection
● Peer selection
● Client implementations
● Resources

4 Joining a torrent
Peers divided into:
● seeds: have the entire file
● leechers: still downloading
[Diagram: a new leecher, a website, a tracker, and a seed/leecher; arrows labeled metadata file, join, peer list, and data request, numbered 1 to 4]
1. obtain the metadata file (out of band)
2. contact the tracker
3. obtain a peer list (contains seeds & leechers)
4. contact peers from that list for data

5 Exchanging data
[Diagram: a seed and leechers A, B, and C exchanging data; leecher A announces "I have" for a newly received piece]
● verify pieces using hashes
● download sub-pieces (blocks) in parallel
● advertise received pieces to the entire peer list
● interested: need pieces that a given peer has
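
The first two bullets can be sketched as follows. This is a minimal illustration, not the mainline client's code: the function names are hypothetical, and the 16 KB block size is an assumption (the slides do not state a sub-piece size).

```python
import hashlib

# Assumed sub-piece (block) size; the slides do not give a value.
BLOCK_SIZE = 16 * 1024


def block_requests(piece_index, piece_length, block_size=BLOCK_SIZE):
    """Split one piece into (piece_index, offset, length) block requests.

    The requests can then be spread over several peers in parallel.
    """
    requests = []
    offset = 0
    while offset < piece_length:
        length = min(block_size, piece_length - offset)
        requests.append((piece_index, offset, length))
        offset += length
    return requests


def piece_is_valid(data, expected_sha1):
    """Check a fully assembled piece against its 20-byte SHA-1 digest."""
    return hashlib.sha1(data).digest() == expected_sha1
```

A piece would only be advertised to other peers once `piece_is_valid` returns true for it.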

6 Bencoding
● encoding format of the metadata file and of tracker responses (peer wire messages use a separate binary format)
● four types
  ○ byte strings
  ○ integers
  ○ lists
  ○ dictionaries (mapping keys to values)
● examples
  ○ 4:spam represents the string “spam”
  ○ i10e represents the integer 10
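
A minimal encoder sketch for the four types, reproducing the two examples on the slide. The function name `bencode` is chosen here for illustration; accepting Python `str` values (encoded as UTF-8) is a convenience assumption, and dictionary keys are emitted in sorted order as the public specification requires.

```python
def bencode(value):
    """Encode ints, byte strings, lists, and dicts in bencoded form."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; reject it explicitly.
        raise TypeError("booleans are not a bencoding type")
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, str):
        # Convenience assumption: text is encoded as UTF-8 bytes.
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Keys must be byte strings and appear in sorted order.
        items = [
            (key.encode("utf-8") if isinstance(key, str) else key, item)
            for key, item in value.items()
        ]
        items.sort(key=lambda pair: pair[0])
        body = b"".join(bencode(key) + bencode(item) for key, item in items)
        return b"d" + body + b"e"
    raise TypeError("unsupported type: %r" % type(value))


print(bencode("spam"))  # b'4:spam'
print(bencode(10))      # b'i10e'
```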

7 Metadata file structure
● contains information necessary to contact the tracker and describes the files in the torrent
  ○ announce URL of tracker
  ○ file name
  ○ file length
  ○ piece length (typically 256KB)
  ○ SHA-1 hashes of pieces for verification
  ○ also creation date, comment, creator, …
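
As a rough sketch of how these fields fit together for a single-file torrent: the helper names and the example tracker URL are hypothetical, and the dictionary key names (`announce`, `info`, `name`, `length`, `piece length`, `pieces`) follow the public BitTorrent specification rather than the slide itself.

```python
import hashlib

PIECE_LENGTH = 256 * 1024  # the "typical" value quoted on the slide


def piece_hashes(data, piece_length=PIECE_LENGTH):
    """Concatenate the 20-byte SHA-1 digest of every piece of the file."""
    digests = []
    for start in range(0, len(data), piece_length):
        digests.append(hashlib.sha1(data[start:start + piece_length]).digest())
    return b"".join(digests)


def build_metainfo(announce_url, name, data, piece_length=PIECE_LENGTH):
    """Build the (not yet bencoded) metadata dictionary for one file."""
    return {
        "announce": announce_url,
        "info": {
            "name": name,
            "length": len(data),
            "piece length": piece_length,
            "pieces": piece_hashes(data, piece_length),
        },
    }


# Hypothetical usage: a 600 KB file yields 3 pieces (the last one is short).
meta = build_metainfo("http://tracker.example/announce", "file.bin",
                      b"x" * (600 * 1024))
print(len(meta["info"]["pieces"]) // 20)  # 3
```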

8 Tracker protocol
● communicates with clients via HTTP/HTTPS
● client GET request
  ○ info_hash: uniquely identifies the file
  ○ peer_id: chosen by and uniquely identifies the client
  ○ client IP and port
  ○ numwant: how many peers to return (defaults to 50)
  ○ stats: bytes uploaded, downloaded, left
● tracker GET response
  ○ interval: how often to contact the tracker
  ○ list of peers, containing peer id, IP and port
  ○ stats: complete, incomplete
● tracker-less mode; based on the Kademlia DHT
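
The client GET request could be assembled roughly like this. The function name, tracker URL, and peer id below are made up for illustration; the query parameter names are the ones listed on the slide plus `uploaded`, `downloaded`, and `left` from the public specification. The request is only built here, not sent.

```python
from urllib.parse import urlencode


def build_announce_url(tracker_url, info_hash, peer_id, port,
                       uploaded=0, downloaded=0, left=0, numwant=50):
    """Return the tracker GET URL; binary fields are percent-encoded."""
    params = {
        "info_hash": info_hash,   # 20 raw bytes identifying the torrent
        "peer_id": peer_id,       # 20 bytes chosen by the client
        "port": port,
        "uploaded": uploaded,
        "downloaded": downloaded,
        "left": left,
        "numwant": numwant,
    }
    return tracker_url + "?" + urlencode(params)


# Hypothetical values for illustration only.
url = build_announce_url(
    "http://tracker.example/announce",
    info_hash=bytes(range(20)),
    peer_id=b"-XX0001-123456789012",
    port=6881,
    left=614400,
)
print(url)
```

The tracker's reply is a bencoded dictionary, so a client would decode it before reading the interval and the peer list.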

9 Presentation outline
● Joining the system
● Encoding / metadata file
● Tracker protocol
● Peer wire protocol
● Piece selection
● Peer selection
● Client implementations
● Resources

10 Peer wire protocol
● implemented directly on top of TCP
● messages
  ○ handshake (maybe with bitfield)
  ○ keep-alive
  ○ choke / unchoke
  ○ interested / not interested
  ○ have (advertisement of a newly acquired piece)
  ○ request / piece
  ○ cancel (only used in “endgame mode”)
  ○ port (used in tracker-less mode)
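
A sketch of how these messages are framed on the wire. The byte layout (a 68-byte handshake, then messages with a 4-byte big-endian length prefix and a 1-byte message id) comes from the public specification, not from the slide; the helper names are hypothetical.

```python
import struct

PSTR = b"BitTorrent protocol"

# Message ids as assigned in the public specification.
MSG_IDS = {
    "choke": 0, "unchoke": 1, "interested": 2, "not_interested": 3,
    "have": 4, "bitfield": 5, "request": 6, "piece": 7, "cancel": 8,
    "port": 9,
}

# A keep-alive is a message with length zero and no id.
KEEP_ALIVE = struct.pack(">I", 0)


def handshake(info_hash, peer_id):
    """pstrlen, pstr, 8 reserved bytes, info_hash, peer_id (68 bytes)."""
    return bytes([len(PSTR)]) + PSTR + bytes(8) + info_hash + peer_id


def message(name, payload=b""):
    """Length prefix (id byte plus payload), message id, payload."""
    return struct.pack(">IB", 1 + len(payload), MSG_IDS[name]) + payload


def have(piece_index):
    return message("have", struct.pack(">I", piece_index))


def request(piece_index, begin, length):
    return message("request", struct.pack(">III", piece_index, begin, length))


def cancel(piece_index, begin, length):
    # Same payload as the request being cancelled (used in endgame mode).
    return message("cancel", struct.pack(">III", piece_index, begin, length))
```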

11 Piece selection
● when downloading starts: choose at random
  ○ get complete pieces as quickly as possible
  ○ obtain something to offer to others
● after we have 4 pieces: pick (local) rarest first
  ○ achieves the fastest replication of rare pieces
  ○ obtain something of value
  ○ only get unique pieces from the seed
● endgame mode
  ○ defense against the “last-block problem”
  ○ send requests for missing sub-pieces to all peers in our peer list
  ○ send cancel messages upon receipt of a sub-piece
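
The first two policies can be sketched in a few lines. This is an illustrative model only: the function name and data structures are assumptions, ties between equally rare pieces are broken at random, and endgame mode is left out.

```python
import random
from collections import Counter

RANDOM_FIRST_THRESHOLD = 4  # "after we have 4 pieces" on the slide


def pick_piece(have, peer_pieces, rng=random):
    """Choose the next piece index to download, or None if nothing is useful.

    have:        set of piece indexes we already hold
    peer_pieces: dict mapping each connected peer to the set of pieces it has
    """
    # Count how many connected peers hold each piece (local availability).
    availability = Counter()
    for pieces in peer_pieces.values():
        availability.update(pieces)

    candidates = [p for p in availability if p not in have]
    if not candidates:
        return None

    # Bootstrapping: pick at random until we have something to offer.
    if len(have) < RANDOM_FIRST_THRESHOLD:
        return rng.choice(candidates)

    # Afterwards: (local) rarest first.
    rarest = min(availability[p] for p in candidates)
    return rng.choice([p for p in candidates if availability[p] == rarest])
```

For example, with peers holding `{0..5}`, `{0..4}`, and `{0..3}` and pieces 0 to 3 already downloaded, piece 5 is chosen because only one peer has it.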

12 Last-block problem
● at the end of the download, a peer may have trouble finding the few missing pieces
● based on anecdotal evidence
● other proposals
  ○ network coding [Gkantsidis et al., Infocom’05]
  ○ prefer to upload to peers with similar file completeness; unfair for the peers having most of the pieces [Tian et al., Infocom’06]

13 Last-block problem – a myth?
● is it a problem after all?
● figure from [Legout et al., INRIA-TR-2006], with permission

14 Peer selection - unchoking
[Diagram: leecher A connected to a seed and to leechers B and C]
● periodically (typically every 10 seconds) calculate data-receiving rates
● upload to (unchoke) the fastest
● constant number of unchoking slots
● based on the “tit-for-tat” strategy
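
One unchoking round might look like the sketch below. The function name is hypothetical, and the default of 4 slots is an assumption; the slide only says that the number of slots is constant.

```python
def choose_unchoked(download_rates, interested, slots=4):
    """Return the peers to unchoke in this round, fastest first.

    download_rates: dict mapping peer -> rate at which we receive data from it
    interested:     set of peers that want pieces we have
    slots:          number of unchoking slots (assumed default)
    """
    ranked = sorted(
        (peer for peer in download_rates if peer in interested),
        key=lambda peer: download_rates[peer],
        reverse=True,
    )
    return ranked[:slots]


# Hypothetical rates in KB/s; peer "d" is fast enough but not interested.
rates = {"a": 10, "b": 50, "c": 30, "d": 40}
print(choose_unchoked(rates, {"a", "b", "c"}, slots=2))  # ['b', 'c']
```

Ranking by the rate at which data is received from each peer is what makes this tit-for-tat: peers that upload to us quickly are the ones we upload to.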

15 Optimistic unchoking
● periodically select a peer at random and upload to it
  ○ typically every 3 unchoking rounds (30 seconds)
● multi-purpose mechanism
  ○ allow bootstrapping of new clients
  ○ continuously look for the fastest partners
  ○ robustness: every peer has a non-zero chance of interacting with any other peer
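
A minimal sketch of the rotation, assuming rounds are numbered from 1 and that the optimistic slot changes on every third round. The function and parameter names are illustrative, not taken from any client.

```python
import random

OPTIMISTIC_EVERY = 3  # unchoking rounds, i.e. about 30 seconds at 10 s per round


def optimistic_unchoke(round_number, choked_interested, current, rng=random):
    """Return the peer holding the optimistic unchoke slot for this round.

    choked_interested: set of peers that are interested but currently choked
    current:           peer holding the optimistic slot so far (or None)
    """
    if round_number % OPTIMISTIC_EVERY == 0 and choked_interested:
        # Uniform random choice, independent of observed rates, so every
        # peer has a non-zero chance of being tried.
        return rng.choice(sorted(choked_interested))
    return current
```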

16 Seed unchoking
● old algorithm
  ○ unchoke the fastest leechers
  ○ problem: fastest peers may monopolize seeds
● new algorithm
  ○ periodically sort all leechers according to their last unchoke time
  ○ prefer the most recently unchoked leechers; on a tie, prefer the fastest
  ○ (presumably) achieves equal spread of seed bandwidth
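
The ordering used by the new algorithm, as described in the bullets above, can be sketched as a two-key sort. This covers only the ranking step; how a real client rotates leechers in and out of the slots is not described on the slide, and the tuple layout is an assumption.

```python
def seed_unchoke_order(leechers):
    """Rank leechers for a seed's unchoke slots.

    leechers: list of (peer, last_unchoke_time, upload_rate) tuples, where
              upload_rate is how fast the seed can send to that peer.
    Most recently unchoked first; ties are broken in favor of the fastest.
    """
    ranked = sorted(leechers, key=lambda entry: (-entry[1], -entry[2]))
    return [peer for peer, _, _ in ranked]


# Hypothetical timestamps (seconds) and rates (KB/s).
print(seed_unchoke_order([("a", 100, 5), ("b", 120, 1), ("c", 100, 9)]))
# ['b', 'c', 'a']
```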

17 Downloading only from seeds
[Diagram: leechers A, B, and C, a seed, and the tracker; arrows labeled request, peer list, and new list]
● repeatedly query the tracker for peer lists
● distinguish the seeds, and receive data from them
● violates fairness model; may be harmful to honest peers

18 Rate- vs. volume-based selection
● Proponents of rate-based decisions: [Cohen, P2PECON’03], and [INRIA TR’2006]
● Proponents of volume-based decisions: [Bharambe et al., MSR-TR-2005], [Gkantsidis et al., Infocom’05], [Jun et al., P2PECON’05], and eDonkey file-sharing system
● No clear winner yet!

19 Client implementations
● mainline: written in Python; right now, the only one employing the new seed unchoking algorithm
● Azureus: the most popular, written in Java; implements a special protocol between clients (e.g. peers can exchange peer lists)
● other popular clients: ABC, BitComet, BitLord, BitTornado, μTorrent, Opera browser
● various non-standard extensions
  ○ retaliation mode: detect compromised/malicious peers
  ○ anti-snubbing: ignore a peer who ignores us
  ○ super seeding: seed masquerading as a leecher

20 Resources #1
● Basic BitTorrent mechanisms [Cohen, P2PECON’03]
● BitTorrent specification Wiki http://wiki.theory.org/BitTorrentSpecification
● Measurement studies [Izal et al., PAM’04], [Pouwelse et al., Delft TR 2004 and IPTPS’05], [Guo et al., IMC’05], and [Legout et al., INRIA-TR-2006]

21 Resources #2
● Theoretical analysis and modeling [Qiu et al., SIGCOMM’04], and [Tian et al., Infocom’06]
● Simulations [Bharambe et al., MSR-TR-2005]
● Sharing incentives and exploiting them [Shneidman et al., PINS’04], [Jun et al., P2PECON’05], and [Liogkas et al., IPTPS’06]

22 Conclusion and food for thought
● BitTorrent is fast and robust
● Yet, many parameters are arbitrarily set
  ○ number of unchoking slots
  ○ unchoking round duration
  ○ size of pieces / sub-pieces
● What can we learn from BitTorrent for the design of future P2P content distribution protocols?

