BitTorrent CS514 Vivek Vishnumurthy, TA. Common Scenario Millions want to download the same popular huge files (for free) –ISO’s –Media (the real example!)

Slides:



Advertisements
Similar presentations
The BitTorrent Protocol
Advertisements

The BitTorrent Protocol. What is BitTorrent?  Efficient content distribution system using file swarming. Does not perform all the functions of a typical.
Incentives Build Robustness in BitTorrent Author: Bram Cohen Presenter: Brian Liao.
Incentives Build Robustness in BitTorrent- Bram Cohen Presented by Venkatesh Samprati.
The BitTorrent protocol A peer-to-peer file sharing protocol.
Incentives Build Robustness in BitTorrent Bram Cohen.
CS5412: Torrents and Tit-for-Tat
Bit Torrent (Nick Feamster) February 25, BitTorrent Steps for publishing – Peer creates.torrent file and uploads to a web server: contains metadata.
Stochastic Analysis of File Swarming Systems The Chinese University of Hong Kong John C.S. Lui Collaborators: D.M. Chiu, M.H. Lin, B. Fan.
The BitTorrent content distribution system CS217 Advanced Topics in Internet Research Guest Lecture Nikitas Liogkas, 5/11/2006.
CompSci 356: Computer Network Architectures Lecture 21: Content Distribution Chapter 9.4 Xiaowei Yang
Peer to Peer (P2P) Networks and File sharing. By: Ryan Farrell.
CMPT 401 Summer 2007 Dr. Alexandra Fedorova Lecture XV: Real P2P Systems.
Game Theory Presented by Hakim Weatherspoon. Game Theory Main Question: Can we cheat (and get away with it)? BitTorrent –P2P file distribution tool designed.
Ken Birman Cornell University. CS5410 Fall
Game Theory Presented by Hakim Weatherspoon. Game Theory BitTorrent Do Incentives Build Robustness in BitTorrent? BAR Gossip.
FRIENDS: File Retrieval In a dEcentralized Network Distribution System Steven Huang, Kevin Li Computer Science and Engineering University of California,
Spotlighting Decentralized P2P File Sharing Archie Kuo and Ethan Le Department of Computer Science San Jose State University.
Paul Solomine Security of P2P Systems. P2P Systems Used to download copyrighted files illegally. The RIAA is watching you… Spyware! General users become.
Peer-to-Peer (or P2P) From user to user. Peer-to-peer implies that either side can initiate a session and has equal responsibility. Corey Chan Andrew Merfeld.
A P2P file distribution system ——BitTorrent Fan Bin Sep,25,2004.
Topics in Reliable Distributed Systems Fall Dr. Idit Keidar.
Incentives Build Robustness in BitTorrent 1st Workshop on Economics of Peer-to-Peer Systems 2003 Bram Cohen
P2P WeeSan Lee
BitTorrent Background. Common Scenario Millions want to download the same popular huge files (for free) –ISO’s –Media (the real example!) Client-server.
Middleware for P2P architecture Jikai Yin, Shuai Zhang, Ziwen Zhang.
CS 640: Introduction to Computer Networks Yu-Chi Lai Lecture 18 - Peer-to-Peer.
Client-Server vs P2P or, HTTP vs Bittorrent. Client-Server Architecture SERVER client.
The Bittorrent Protocol
P2P File Sharing Systems
Freenet. Anonymity  Napster, Gnutella, Kazaa do not provide anonymity  Users know who they are downloading from  Others know who sent a query  Freenet.
Content Overlays (Nick Feamster) February 25, 2008.
Bit Torrent (Nick Feamster) February 25, BitTorrent Steps for publishing – Peer creates.torrent file and uploads to a web server: contains metadata.
Network Layer (3). Node lookup in p2p networks Section in the textbook. In a p2p network, each node may provide some kind of service for other.
BitTorrent Presentation by: NANO Surmi Chatterjee Nagakalyani Padakanti Sajitha Iqbal Reetu Sinha Fatemeh Marashi.
1 V1-Filename.ppt / yyyy-mm-dd / Initials P2P content distribution T Applications and Services in Internet, Fall 2008 Jukka K. Nurminen.
Peer to Peer Network Anas Hardan. What is a Network? What is a Network? A network is a group of computers and other devices (such as printers) that are.
BitTorrent.
BitTorrent Internet Technologies and Applications.

BitTorrent How it applies to networking. What is BitTorrent P2P file sharing protocol Allows users to distribute large amounts of data without placing.
1 BitTorrent System Efrat Oune Bar-Ilan What is BitTorrent? BitTorrent is a peer-to-peer file distribution system (built for intensive daily use.
P2P Web Standard IS3734/19/10 Michael Radzin. What is P2P? Peer to Peer Networking (P2P) is a “direct communications initiations session.” Modern uses.
BitTorrent Dr. Yingwu Zhu. Bittorrent A popular P2P application for file exchange!
A P2P file distribution system ——BitTorrent Pegasus Team CMPE 208.
2: Application Layer1 Chapter 2: Application layer r 2.1 Principles of network applications r 2.2 Web and HTTP r 2.3 FTP r 2.4 Electronic Mail  SMTP,
Bit Torrent A good or a bad?. Common methods of transferring files in the internet: Client-Server Model Peer-to-Peer Network.
2: Application Layer1 Chapter 2: Application layer r 2.1 Principles of network applications  app architectures  app requirements r 2.2 Web and HTTP r.
Peer-to-Peer File Sharing Jennifer Rexford COS 461: Computer Networks Lectures: MW 10-10:50am in Architecture N101
1 Peer-to-Peer Technologies Seminar by: Kunal Goswami (05IT6006) School of Information Technology Guided by: Prof. C.R.Mandal, School of Information Technology.
B IT T ORRENT T ECHNOLOGY Anthony Pervetich. H ISTORY Bram Cohen Designed the BitTorrent protocol in April 2001 Released July 2, 2001 Concept Late 90’s.
CS 640: Introduction to Computer Networks Aditya Akella Lecture 24 - Peer-to-Peer.
Peer-to-Peer File Sharing
ADVANCED COMPUTER NETWORKS Peer-Peer (P2P) Networks 1.
Peer to Peer Network Design Discovery and Routing algorithms
CS Spring 2014 CS 414 – Multimedia Systems Design Lecture 37 – Introduction to P2P (Part 1) Klara Nahrstedt.
Bit Torrent Nirav A. Vasa. Topics What is BitTorrent? Related Terms How BitTorrent works Steps involved in the working Advantages and Disadvantages.
Peer to Peer Networking. Network Models => Mainframe Ex: Terminal User needs direct connection to mainframe Secure Account driven  administrator controlled.
INTERNET TECHNOLOGIES Week 10 Peer to Peer Paradigm 1.
© 2016 A. Haeberlen, Z. Ives CIS 455/555: Internet and Web Systems 1 University of Pennsylvania Decentralized systems February 15, 2016.
Peer-to-Peer Networks 10 Fast Download Christian Schindelhauer Technical Faculty Computer-Networks and Telematics University of Freiburg.
Chapter 29 Peer-to-Peer Paradigm Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display.
November 19, 2016 Guide:- Mrs. Kale J. S. Presented By:- Hamand Amol Sambhaji. Hamand Amol Sambhaji. Pardeshi Dhananjay Rajendra. Pardeshi Dhananjay Rajendra.
An example of peer-to-peer application
Introduction to BitTorrent
BitTorrent Vs Gnutella.
The BitTorrent Protocol
Content Distribution Networks + P2P File Sharing
Content Distribution Networks + P2P File Sharing
Presentation transcript:

BitTorrent CS514 Vivek Vishnumurthy, TA

Common Scenario Millions want to download the same popular huge files (for free) –ISO’s –Media (the real example!) Client-server model fails –Single server fails –Can’t afford to deploy enough servers

IP Multicast? Recall: IP Multicast not a real option in general settings –Not scalable –Only used in private settings Alternatives –End-host based Multicast –BitTorrent –Other P2P file-sharing schemes (later in lecture)

Router “Interested” End-host Source

Router “Interested” End-host Source Client-Server

Router “Interested” End-host Source Client-Server Overloaded!

Router “Interested” End-host Source IP multicast

Router “Interested” End-host Source End-host based multicast

“Single-uploader”  “Multiple-uploaders” –Lots of nodes want to download –Make use of their uploading abilities as well –Node that has downloaded (part of) file will then upload it to other nodes.  Uploading costs amortized across all nodes

End-host based multicast Also called “Application-level Multicast” Many protocols proposed early this decade –Yoid (2000), Narada (2000), Overcast (2000), ALMI (2001) All use single trees Problem with single trees?

End-host multicast using single tree Source

End-host multicast using single tree Source

End-host multicast using single tree Source Slow data transfer

End-host multicast using single tree Tree is “push-based” – node receives data, pushes data to children Failure of “interior”-node affects downloads in entire subtree rooted at node Slow interior node similarly affects entire subtree Also, leaf-nodes don’t do any sending! Though later multi-tree / multi-path protocols (Chunkyspread (2006), Chainsaw (2005), Bullet (2003)) mitigate some of these issues

BitTorrent Written by Bram Cohen (in Python) in 2001 “Pull-based” “swarming” approach –Each file split into smaller pieces –Nodes request desired pieces from neighbors As opposed to parents pushing data that they receive –Pieces not downloaded in sequential order –Previous multicast schemes aimed to support “streaming”; BitTorrent does not Encourages contribution by all nodes

BitTorrent Swarm Swarm –Set of peers all downloading the same file –Organized as a random mesh Each node knows list of pieces downloaded by neighbors Node requests pieces it does not own from neighbors –Exact method explained later

How a node enters a swarm for file “popeye.mp4” File popeye.mp4.torrent hosted at a (well-known) webserver The.torrent has address of tracker for file The tracker, which runs on a webserver as well, keeps track of all peers downloading file

How a node enters a swarm for file “popeye.mp4” Peer 1 popeye.mp4.torrent File popeye.mp4.torrent hosted at a (well-known) webserver The.torrent has address of tracker for file The tracker, which runs on a webserver as well, keeps track of all peers downloading file

How a node enters a swarm for file “popeye.mp4” Peer Tracker Addresses of peers 2 File popeye.mp4.torrent hosted at a (well-known) webserver The.torrent has address of tracker for file The tracker, which runs on a webserver as well, keeps track of all peers downloading file

How a node enters a swarm for file “popeye.mp4” Peer Tracker3 Swarm File popeye.mp4.torrent hosted at a (well-known) webserver The.torrent has address of tracker for file The tracker, which runs on a webserver as well, keeps track of all peers downloading file

Contents of.torrent file URL of tracker Piece length – Usually 256 KB SHA-1 hashes of each piece in file –For reliability “files” – allows download of multiple files

Terminology Seed: peer with the entire file –Original Seed: The first seed Leech: peer that’s downloading the file –Fairer term might have been “downloader” Sub-piece: Further subdivision of a piece –The “unit for requests” is a subpiece –But a peer uploads only after assembling complete piece

Peer-peer transactions: Choosing pieces to request Rarest-first: Look at all pieces at all peers, and request piece that’s owned by fewest peers –Increases diversity in the pieces downloaded avoids case where a node and each of its peers have exactly the same pieces; increases throughput –Increases likelihood all pieces still available even if original seed leaves before any one node has downloaded entire file

Choosing pieces to request Random First Piece: –When peer starts to download, request random piece. So as to assemble first complete piece quickly Then participate in uploads –When first complete piece assembled, switch to rarest-first

Choosing pieces to request End-game mode: –When requests sent for all sub-pieces, (re)send requests to all peers. –To speed up completion of download –Cancel request for downloaded sub-pieces

Tit-for-tat as incentive to upload Want to encourage all peers to contribute Peer A said to choke peer B if it (A) decides not to upload to B Each peer (say A) unchokes at most 4 interested peers at any time –The three with the largest upload rates to A Where the tit-for-tat comes in –Another randomly chosen (Optimistic Unchoke) To periodically look for better choices

Anti-snubbing A peer is said to be snubbed if each of its peers chokes it To handle this, snubbed peer stops uploading to its peers  Optimistic unchoking done more often –Hope is that will discover a new peer that will upload to us

Why BitTorrent took off Better performance through “pull-based” transfer –Slow nodes don’t bog down other nodes Allows uploading from hosts that have downloaded parts of a file –In common with other end-host based multicast schemes

Why BitTorrent took off Practical Reasons (perhaps more important!) –Working implementation (Bram Cohen) with simple well-defined interfaces for plugging in new content –Many recent competitors got sued / shut down Napster, Kazaa –Doesn’t do “search” per se. Users use well-known, trusted sources to locate content Avoids the pollution problem, where garbage is passed off as authentic content

Pros and cons of BitTorrent Pros –Proficient in utilizing partially downloaded files –Discourages “freeloading” By rewarding fastest uploaders –Encourages diversity through “rarest-first” Extends lifetime of swarm Works well for “hot content”

Pros and cons of BitTorrent Cons –Assumes all interested peers active at same time; performance deteriorates if swarm “cools off” –Even worse: no trackers for obscure content

Pros and cons of BitTorrent Dependence on centralized tracker: pro/con? –  Single point of failure: New nodes can’t enter swarm if tracker goes down –Lack of a search feature Prevents pollution attacks  Users need to resort to out-of-band search: well known torrent-hosting sites / plain old web-search

“Trackerless” BitTorrent To be more precise, “BitTorrent without a centralized-tracker” E.g.: Azureus Uses a Distributed Hash Table (Kademlia DHT) Tracker run by a normal end-host (not a web- server anymore) –The original seeder could itself be the tracker –Or have a node in the DHT randomly picked to act as the tracker

Why is (studying) BitTorrent important? (From CacheLogic, 2004)

Why is (studying) BitTorrent important? BitTorrent consumes significant amount of internet traffic today –In 2004, BitTorrent accounted for 30% of all internet traffic (Total P2P was 60%), according to CacheLogic –Slightly lower share in 2005 (possibly because of legal action), but still significant –BT always used for legal software (linux iso) distribution too –Recently: legal media downloads (Fox)

Other file-sharing systems Prominent earlier: Napster, Kazaa, Gnutella Current popular file-sharing client: eMule –Connects to the ed2k and Kad networks –ed2k has a supernode-ish architecture (distinction between servers and normal clients) –Kad based on the Kademlia DHT

File-sharing systems… (Anecdotally) Better than BitTorrent in finding obscure items Vulnerable to: –Pollution attacks: Garbage data inserted with the same file name; hard to distinguish –Index-poisoning attacks (sneakier): Insert bogus entries pointing to non-existant files –Kazaa reportedly has more than 50% pollution + poisoning

References BitTorrent –“Incentives build robustness in BitTorrent”, Bram Cohen –BitTorrent Protocol Specification: Poisoning/Pollution in DHT’s: –“Index Poisoning Attack in P2P file sharing systems” –“Pollution in P2P File Sharing Systems”