Peer-to-Peer File Sharing Jennifer Rexford COS 461: Computer Networks Lectures: MW 10-10:50am in Architecture N101

Slides:



Advertisements
Similar presentations
Professor Yashar Ganjali Department of Computer Science University of Toronto
Advertisements

The BitTorrent Protocol. What is BitTorrent?  Efficient content distribution system using file swarming. Does not perform all the functions of a typical.
The BitTorrent protocol A peer-to-peer file sharing protocol.
Incentives Build Robustness in BitTorrent Bram Cohen.
Peer-to-Peer (P2P) Networks
Chapter 2 Application Layer Computer Networking: A Top Down Approach, 5 th edition. Jim Kurose, Keith Ross Addison-Wesley, April A note on the use.
Application Layer 2-1 Chapter 2 Application Layer Computer Networking: A Top Down Approach 6 th edition Jim Kurose, Keith Ross Addison-Wesley March 2012.
CSE 486/586, Spring 2014 CSE 486/586 Distributed Systems Peer-to-Peer Architecture Steve Ko Computer Sciences and Engineering University at Buffalo.
CompSci 356: Computer Network Architectures Lecture 21: Content Distribution Chapter 9.4 Xiaowei Yang
Peer to Peer (P2P) Networks and File sharing. By: Ryan Farrell.
No Class on Friday There will be NO class on: FRIDAY 1/30/15.
Cis e-commerce -- lecture #6: Content Distribution Networks and P2P (based on notes from Dr Peter McBurney © )
Spotlighting Decentralized P2P File Sharing Archie Kuo and Ethan Le Department of Computer Science San Jose State University.
Presented by Stephen Kozy. Presentation Outline Definition and explanation Comparison and Examples Advantages and Disadvantages Illegal and Legal uses.
CSE 124 Networked Services Fall 2009 B. S. Manoj, Ph.D 11/03/2009CSE 124 Network Services FA 2009 Some of these.
1 Peer-to-Peer Applications Reading: 9.4 COS 461: Computer Networks Spring 2008 (MW 1:30-2:50 in COS 105) Jennifer Rexford Teaching Assistants: Sunghwan.
Winter 2008 P2P1 Peer-to-Peer Networks: Unstructured and Structured What is a peer-to-peer network? Unstructured Peer-to-Peer Networks –Napster –Gnutella.
CS 640: Introduction to Computer Networks Yu-Chi Lai Lecture 18 - Peer-to-Peer.
Introduction to Peer-to-Peer Networks. What is a P2P network Uses the vast resource of the machines at the edge of the Internet to build a network that.
P2P File Sharing Systems
1 Napster & Gnutella An Overview. 2 About Napster Distributed application allowing users to search and exchange MP3 files. Written by Shawn Fanning in.
Introduction Widespread unstructured P2P network
By Shobana Padmanabhan Sep 12, 2007 CSE 473 Class #4: P2P Section 2.6 of textbook (some pictures here are from the book)
BitTorrent Internet Technologies and Applications.

1 BitTorrent System Efrat Oune Bar-Ilan What is BitTorrent? BitTorrent is a peer-to-peer file distribution system (built for intensive daily use.
Introduction to Peer-to-Peer Networks. What is a P2P network A P2P network is a large distributed system. It uses the vast resource of PCs distributed.
P2P Web Standard IS3734/19/10 Michael Radzin. What is P2P? Peer to Peer Networking (P2P) is a “direct communications initiations session.” Modern uses.
Introduction of P2P systems
BitTorrent Dr. Yingwu Zhu. Bittorrent A popular P2P application for file exchange!
A P2P file distribution system ——BitTorrent Pegasus Team CMPE 208.
Peer-to-Peer Networks University of Jordan. Server/Client Model What?
Chapter 2: Application layer
2: Application Layer1 Chapter 2: Application layer r 2.1 Principles of network applications r 2.2 Web and HTTP r 2.3 FTP r 2.4 Electronic Mail  SMTP,
Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications Xiaozhou Li COS 461: Computer Networks (precept 04/06/12) Princeton University.
Super-peer Network. Motivation: Search in P2P Centralised (Napster) Flooding (Gnutella)  Essentially a breadth-first search using TTLs Distributed Hash.
Application Layer 2-1 Chapter 2 Application Layer Computer Networking: A Top Down Approach 6 th edition Jim Kurose, Keith Ross Addison-Wesley March 2012.
The Start Shawn Fanning (19-yr-old student nicknamed Napster) developed the original Napster application and service in January 1999 while a freshman.
1 Peer-to-Peer Systems r Application-layer architectures r Case study: BitTorrent r P2P Search and Distributed Hash Table (DHT)
2: Application Layer1 Chapter 2: Application layer r 2.1 Principles of network applications  app architectures  app requirements r 2.2 Web and HTTP r.
OVERVIEW Lecture 8 Distributed Hash Tables. Hash Table r Name-value pairs (or key-value pairs) m E.g,. “Mehmet Hadi Gunes” and m E.g.,
Peer-to-Peer Networks Hongli Luo CEIT, IPFW. r Topics m Application architecture m P2P file sharing m P2P networks: Napster Gnutella KaAzA Bittorrent.
2: Application Layer1 Chapter 2 Application Layer Computer Networking: A Top Down Approach 6 th edition Jim Kurose, Keith Ross Addison-Wesley March 2012.
CS 640: Introduction to Computer Networks Aditya Akella Lecture 24 - Peer-to-Peer.
Peer-to-Peer Systems.  Quickly grown in popularity:  Dozens or hundreds of file sharing applications  In 2004: 35 million adults used P2P networks.
2: Application Layer 1 Chapter 2 Application Layer Computer Networking: A Top Down Approach, 5 th edition. Jim Kurose, Keith Ross Addison-Wesley, April.
Professor Yashar Ganjali Department of Computer Science University of Toronto
Hui Zhang, Fall Computer Networking CDN and P2P.
Peer-to-Peer File Sharing
NETE4631 Network Information Systems (NISs): Peer-to-Peer (P2P) Suronapee, PhD 1.
ADVANCED COMPUTER NETWORKS Peer-Peer (P2P) Networks 1.
Bit Torrent Nirav A. Vasa. Topics What is BitTorrent? Related Terms How BitTorrent works Steps involved in the working Advantages and Disadvantages.
Peer to Peer Networking. Network Models => Mainframe Ex: Terminal User needs direct connection to mainframe Secure Account driven  administrator controlled.
1 Overlay Networks. 2 Routing overlays –Experimental versions of IP (e.g., 6Bone) –Multicast (e.g., MBone and end-system multicast) –Robust routing (e.g.,
INTERNET TECHNOLOGIES Week 10 Peer to Peer Paradigm 1.
P2P Search COP6731 Advanced Database Systems. P2P Computing  Powerful personal computer Share computing resources P2P Computing  Advantages: Shared.
P2P Search COP P2P Search Techniques Centralized P2P systems  e.g. Napster, Decentralized & unstructured P2P systems  e.g. Gnutella.
PEAR TO PEAR PROTOCOL. Pure P2P architecture no always-on server arbitrary end systems directly communicate peers are intermittently connected and change.
November 19, 2016 Guide:- Mrs. Kale J. S. Presented By:- Hamand Amol Sambhaji. Hamand Amol Sambhaji. Pardeshi Dhananjay Rajendra. Pardeshi Dhananjay Rajendra.
An example of peer-to-peer application
Introduction to BitTorrent
CPE 401/601 Computer Network Systems
The BitTorrent Protocol
Content Distribution Networks + P2P File Sharing
CPE 401/601 Computer Network Systems
Pure P2P architecture no always-on server
Chapter 2 Application Layer
Content Distribution Networks + P2P File Sharing
CSE 486/586 Distributed Systems Peer-to-Peer Architecture --- 1
Presentation transcript:

Peer-to-Peer File Sharing Jennifer Rexford COS 461: Computer Networks Lectures: MW 10-10:50am in Architecture N101

2 Server Distributing a Large File d1d1 F bits d2d2 d3d3 d4d4 upload rate u s download rates d i Internet

3 Server Distributing a Large File Sending an F-bit file to N receivers – Transmitting NF bits at rate u s – … takes at least NF/u s time Receiving the data at the slowest receiver – Slowest receiver has download rate d min = min i {d i } – … takes at least F/d min time Download time: max{NF/u s, F/d min }

4 Speeding Up the File Distribution Increase the server upload rate – Higher link bandwidth at the server – Multiple servers, each with their own link Alternative: have the receivers help – Receivers get a copy of the data – … and redistribute to other receivers – To reduce the burden on the server

5 Peers Help Distributing a Large File d1d1 F bits d2d2 d3d3 d4d4 upload rate u s download rates d i Internet u1u1 u2u2 u3u3 u4u4 upload rates u i

6 Peers Help Distributing a Large File Components of distribution latency – Server must send each bit: min time F/u s – Slowest peer must receive each bit: min time F/d min Upload time using all upload resources – Total number of bits: NF – Total upload bandwidth u s + sum i (u i ) Total: max{F/u s, F/d min, NF/(u s +sum i (u i ))}

7 Peer-to-Peer is Self-Scaling Download time grows slowly with N – Client-server: max{NF/u s, F/d min } – Peer-to-peer: max{F/u s, F/d min, NF/(u s +sum i (u i ))} But… – Peers may come and go – Peers need to find each other – Peers need to be willing to help each other

8 Locating the Relevant Peers Three main approaches – Central directory (Napster) – Query flooding (Gnutella) – Hierarchical overlay (Kazaa, modern Gnutella) Design goals – Scalability – Simplicity – Robustness – Plausible deniability

Peer-to-Peer Networks: Napster Napster history: the rise – 1/99: Napster version 1.0 – 5/99: company founded – 12/99: first lawsuits – 2000: 80 million users Napster history: the fall – Mid 2001: out of business due to lawsuits – Mid 2001: dozens of decentralized P2P alternatives – 2003: growth of pay services like iTunes 9 Shawn Fanning, Northeastern freshman

Napster Directory Service Client contacts Napster (via TCP) – Provides a list of music files it will share – … and Napster’s central server updates the directory Client searches on a title or performer – Napster identifies online clients with the file – … and provides their IP addresses Client requests the file from the chosen supplier – Supplier transmits the file to the client – Both client and supplier report status to Napster 10

11 Napster Properties Server’s directory continually updated – Always know what music is currently available – Point of vulnerability for legal action Peer-to-peer file transfer – No load on the server – Plausible deniability for legal action (but not enough) Bandwidth – Suppliers ranked by apparent bandwidth and response time

Napster: Limitations of Directory Single point of failure Performance bottleneck Copyright infringement So, later P2P systems were more distributed – Gnutella went to the other extreme… 12 File transfer is decentralized, but locating content is highly centralized

13 Peer-to-Peer Networks: Gnutella Gnutella history – 2000: J. Frankel & T. Pepper released Gnutella – Soon after: many other clients (e.g., Morpheus, Limewire, Bearshare) – 2001: protocol enhancements, e.g., “ultrapeers” Query flooding – Join: contact a few nodes to become neighbors – Publish: no need! – Search: ask neighbors, who ask their neighbors – Fetch: get file directly from another node

Gnutella: Search by Flooding xyz.mp3 ? xyz.mp3 Flooding 14 search

Gnutella: Search by Flooding xyz.mp3 ? xyz.mp3 Flooding 15 search

Gnutella: Search by Flooding transfer 16

Gnutella: Pros and Cons Advantages – Fully decentralized – Search cost distributed – Processing per node permits powerful search semantics Disadvantages – Search scope may be quite large – Search time may be quite long – High overhead, and nodes come and go often 17

Peer-to-Peer Networks: KaAzA KaZaA history – 2001: created by Dutch company (Kazaa BV) – Single network called FastTrack used by other clients as well – Eventually protocol changed so others could no longer use it Super-node hierarchy – Join: on start, the client contacts a super-node – Publish: client sends list of files to its super-node – Search: queries flooded among super-nodes – Fetch: get file directly from one or more peers 18

KaZaA: Motivation for Super-Nodes Query consolidation – Many connected nodes may have only a few files – Propagating query to a sub-node may take more time than for the super-node to answer itself Stability – Super-node selection favors nodes with high up- time – How long you’ve been on is a good predictor of how long you’ll be around in the future 19

Peer-to-Peer Networks: BitTorrent BitTorrent history – 2002: B. Cohen debuted BitTorrent Emphasis on efficient fetching, not searching – Distribute same file to many peers – Single publisher, many downloaders Preventing free-loading – Incentives for peers to contribute 20

BitTorrent: Simultaneous Downloads Divide file into many chunks (e.g., 256 KB) – Replicate different chunks on different peers – Peers can trade chunks with other peers – Peer can (hopefully) assemble the entire file Allows simultaneous downloading – Retrieving different chunks from different peers – And uploading chunks to peers – Important for very large files 21

BitTorrent: Tracker Infrastructure node – Keeps track of peers participating in the torrent – Peers registers with the tracker when it arrives Tracker selects peers for downloading – Returns a random set of peer IP addresses – So the new peer knows who to contact for data Can have “trackerless” system – Using distributed hash tables (DHTs) 22

23 BitTorrent: Overall Architecture Web page with link to.torrent A B C Peer [Leech] Downloader “US” Peer [Seed] Peer [Leech] Tracker Web Server.torrent

24 BitTorrent: Overall Architecture Web page with link to.torrent A B C Peer [Leech] Downloader “US” Peer [Seed] Peer [Leech] Tracker Get-announce Web Server

25 BitTorrent: Overall Architecture Web page with link to.torrent A B C Peer [Leech] Downloader “US” Peer [Seed] Peer [Leech] Tracker Response-peer list Web Server

26 BitTorrent: Overall Architecture Web page with link to.torrent A B C Peer [Leech] Downloader “US” Peer [Seed] Peer [Leech] Tracker Shake-hand Web Server Shake-hand

27 BitTorrent: Overall Architecture Web page with link to.torrent A B C Peer [Leech] Downloader “US” Peer [Seed] Peer [Leech] Tracker pieces Web Server

28 BitTorrent: Overall Architecture Web page with link to.torrent A B C Peer [Leech] Downloader “US” Peer [Seed] Peer [Leech] Tracker pieces Web Server

29 BitTorrent: Overall Architecture Web page with link to.torrent A B C Peer [Leech] Downloader “US” Peer [Seed] Peer [Leech] Tracker Get-announce Response-peer list pieces Web Server

30 BitTorrent: Chunk Request Order Which chunks to request? – Could download in order – Like an HTTP client does Problem: many peers have the early chunks – Peers have little to share with each other – Limiting the scalability of the system Problem: eventually nobody has rare chunks – E.g., the chunks need the end of the file – Limiting the ability to complete a download Solutions: random selection and rarest first

31 BitTorrent: Rarest Chunk First Which chunks to request first? – The chunk with the fewest available copies – I.e., the rarest chunk first Benefits to the peer – Avoid starvation when some peers depart Benefits to the system – Avoid starvation across all peers wanting a file – Balance load by equalizing # of copies of chunks

Free-Riding in P2P Networks Vast majority of users are free-riders – Most share no files and answer no queries – Others limit # of connections or upload speed A few “peers” essentially act as servers – A few individuals contributing to the public good – Making them hubs that basically act as a server BitTorrent prevent free riding – Allow the fastest peers to download from you – Occasionally let some free loaders download 32

33 Bit-Torrent: Preventing Free-Riding Peer has limited upload bandwidth – And must share it among multiple peers – Tit-for-tat: favor neighbors uploading at highest rate Rewarding the top four neighbors – Measure download bit rates from each neighbor – Reciprocate by sending to the top four peers Optimistic unchoking – Randomly try a new neighbor every 30 seconds – So new neighbor has a chance to be a better partner

34 BitTyrant: Gaming BitTorrent BitTorrent can be gamed, too – Peer uploads to top N peers at rate 1/N – E.g., if N=4 and peers upload at 15, 12, 10, 9, 8, 3 – … peer uploading at rate 9 gets treated quite well Best to be the N th peer in the list, rather than 1 st – Offer just a bit more bandwidth than low-rate peers – And you’ll still be treated well by others BitTyrant software – Uploads at higher rates to higher-bandwidth peers –

Conclusions Finding the appropriate peers – Centralized directory (Napster) – Query flooding (Gnutella) – Super-nodes (KaZaA) BitTorrent – Distributed download of large files – Anti-free-riding techniques Great example of how change can happen so quickly in application-level protocols 35