CSE 486/586 Distributed Systems Peer-to-Peer Architecture --- 1

Slides:



Advertisements
Similar presentations
Peer-to-Peer and Social Networks An overview of Gnutella.
Advertisements

Course on Computer Communication and Networks Lecture 10 Chapter 2; peer-to-peer applications (and network overlays) EDA344/DIT 420, CTH/GU.
The BitTorrent Protocol. What is BitTorrent?  Efficient content distribution system using file swarming. Does not perform all the functions of a typical.
The BitTorrent protocol A peer-to-peer file sharing protocol.
Incentives Build Robustness in BitTorrent Bram Cohen.
Chapter 2 Application Layer Computer Networking: A Top Down Approach, 5 th edition. Jim Kurose, Keith Ross Addison-Wesley, April A note on the use.
CSE 486/586, Spring 2014 CSE 486/586 Distributed Systems Peer-to-Peer Architecture Steve Ko Computer Sciences and Engineering University at Buffalo.
CompSci 356: Computer Network Architectures Lecture 21: Content Distribution Chapter 9.4 Xiaowei Yang
Peer to Peer (P2P) Networks and File sharing. By: Ryan Farrell.
Peer-to-Peer Networks João Guerreiro Truong Cong Thanh Department of Information Technology Uppsala University.
Cis e-commerce -- lecture #6: Content Distribution Networks and P2P (based on notes from Dr Peter McBurney © )
Spotlighting Decentralized P2P File Sharing Archie Kuo and Ethan Le Department of Computer Science San Jose State University.
Presented by Stephen Kozy. Presentation Outline Definition and explanation Comparison and Examples Advantages and Disadvantages Illegal and Legal uses.
1 Peer-to-Peer Applications Reading: 9.4 COS 461: Computer Networks Spring 2008 (MW 1:30-2:50 in COS 105) Jennifer Rexford Teaching Assistants: Sunghwan.
Winter 2008 P2P1 Peer-to-Peer Networks: Unstructured and Structured What is a peer-to-peer network? Unstructured Peer-to-Peer Networks –Napster –Gnutella.
CS 640: Introduction to Computer Networks Yu-Chi Lai Lecture 18 - Peer-to-Peer.
CSE 461 University of Washington1 Topic Peer-to-peer content delivery – Runs without dedicated infrastructure – BitTorrent as an example Peer.
P2P File Sharing Systems
Introduction Widespread unstructured P2P network
BitTorrent Presentation by: NANO Surmi Chatterjee Nagakalyani Padakanti Sajitha Iqbal Reetu Sinha Fatemeh Marashi.
By Shobana Padmanabhan Sep 12, 2007 CSE 473 Class #4: P2P Section 2.6 of textbook (some pictures here are from the book)

Application Layer – Peer-to-peer UIUC CS438: Communication Networks Summer 2014 Fred Douglas Slides: Fred, Kurose&Ross (sometimes edited)
BitTorrent How it applies to networking. What is BitTorrent P2P file sharing protocol Allows users to distribute large amounts of data without placing.
1 Telematica di Base Applicazioni P2P. 2 The Peer-to-Peer System Architecture  peer-to-peer is a network architecture where computer resources and services.
1 BitTorrent System Efrat Oune Bar-Ilan What is BitTorrent? BitTorrent is a peer-to-peer file distribution system (built for intensive daily use.
Introduction of P2P systems
Chapter 2: Application layer
2: Application Layer1 Chapter 2: Application layer r 2.1 Principles of network applications r 2.2 Web and HTTP r 2.3 FTP r 2.4 Electronic Mail  SMTP,
2: Application Layer1 Chapter 2 outline r 2.1 Principles of app layer protocols r 2.2 Web and HTTP r 2.3 FTP r 2.4 Electronic Mail r 2.5 DNS r 2.6 Socket.
1 Peer-to-Peer Systems r Application-layer architectures r Case study: BitTorrent r P2P Search and Distributed Hash Table (DHT)
2: Application Layer1 Chapter 2: Application layer r 2.1 Principles of network applications  app architectures  app requirements r 2.2 Web and HTTP r.
Peer-to-Peer File Sharing Jennifer Rexford COS 461: Computer Networks Lectures: MW 10-10:50am in Architecture N101
OVERVIEW Lecture 8 Distributed Hash Tables. Hash Table r Name-value pairs (or key-value pairs) m E.g,. “Mehmet Hadi Gunes” and m E.g.,
Peer-to-Peer Networks Hongli Luo CEIT, IPFW. r Topics m Application architecture m P2P file sharing m P2P networks: Napster Gnutella KaAzA Bittorrent.
Peer to Peer A Survey and comparison of peer-to-peer overlay network schemes And so on… Chulhyun Park
1 Indranil Gupta (Indy) Lecture 4 Peer to Peer Systems January 30, 2014 All Slides © IG CS 525 Advanced Distributed Systems Spring 2014.
CS 640: Introduction to Computer Networks Aditya Akella Lecture 24 - Peer-to-Peer.
Peer-to-Peer Systems.  Quickly grown in popularity:  Dozens or hundreds of file sharing applications  In 2004: 35 million adults used P2P networks.
Peer-to-Peer File Sharing
NETE4631 Network Information Systems (NISs): Peer-to-Peer (P2P) Suronapee, PhD 1.
ADVANCED COMPUTER NETWORKS Peer-Peer (P2P) Networks 1.
Peer-to-peer systems (part I) Slides by Indranil Gupta (modified by N. Vaidya)
CSE 486/586, Spring 2014 CSE 486/586 Distributed Systems Distributed Hash Tables Steve Ko Computer Sciences and Engineering University at Buffalo.
CS Spring 2014 CS 414 – Multimedia Systems Design Lecture 37 – Introduction to P2P (Part 1) Klara Nahrstedt.
Bit Torrent Nirav A. Vasa. Topics What is BitTorrent? Related Terms How BitTorrent works Steps involved in the working Advantages and Disadvantages.
Peer to Peer Networking. Network Models => Mainframe Ex: Terminal User needs direct connection to mainframe Secure Account driven  administrator controlled.
1 Overlay Networks. 2 Routing overlays –Experimental versions of IP (e.g., 6Bone) –Multicast (e.g., MBone and end-system multicast) –Robust routing (e.g.,
INTERNET TECHNOLOGIES Week 10 Peer to Peer Paradigm 1.
P2P Search COP P2P Search Techniques Centralized P2P systems  e.g. Napster, Decentralized & unstructured P2P systems  e.g. Gnutella.
CSE 486/586 Distributed Systems Distributed Hash Tables
An example of peer-to-peer application
Introduction to BitTorrent
BitTorrent Vs Gnutella.
CSE 486/586 Distributed Systems Distributed Hash Tables
EE 122: Peer-to-Peer (P2P) Networks
Determining the Peer Resource Contributions in a P2P Contract
CPE 401/601 Computer Network Systems
Part 4: Peer to Peer - P2P Applications
The BitTorrent Protocol
Content Distribution Networks + P2P File Sharing
CPE 401/601 Computer Network Systems
CMPE 252A : Computer Networks
Pure P2P architecture no always-on server
CSE 486/586 Distributed Systems Distributed Hash Tables
Chapter 2 Application Layer
Content Distribution Networks + P2P File Sharing
Presentation transcript:

CSE 486/586 Distributed Systems Peer-to-Peer Architecture --- 1 Steve Ko Computer Sciences and Engineering University at Buffalo

Last Time Gossiping Multicast Failure detection

Today’s Question How do we organize the nodes in a distributed system? Up to the 90’s Prevalent architecture: client-server (or master-slave) Unequal responsibilities Now Emerged architecture: peer-to-peer Equal responsibilities Today: studying peer-to-peer as a paradigm (not just as a file-sharing application, but will still use file-sharing as the main example) Learn the techniques and principles

Motivation: Distributing a Large File A client-server architecture can do it… d1 F bits d2 d3 d4 upload rate us Download rates di Internet

Motivation: Distributing a Large File …but sometimes not good enough. Limited bandwidth One server can only serve so many clients. Increase the upload rate from the server-side? Higher link bandwidth at the one server Multiple servers, each with their own link Requires deploying more infrastructure Alternative: have the receivers help Receivers get a copy of the data And then redistribute the data to other receivers To reduce the burden on the server

Motivation: Distributing a Large File Peer-to-peer to help d1 F bits d2 d3 d4 upload rate us Download rates di Internet u1 u2 u3 u4 Upload rates ui

Challenges of Peer-to-Peer Peers come and go Peers are intermittently connected May come and go at any time Or come back with a different IP address How to locate the relevant peers? Peers that are online right now Peers that have the content you want How to motivate peers to stay in system? Why not leave as soon as download ends? Why bother uploading content to anyone else? How to download efficiently? The faster, the better

Locating Relevant Peers Evolution of peer-to-peer Central directory (Napster) Query flooding (Gnutella) Hierarchical overlay (Kazaa, modern Gnutella) Design goals Scalability Simplicity Robustness Plausible deniability

The First: Napster Filename Info about S S S P P P P P P Store a directory, i.e., filenames with peer pointers PennyLane.mp3 Beatles, @ 128.84.92.23:1006 ….. napster.com Servers S S S Client machines (“Peers”) P P P P Store their own files P P

The First: Napster S S S P P P P P P 2. All servers search their lists (ternary tree algo.) Store peer pointers for all files S S napster.com Servers S Peers P P 1. Query Store their own files 3. Response P P P P 4. ping candidates 5. download from best host

The First: Napster Server’s directory continually updated Always know what file is currently available Point of vulnerability for legal action Peer-to-peer file transfer No load on the server Plausible deniability for legal action (but not enough) Proprietary protocol Login, search, upload, download, and status operations No security: cleartext passwords and other vulnerability Bandwidth issues Suppliers ranked by apparent bandwidth & response time Limitations: Decentralized file transfer, but centralized lookup

The Second: Gnutella Complete decentralization P P P P P P P Store their own files Complete decentralization Servants (“Peers”) P P P Also store “peer pointers” P P P P Connected in an overlay graph (== each link is an implicit Internet path)

The Second: Gnutella P P P P TTL=2 P P P Query’s flooded out, ttl-restricted, forwarded only once P P P P TTL=2 P P P

The Second: Gnutella P P P P P P P Successful results QueryHit’s routed on reverse path P P P P P P P

The Second: Gnutella Advantages Disadvantages Fully decentralized Search cost distributed Processing per node permits powerful search semantics Disadvantages Search scope may be quite large Search time may be quite long High overhead, and nodes come and go often

The Third: KaAzA Middle ground between Napster & Gnutella Each peer is either a group leader (super peer) or assigned to a group leader TCP connection between peer and its group leader TCP connections between some pairs of group leaders Group leader tracks the content in all its children

The Third: KaZaA A supernode stores a directory listing (<filename,peer pointer>), similar to Napster servers Supernode membership changes over time Any peer can become (and stay) a supernode, provided it has earned enough reputation Kazaalite: participation level (=reputation) of a user between 0 and 1000, initially 10, then affected by length of periods of connectivity and total number of uploads More sophisticated reputation schemes invented, especially based on economics A peer searches by contacting a nearby supernode

CSE 486/586 Administrivia PA2-B is due on 3/15 (Friday). Right before Spring break Midterm is on 3/13 (Wednesday). PA2A grades are posted. We have recitations today.

Now: BitTorrent Key motivation: popular content Popularity exhibits temporal locality (Flash Crowds) E.g., Slashdot/Digg effect, CNN Web site on 9/11, release of a new movie or game Bram Cohen (the inventor) attended UB. Focused on efficient fetching, not searching Distribute same file to many peers Single publisher, many downloaders Preventing free-loading

Key Feature: Parallel Downloading Divide large file into many pieces Replicate different pieces on different peers A peer with a complete piece can trade with other peers Peer can (hopefully) assemble the entire file Allows simultaneous downloading Retrieving different parts of the file from different peers at the same time And uploading parts of the file to peers Important for very large files System Components Web server Tracker Peers

Tracker Infrastructure node Peers register with the tracker Keeps track of peers participating in the torrent Peers register with the tracker Peer registers when it arrives Peer periodically informs tracker it is still there Tracker selects peers for downloading Returns a random set of peers Including their IP addresses So the new peer knows who to contact for data Can be “trackerless” using DHT

Chunks Large file divided into smaller pieces Fixed-sized chunks Typical chunk size of 256 Kbytes Allows simultaneous transfers Downloading chunks from different neighbors Uploading chunks to other neighbors Learning what chunks your neighbors have Periodically asking them for a list File done when all chunks are downloaded

BitTorrent Protocol Tracker Web Server .torrent C A [Seed] Peer Web page with link to .torrent A B C Peer [Leech] Downloader “US” [Seed] Tracker Web Server .torrent

BitTorrent Protocol Tracker Web Server Get-announce C A [Seed] Peer Web page with link to .torrent A B C Peer [Leech] Downloader “US” [Seed] Tracker Get-announce Web Server

BitTorrent Protocol Tracker Web Server Response-peer list C A [Seed] Web page with link to .torrent A B C Peer [Leech] Downloader “US” [Seed] Tracker Response-peer list Web Server

BitTorrent Protocol Tracker Web Server Shake-hand C A [Seed] Peer Web page with link to .torrent A B C Peer [Leech] Downloader “US” [Seed] Tracker Shake-hand Web Server

BitTorrent Protocol Tracker Web Server C A pieces [Seed] Peer [Leech] Web page with link to .torrent A B C Peer [Leech] Downloader “US” [Seed] Tracker pieces Web Server

BitTorrent Protocol Tracker Web Server C A pieces [Seed] Peer [Leech] Web page with link to .torrent A B C Peer [Leech] Downloader “US” [Seed] Tracker pieces Web Server

BitTorrent Protocol Tracker Web Server Get-announce Response-peer list Web page with link to .torrent A B C Peer [Leech] Downloader “US” [Seed] Tracker Get-announce Response-peer list pieces Web Server

Chunk Request Order Which chunks to request? Could download in order Like an HTTP client does Problem: many peers have the early chunks Peers have little to share with each other Limiting the scalability of the system Problem: eventually nobody has rare chunks E.g., the chunks need the end of the file Limiting the ability to complete a download Solutions: random selection and rarest first

Rarest Chunk First Which chunks to request first? Benefits to the peer The chunk with the fewest available copies I.e., the rarest chunk first Benefits to the peer Avoid starvation when some peers depart Benefits to the system Avoid starvation across all peers wanting a file Balance load by equalizing # of copies of chunks

Preventing Free-Riding Vast majority of users are free-riders Most share no files and answer no queries Others limit # of connections or upload speed A few “peers” essentially act as servers A few individuals contributing to the public good Making them hubs that basically act as a server BitTorrent prevent free riding Allow the fastest peers to download from you Occasionally let some free loaders download

Preventing Free-Riding Peer has limited upload bandwidth And must share it among multiple peers Prioritizing the upload bandwidth: tit for tat Favor neighbors that are uploading at highest rate Rewarding the top four neighbors Measure download bit rates from each neighbor Reciprocates by sending to the top four peers Recompute and reallocate every 10 seconds Optimistic unchoking Randomly try a new neighbor every 30 seconds So new neighbor has a chance to be a better partner

Gaming BitTorrent BitTorrent can be gamed Peer uploads to top N peers at rate 1/N E.g., if N=4 and peers upload at 15, 12, 10, 9, 8, 3 … then peer uploading at rate 9 gets treated quite well Best to be the Nth peer in the list, rather than 1st Offer just a bit more bandwidth than the low-rate peers But not as much as the higher-rate peers And you’ll still be treated well by others BitTyrant software Uploads at higher rates to higher-bandwidth peers http://bittyrant.cs.washington.edu/

BitTorrent Today Significant fraction of Internet traffic Estimated at 30% Though this is hard to measure Problem of incomplete downloads Peers leave the system when done Many file downloads never complete Especially a problem for less popular content Still lots of legal questions remains Further need for incentives

Summary Evolution of peer-to-peer BitTorrent Central directory (Napster) Query flooding (Gnutella) Hierarchical overlay (Kazaa, modern Gnutella) BitTorrent Focuses on parallel download Prevents free-riding Next: Distributed Hash Tables

Acknowledgements These slides contain material developed and copyrighted by Indranil Gupta (UIUC), Michael Freedman (Princeton), and Jennifer Rexford (Princeton).