Peer-to-Peer (P2P) Networks

Peer-to-Peer (P2P) Networks Dr. Yingwu Zhu

Overview: Client/Server Architecture; P2P Architecture; Unstructured P2P Networks (Napster, Gnutella, KaZaA, Freenet, BitTorrent); Structured P2P Networks (Chord, Pastry, Tapestry, CAN), which won’t be covered here!

Client/Server Architecture: A well-known, powerful, reliable server is the data source. Clients request data from the server. A very successful model: WWW (HTTP), FTP, Web services, etc. As more clients are added, the demand on the server increases!

Client/Server Limitations Scalability is hard to achieve Presents a single point of failure Requires administration Unused resources at the network edge CPU cycles, storage, etc. P2P systems try to address these limitations

Why Study P2P? A huge fraction of traffic on networks today (>= 50%). Exciting new applications. The next level of resource sharing, after timesharing and client-server; e.g., access to 10s-100s of TB at low cost.

Users and Usage 60M users of file-sharing in US 8.5M logged in at a given time on average 814M units of music sold in US last year 140M digital tracks sold by music companies As of Nov, 35% of all Internet traffic was for BitTorrent, a single file-sharing system Major legal battles underway between recording industry and file-sharing companies

Share of Internet Traffic

Number of Users Others include BitTorrent, eDonkey, iMesh, Overnet, Gnutella BitTorrent (and others) gaining share from FastTrack (Kazaa).

P2P Computing P2P computing is the sharing of computer resources and services by direct exchange between systems. These resources and services include the exchange of information, processing cycles, cache storage, and disk storage for files. P2P computing takes advantage of existing computing power, computer storage and networking connectivity, allowing users to leverage their collective power to the “benefit” of all.

P2P Architecture: All nodes are both clients and servers: they provide and consume data. Any node can initiate a connection. No centralized data source. “The ultimate form of democracy on the Internet.” “The ultimate threat to copyright protection on the Internet.”

What is P2P? A distributed system architecture. No centralized control. Typically many nodes, but unreliable and heterogeneous. Nodes are symmetric in function. Takes advantage of distributed, shared resources (bandwidth, CPU, storage) on peer nodes. Fault-tolerant, self-organizing. Operates in a dynamic environment: frequent join and leave is the norm.

P2P Network Characteristics Clients are also servers and routers Nodes contribute content, storage, memory, CPU Nodes are autonomous (no administrative authority) Network is dynamic: nodes enter and leave the network “frequently” Nodes collaborate directly with each other (not through well-known servers) Nodes have widely varying capabilities

P2P vs. Client/Server. Pure P2P: No central server. For certain requests any peer can function as a client, as a router, or as a server. The information is not located in a central location but is distributed among all peers. A peer may need to communicate with multiple peers to locate a piece of information. As more peers are added, both the demand on and the capacity of the network increase!

P2P Benefits. Efficient use of resources: unused bandwidth, storage, and processing power at the edge of the network. Scalability: consumers of resources also donate resources; aggregate resources grow naturally with utilization. Reliability: replicas; geographic distribution; no single point of failure. Ease of administration: nodes self-organize; no need to deploy servers to satisfy demand (cf. scalability); built-in fault tolerance, replication, and load balancing.

P2P Traffic: P2P networks generate more traffic than any other Internet application: 2/3 of all bandwidth on some backbones.

P2P Data Flow CacheLogic P2P file format analysis (2005) Streamsight used for Layer-7 Deep Packet Inspection

Category of P2P Systems. Unstructured: no restriction on overlay structures and data placement; Napster, Gnutella, KaZaA, Freenet, BitTorrent. Structured: distributed hash tables (DHTs); place restrictions on overlay structures and data placement; Chord, Pastry, Tapestry, CAN.

Napster: Shares music files (MP3 data). Nodes register their contents (list of files) and IPs with the server. Centralized server for searches: the client sends queries to the centralized server for files of interest; keyword search (artist, song, album, bitrate, etc.). The Napster server replies with the IP addresses of users with matching files. File download is done on a peer-to-peer basis. Poor scalability. Single point of failure. Legal issues → shutdown. (Diagram: clients send a query to the server and receive a reply; the file transfer runs directly between clients.)

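Napster: Publish (Diagram: the peer at 123.2.21.23 publishes “I have X, Y, and Z!”; the server records insert(X, 123.2.21.23), ...)

Napster: Search (Diagram: a client sends the query “Where is file A?”; the server replies search(A) --> 123.2.0.18; the client fetches the file from 123.2.0.18.)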

Napster: Central Napster server: can ensure correct results; bottleneck for scalability; single point of failure; susceptible to denial of service (malicious users; lawsuits, legislation). Search is centralized. File transfer is direct (peer-to-peer).
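
The publish and search steps above can be sketched as a small in-memory directory. This is only an illustration of the centralized-index idea, not Napster's actual protocol; the class and method names (`CentralIndex`, `publish`, `search`, `unpublish`) are hypothetical.

```python
from collections import defaultdict


class CentralIndex:
    """Toy Napster-style directory: it stores who has what, never the files."""

    def __init__(self):
        self._owners = defaultdict(set)  # file name -> set of peer addresses

    def publish(self, peer_addr, file_names):
        # A peer registers its address together with its list of files.
        for name in file_names:
            self._owners[name].add(peer_addr)

    def search(self, file_name):
        # The server answers with addresses only; the download is peer to peer.
        return sorted(self._owners.get(file_name, set()))

    def unpublish(self, peer_addr):
        # Called when a peer leaves, so stale entries are not returned.
        for peers in self._owners.values():
            peers.discard(peer_addr)


index = CentralIndex()
index.publish("123.2.21.23", ["X", "Y", "Z"])
index.publish("123.2.0.18", ["A", "X"])
print(index.search("A"))
```

Every query goes through the one `index` object, which is exactly why the slide lists it as both a scalability bottleneck and a single point of failure.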

Gnutella: Query Flooding. Breadth-First Search (BFS). (Figure legend: source, forward query, processed, found result, forward response.)

Gnutella: Query Flooding. A node/peer connects to a set of Gnutella neighbors. Queries are forwarded to neighbors. The client which has the information responds. The network is flooded, with a TTL for termination. + Results are complete. – Bandwidth wastage.

Gnutella vs. Napster. Decentralized: no single point of failure; not as susceptible to denial of service; cannot ensure correct results. Flooding queries: search is now distributed but still not scalable.

Gnutella: Random Walk. Improved over query flooding. Same overlay structure as Gnutella. Forward the query to a random subset of its neighbors. + Reduced bandwidth requirements. – Incomplete results. – High latency.
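
A minimal sketch of TTL-limited flooding, assuming the overlay is given as an adjacency dictionary. The function name and the message-counting rule are illustrative assumptions; a real Gnutella node, for instance, would not send the query back to the neighbor it came from.

```python
from collections import deque


def flood_query(neighbors, has_file, source, ttl):
    """Breadth-first flood from `source`, stopping after `ttl` hops.

    neighbors: dict mapping each peer to a list of its overlay neighbors
    has_file:  set of peers that hold the requested file
    Returns (peers that answered, number of query messages sent).
    """
    seen = {source}
    hits = set()
    messages = 0
    frontier = deque([(source, ttl)])
    while frontier:
        peer, remaining = frontier.popleft()
        if peer in has_file and peer != source:
            hits.add(peer)
        if remaining == 0:
            continue  # TTL expired: do not forward any further
        for nxt in neighbors[peer]:
            messages += 1  # every forward costs bandwidth, even a duplicate
            if nxt not in seen:  # peers drop queries they already processed
                seen.add(nxt)
                frontier.append((nxt, remaining - 1))
    return hits, messages


overlay = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
print(flood_query(overlay, {"e"}, "a", 3))
```

In this toy overlay a TTL of 2 never reaches peer `e`, while a TTL of 3 finds it at the cost of more messages, which is the completeness-versus-bandwidth trade-off the slide points at.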
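
As a rough sketch of the random-walk variant, the function below forwards the query to a single randomly chosen neighbor per hop (a "random subset" of size one, which is a simplifying assumption). The function name and parameters are hypothetical.

```python
import random


def random_walk_query(neighbors, has_file, source, max_hops, rng):
    """Forward the query to one randomly chosen neighbor per hop.

    Returns (peer that answered or None, number of messages sent).
    """
    current = source
    for hop in range(1, max_hops + 1):
        current = rng.choice(neighbors[current])
        if current in has_file:
            return current, hop
    return None, max_hops  # gave up: the result set may be incomplete


ring = {"a": ["b"], "b": ["c"], "c": ["a"]}
print(random_walk_query(ring, {"c"}, "a", max_hops=5, rng=random.Random(0)))
```

Each hop costs exactly one message, so bandwidth stays low, but a walk that is cut off by `max_hops` returns nothing even if the file exists, and long walks add latency.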

KaZaA (FastTrack Networks): A hybrid of centralized Napster and decentralized Gnutella. Super-peers act as local search hubs: each super-peer is similar to a Napster server for a small portion of the network. Super-peers are automatically chosen by the system based on their capacities (storage, bandwidth, etc.) and availability (connection time). Users upload their list of files to a super-peer. Super-peers periodically exchange file lists. You send queries to a super-peer for files of interest. If the local super-peer cannot satisfy a query, it may flood the query to other super-peers. Exploits the heterogeneity of peer nodes.

KaZaA: Uses supernodes to improve scalability and establish a hierarchy (uptime, bandwidth). Closed-source. Uses HTTP to carry out downloads. Encrypted protocol; queuing, QoS.
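
The two-tier lookup can be sketched as follows. FastTrack itself is closed-source, so this is only an illustration of the super-peer idea described above; the `SuperPeer` class and its methods are hypothetical.

```python
class SuperPeer:
    """Toy super-peer: a small Napster-like index for its own ordinary peers."""

    def __init__(self, name):
        self.name = name
        self.index = {}         # file name -> set of ordinary peer addresses
        self.other_supers = []  # super-peers this one can forward queries to

    def register(self, peer_addr, file_names):
        # An ordinary peer uploads its file list to its super-peer.
        for name in file_names:
            self.index.setdefault(name, set()).add(peer_addr)

    def search(self, file_name, forward=True):
        local = self.index.get(file_name, set())
        if local or not forward:
            return set(local)
        # Not known locally: ask the other super-peers (one hop only here).
        found = set()
        for other in self.other_supers:
            found |= other.search(file_name, forward=False)
        return found


sp1, sp2 = SuperPeer("sp1"), SuperPeer("sp2")
sp1.other_supers = [sp2]
sp1.register("123.2.21.23", ["X"])
sp2.register("123.2.22.50", ["A"])
print(sp1.search("A"))
```

Only the super-peers take part in the forwarding step, so the flood touches a small, well-provisioned subset of the network instead of every node.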

KaZaA: Network Design “Super Nodes”

KaZaA: File Insert (Diagram: a peer publishes “I have X!” to its super node, which records insert(X, 123.2.21.23), ...)

KaZaA: File Search (Diagram: a client sends the query “Where is file A?”; the replies for search(A) list 123.2.0.18 and 123.2.22.50.)

Freenet: Data flows along the reverse path of the query. Impossible to know if a user is initiating or forwarding a query. Impossible to know if a user is consuming or forwarding data. “Smart” queries: requests get routed to the correct peer by incremental discovery.

BitTorrent: A popular P2P application for file exchange!

Problems to Address. Traditional client/server sharing: performance deteriorates rapidly as the number of clients increases. Free-riding in P2P networks: free riders only download without contributing to the network.

Basic Idea: Chop the file into many pieces. A piece is broken into sub-pieces, typically 16KB in size. Policy: until a piece is assembled, only download sub-pieces for that piece; this policy lets complete pieces assemble quickly. Replicate DIFFERENT pieces on different peers as soon as possible. As soon as a peer has a complete piece, it can trade it with other peers. Hopefully, we will be able to assemble the entire file at the end.

File Organization (Diagram: a file divided into pieces 1, 2, 3, 4, ...; a piece is 256KB and is made up of 16KB blocks; a piece with missing blocks is an incomplete piece.)
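
The piece and block arithmetic can be made concrete with a short sketch. The sizes are the typical values quoted in the slides; the function names are hypothetical.

```python
PIECE_SIZE = 256 * 1024  # 256 KB, the typical piece size quoted in the slides
BLOCK_SIZE = 16 * 1024   # 16 KB sub-piece (block)


def piece_count(file_size, piece_size=PIECE_SIZE):
    # Round up: the last piece is usually shorter than the others.
    return (file_size + piece_size - 1) // piece_size


def block_requests(file_size, piece_index, piece_size=PIECE_SIZE, block_size=BLOCK_SIZE):
    """Return (offset within piece, length) for every block of one piece."""
    start = piece_index * piece_size
    length = min(piece_size, file_size - start)  # last piece may be short
    return [(off, min(block_size, length - off)) for off in range(0, length, block_size)]


size = 600 * 1024  # a 600 KB file
print(piece_count(size), len(block_requests(size, 0)), block_requests(size, 2)[-1])
```

A full 256 KB piece yields 16 block requests of 16 KB each; only the final piece of a file can produce fewer or shorter blocks.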

Overall Architecture (Diagram sequence. Participants: a web server hosting a page with a link to the .torrent, a tracker, the downloader “US”, and peers A, B, and C, each marked as a seed or a leech. Steps shown: 1. the downloader fetches the .torrent from the web server; 2. the downloader sends a Get-announce to the tracker; 3. the tracker answers with a Response-peer list; 4. the downloader shakes hands with the peers; 5. pieces are exchanged with the peers; 6. Get-announce and Response-peer list exchanges with the tracker repeat while pieces continue to flow.)

Critical Elements 1: A web server, to provide the ‘metainfo’ file by HTTP. For example: http://bt.btchina.net, http://bt.ydy.com/. (Diagram: a web server hosting The Lord of Ring.torrent and Troy.torrent.)

Critical Elements 2: The .torrent file. A static ‘metainfo’ file that contains the necessary information: name; size; checksum; IP address (URL) of the tracker; pieces <hash1, hash2, …, hashn>; piece length. (Diagram: Matrix.torrent.)
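
To make the metainfo fields concrete, here is a minimal sketch that builds such a structure and serializes it with bencoding, the encoding used by .torrent files. It assumes SHA-1 piece hashes, as in the original BitTorrent metainfo format, and a single-file torrent; the helper names and the tracker URL are hypothetical.

```python
import hashlib


def bencode(value):
    """Encode ints, bytes/str, lists and dicts (with str keys) as bencoding."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, str):
        return bencode(value.encode("utf-8"))
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys must appear in sorted order.
        items = sorted((k.encode("utf-8"), v) for k, v in value.items())
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def make_metainfo(name, data, tracker_url, piece_length):
    # One 20-byte SHA-1 digest per piece, concatenated into a single string.
    pieces = b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )
    return {
        "announce": tracker_url,
        "info": {"name": name, "length": len(data),
                 "piece length": piece_length, "pieces": pieces},
    }


# Hypothetical tracker URL and a tiny piece length, for illustration only.
meta = make_metainfo("demo.bin", b"x" * 10, "http://tracker.example/announce", 4)
print(len(meta["info"]["pieces"]) // 20, "piece hashes")
```

The file itself never appears in the metainfo; a downloader only learns the tracker address and the hashes it needs to check each piece.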

Critical Elements 3: A BitTorrent tracker. A non-content-sharing node that tracks peers. For example: http://bt.cnxp.com:8080/announce, http://btfans.3322.org:6969/announce. Peer cache: IP, port, peer id. State information: completed, downloading. Returns a random list of peers.

Critical Elements 4: An end user (peer). Users who want to use BitTorrent must install the corresponding software or a web-browser plug-in. Downloader (leecher): a peer that has only a part (or none) of the file. Seeder: a peer that has the complete file and chooses to stay in the system to allow other peers to download.

Messages. Peer – Peer messages: TCP sockets. Peer – Tracker messages: HTTP request/response.

BitTorrent – joining a torrent. Peers are divided into seeds, which have the entire file, and leechers, which are still downloading. A new leecher: 1. obtains the metadata file (.torrent) from the website; 2. contacts the tracker to join; 3. obtains a peer list (contains seeds & leechers); 4. contacts peers from that list for data.
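
A tracker's job, as described above, fits in a few lines. The sketch below keeps an in-memory peer cache per torrent and returns a random subset of the other peers; it ignores the HTTP layer entirely, and the `Tracker` class, its `announce` signature, and the default list size are assumptions made for illustration.

```python
import random


class Tracker:
    """Toy tracker: keeps a peer cache per torrent and shares no content."""

    def __init__(self, rng=None):
        self._swarms = {}  # info_hash -> {peer_id: (ip, port, state)}
        self._rng = rng or random.Random()

    def announce(self, info_hash, peer_id, ip, port, completed=False, want=50):
        swarm = self._swarms.setdefault(info_hash, {})
        # Record (or refresh) the caller's entry and its state.
        swarm[peer_id] = (ip, port, "completed" if completed else "downloading")
        others = [(pid, info[0], info[1]) for pid, info in swarm.items() if pid != peer_id]
        # Return a random list so that load spreads across the swarm.
        return self._rng.sample(others, min(want, len(others)))


tracker = Tracker(random.Random(1))
tracker.announce("hash-of-info", "ID1", "169.237.234.1", 6881, completed=True)
print(tracker.announce("hash-of-info", "ID2", "190.50.34.6", 5692))
```

The tracker never sees file data; it only introduces peers to each other, which is why the slide calls it a non-content-sharing node.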

BitTorrent – exchanging data ● Verify pieces using hashes ● Download sub-pieces in parallel ● Advertise received pieces to the entire peer list ● Look for the rarest pieces (Diagram: leechers A, B, C and a seed; a leecher announces the piece it has just received.)

BitTorrent - unchoking ● Periodically calculate data-receiving rates ● Upload to (unchoke) the fastest downloaders ● Optimistic unchoking: periodically select a peer at random and upload to it; continuously look for the fastest partners (Diagram: leechers A, B, C, D and a seed.)

Demo (Diagram: the user fetches MYFILE.torrent from the web server with an HTTP GET; MYFILE.torrent contains the tracker URL http://mytracker.com:6969/ followed by hash strings S3F5YHG6FEB, FG5467HGF367, F456JI9N5FF4E, …; the user “registers” with the tracker and receives a list of peers: ID1 169.237.234.1:6881, ID2 190.50.34.6:5692, ID3 34.275.89.143:4545, …, ID50 231.456.31.95:6882; the user then connects to Peer 1, Peer 2, …, Peer 40.)

Swarming: Pieces and Sub-pieces. A piece, typically 256KB, is broken into 16KB sub-pieces. Until a piece is assembled, only sub-pieces for that piece are downloaded. This ensures that complete pieces assemble quickly. When transferring data over TCP, it is critical to always have several requests pending at once, to avoid a delay between pieces being sent. At any point in time, some number of requests, typically 5, are pending simultaneously. On piece completion, notify all (neighbor) peers.
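
The request pipelining described above, together with the hash check on a finished piece, can be sketched like this. The pipeline depth of 5 and the 16 KB block size come from the slide; the `PieceDownload` class is a hypothetical illustration and assumes SHA-1 piece hashes.

```python
import hashlib


class PieceDownload:
    """Tracks the sub-pieces (blocks) of one piece while it is assembled."""

    def __init__(self, length, expected_sha1, block_size=16 * 1024, pipeline_depth=5):
        self.blocks = {off: None for off in range(0, length, block_size)}
        self.expected = expected_sha1
        self.depth = pipeline_depth
        self.pending = set()  # offsets requested but not yet received

    def next_requests(self):
        """Top up the outstanding requests to `pipeline_depth`."""
        new = []
        for off, data in self.blocks.items():
            if len(self.pending) >= self.depth:
                break
            if data is None and off not in self.pending:
                self.pending.add(off)
                new.append(off)
        return new

    def receive(self, offset, data):
        self.pending.discard(offset)
        self.blocks[offset] = data

    def complete(self):
        return all(d is not None for d in self.blocks.values())

    def verified(self):
        # Only a complete piece can be checked against its hash.
        if not self.complete():
            return False
        joined = b"".join(self.blocks[o] for o in sorted(self.blocks))
        return hashlib.sha1(joined).digest() == self.expected


data = bytes(range(256)) * 4  # a 1024-byte "piece" with 128-byte blocks
dl = PieceDownload(len(data), hashlib.sha1(data).digest(), block_size=128)
print(dl.next_requests())
```

Calling `next_requests()` again after each `receive()` keeps five requests in flight, so the sending peer always has something queued and the TCP connection does not idle between blocks.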

Piece Selection. The order in which pieces are downloaded is very important for good performance. A bad algorithm could result in all peers waiting for the same missing piece. Random First Piece policy: initially a peer has no pieces to trade, so it is important to get a piece ASAP. Policy: the peer starts with a random piece to download. Rarest Piece First policy: download the pieces which are rarest among your peers. This ensures that the most common pieces are left for last.

Rarest First Policy (Diagram: peers advertise the pieces they hold with HAVE messages: HAVE <12,7,36>, HAVE <12,7,14>, HAVE <14>.)

End Game mode. When all the sub-pieces that a peer doesn’t have have been requested, a request for each is sent to every peer. When a sub-piece arrives, the duplicate requests are canceled. This ensures that completion is not held up by a slow peer.

Tit-for-Tat Strategy. “Give and ye shall receive.” Cooperate if the other peer cooperates. Choking mechanism: choke all peers except the top 4 uploaders. Optimistic unchoke for eventual cooperation and recovery.

Tit-for-Tat (Diagram: a peer first unchokes Peer 1 and chokes Peer 2; after Peer 1's upload becomes slow, Peer 1 is choked and Peer 2 is unchoked.)

Choking. Ensures every node cooperates and prevents the free-riding problem. The goal is to have several bidirectional connections running continuously. Choking is a temporary refusal to upload; downloading occurs as normal. The connection is kept open so that setup costs are not borne again and again. At a given time only the 4 best peers are unchoked. Evaluation of whom to choke/unchoke is performed every 10 seconds. Optimistic unchoke every 30 seconds: gives a newly joined peer a chance to get data to download.

Choking Algorithm. The goal is to have several bidirectional connections running continuously. Upload to peers who have uploaded to you recently. Unutilized connections are uploaded to on a trial basis to see if better transfer rates could be found using them.
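
The two selection policies can be combined in one small function. This is a sketch under the assumption that each peer's advertised pieces are known as a set; the function name `pick_piece` and the random tie-breaking are illustrative choices.

```python
import random
from collections import Counter


def pick_piece(my_pieces, peer_pieces, rng):
    """Choose the next piece index to request, or None if nothing is useful.

    my_pieces:   set of piece indices already held
    peer_pieces: dict mapping peer id -> set of piece indices that peer advertised
    """
    counts = Counter()
    for pieces in peer_pieces.values():
        counts.update(pieces - my_pieces)  # only count pieces we still need
    if not counts:
        return None
    if not my_pieces:
        # Random first piece: get something tradable as fast as possible.
        return rng.choice(sorted(counts))
    rarest = min(counts.values())
    # Break ties at random so peers do not all chase the same piece.
    return rng.choice(sorted(p for p, c in counts.items() if c == rarest))


peers = {"p1": {12, 7, 36}, "p2": {12, 7, 14}, "p3": {14}}
print(pick_piece({7}, peers, random.Random(0)))
```

With piece 7 already in hand, piece 36 is chosen because only one peer offers it; the common pieces 12 and 14 can wait.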
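
End game mode amounts to duplicating the last few requests and cleaning up afterwards. A minimal sketch, with hypothetical function names and blocks identified by (piece index, offset) pairs:

```python
def endgame_requests(missing_blocks, peers_with_piece):
    """In end game mode, ask every peer for every block still missing."""
    return {block: set(peers_with_piece) for block in missing_blocks}


def on_block_received(block, from_peer, outstanding):
    """Return the peers that should be sent a cancel for `block`."""
    return sorted(outstanding.pop(block, set()) - {from_peer})


outstanding = endgame_requests([(3, 0), (3, 16384)], ["A", "B", "C"])
print(on_block_received((3, 0), "B", outstanding))
```

The duplicate requests waste a little bandwidth, but only for the final handful of blocks, and the cancels keep that waste short-lived.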

Choking Specifics. A peer always unchokes a fixed number of its peers (default of 4). The decision to choke/unchoke is based on current download rates, which are evaluated on a rolling 20-second average. Evaluation of whom to choke/unchoke is performed every 10 seconds; this prevents wastage of resources by rapidly choking/unchoking peers, and is supposedly enough for TCP to ramp up transfers to their full capacity. Which peer is the optimistic unchoke is rotated every 30 seconds.

Anti-Snubbing. Policy: when over a minute has gone by without receiving a single sub-piece from a particular peer, do not upload to it except as an optimistic unchoke. A peer might find itself being simultaneously choked by all the peers it was just downloading from; the download will lag until an optimistic unchoke finds better peers. Policy: if choked by everyone, increase the number of simultaneous optimistic unchokes to more than one.

Upload-only or Seeding mode. Once the download is complete, a peer has no download rates to compare, nor does it require them. Which nodes should it upload to? Policy: upload to the top 4 peers with the maximum upload rate. This ensures faster replication.

Structured P2P Networks. Distributed Hash Tables (DHTs): put(k,v) and v=get(k) interface. Representative systems: Chord, Pastry, Tapestry, and CAN.
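
One rechoke round can be sketched as below. The slot count of 4 is from the slide; treating the optimistic unchoke as an extra slot on top of those 4 is an assumption made here for simplicity, and the function names are hypothetical. The rates passed in stand for the rolling 20-second averages.

```python
import random


def choose_unchoked(download_rates, optimistic=None, slots=4):
    """Peers to unchoke for the next 10-second round.

    download_rates: dict peer -> rate at which we receive data from that peer
    optimistic:     the current optimistic-unchoke peer, or None
    """
    # Highest rate first; the peer name breaks ties deterministically.
    ranked = sorted(download_rates, key=lambda p: (-download_rates[p], p))
    unchoked = set(ranked[:slots])
    if optimistic is not None:
        unchoked.add(optimistic)  # assumption: counted as an additional slot
    return unchoked


def rotate_optimistic(peers, unchoked, rng):
    """Every 30 seconds, pick a currently choked peer at random."""
    candidates = sorted(set(peers) - set(unchoked))
    return rng.choice(candidates) if candidates else None


rates = {"A": 50, "B": 10, "C": 40, "D": 30, "E": 20, "F": 0}
regular = choose_unchoked(rates)
print(sorted(regular), rotate_optimistic(rates, regular, random.Random(0)))
```

A client would call `choose_unchoked` every 10 seconds and `rotate_optimistic` on every third call, which gives a newcomer such as `F` a chance to prove it can upload.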
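
Both anti-snubbing policies reduce to simple checks. In this sketch the 60-second timeout follows the slide's "over a minute"; the boosted count of 2 is an assumption (the slide only says "more than one"), and the function names are hypothetical.

```python
SNUB_TIMEOUT = 60.0  # seconds; "over a minute" in the slide


def snubbed_peers(last_block_time, now, timeout=SNUB_TIMEOUT):
    """Peers that have sent us no sub-piece for longer than `timeout`."""
    return {p for p, t in last_block_time.items() if now - t > timeout}


def optimistic_slots(choked_by, downloading_from, base=1, boosted=2):
    """Use more than one optimistic unchoke when every source has choked us."""
    all_choked = bool(downloading_from) and downloading_from <= choked_by
    return boosted if all_choked else base


print(snubbed_peers({"A": 0.0, "B": 50.0}, now=61.0))
print(optimistic_slots(choked_by={"A", "B"}, downloading_from={"A", "B"}))
```

Snubbed peers are excluded from the regular unchoke slots, while the extra optimistic slots speed up the search for new partners when all current sources have dried up.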
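
The put/get interface can be illustrated with a toy, single-process hash ring. This is not Chord, Pastry, Tapestry, or CAN: there is no routing, joining, or replication, and the class name, the 16-bit identifier space, and the successor rule are assumptions chosen only to show how a key deterministically maps to a node.

```python
import bisect
import hashlib


def ring_id(text, bits=16):
    # Hash names and keys into the same small identifier space.
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


class ToyDHT:
    def __init__(self, node_names):
        self._ring = sorted((ring_id(n), n) for n in node_names)
        self._store = {n: {} for n in node_names}

    def owner(self, key):
        # The key belongs to the first node at or after its id, wrapping around.
        ids = [node_id for node_id, _ in self._ring]
        pos = bisect.bisect_left(ids, ring_id(key)) % len(self._ring)
        return self._ring[pos][1]

    def put(self, key, value):
        self._store[self.owner(key)][key] = value

    def get(self, key):
        return self._store[self.owner(key)].get(key)


dht = ToyDHT(["n1", "n2", "n3"])
dht.put("song.mp3", "123.2.0.18")
print(dht.owner("song.mp3"), dht.get("song.mp3"))
```

Because placement follows from the hash alone, a lookup goes straight to the responsible node instead of flooding, which is the restriction on data placement that distinguishes structured from unstructured networks.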