BitTorrent Vs Gnutella.

1 BitTorrent Vs Gnutella

2 Presentation Overview
Gnutella. BitTorrent. Code Review.

3 P2P: Peer-to-peer networks
Assigned reading:
[Cla00] Freenet: A Distributed Anonymous Information Storage and Retrieval System
[S+01] Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications

4 Overview P2P Lookup Overview Centralized/Flooded Lookups

5 Peer-to-Peer Networks
Typically each member stores/provides access to content.
Has quickly grown in popularity. Bulk of traffic from/to CMU is Kazaa!
Basically a replication system for files.
Always a tradeoff between possible location of files and searching difficulty.
Peer-to-peer allows files to be anywhere, so searching is the challenge.
Dynamic member list makes it more difficult.
What other systems have similar goals? Routing, DNS.

6 Centralized Lookup (Napster)
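[Diagram: a client, a central DB, and nodes N3, N4, N6, N7, N8, N9. N4 holds Key=“title”, Value=MP3 data… and registers SetLoc(“title”, N4) with the DB; the client sends Lookup(“title”) to the DB.]
Simple, but O(N) state and a single point of failure.
O(N) state means it's hard to keep the state up to date.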
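The SetLoc/Lookup exchange on this slide can be sketched as a small in-memory index. This is only an illustration, not Napster's actual implementation; the class name `CentralIndex` and its methods are hypothetical names chosen here.

```python
from collections import defaultdict


class CentralIndex:
    """Toy stand-in for the central DB: maps a title to the peers that hold it."""

    def __init__(self):
        self._locations = defaultdict(set)

    def set_loc(self, title, node):
        # A peer (e.g. "N4") reports that it stores `title`.
        self._locations[title].add(node)

    def remove_node(self, node):
        # Keeping the O(N) state fresh: every departure must be reflected here.
        for nodes in self._locations.values():
            nodes.discard(node)

    def lookup(self, title):
        # Returns the peers to fetch from; the file itself never passes
        # through the index.
        return sorted(self._locations.get(title, ()))


index = CentralIndex()
index.set_loc("title", "N4")
print(index.lookup("title"))  # ['N4']
index.remove_node("N4")
print(index.lookup("title"))  # []
```

If the single `CentralIndex` instance is unavailable, no lookup can succeed, which is the single point of failure the slide mentions.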

7 Flooded Queries (Gnutella)
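[Diagram: a client and nodes N3, N4, N6, N7, N8, N9. The client's Lookup(“title”) is flooded from node to node until it reaches the node holding Key=“title”, Value=MP3 data…]
Robust, but worst case O(N) messages per lookup.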

8 Routed Queries (Freenet, Chord, etc.)
[Diagram: a client, a publisher holding Key=“title”, Value=MP3 data…, and nodes N4, N6, N7, N8, N9; the client issues Lookup(“title”).]
Consistent rendezvous point between publisher and client.
Challenge: can we make it robust? Small state? Actually find stuff in a changing system?

9 Overview P2P Lookup Overview Centralized/Flooded Lookups

10 Centralized: Napster
Simple centralized scheme, motivated by ability to sell/control.
How to find a file:
On startup, client contacts central server and reports list of files.
Query the index system, which returns a machine that stores the required file. Ideally this is the closest/least-loaded machine.
Fetch the file directly from peer.

11 Centralized: Napster
Advantages: Simple. Easy to implement sophisticated search engines on top of the index system.
Disadvantages: Robustness, scalability. Easy to sue!

12 Gnutella
What is Gnutella? How does it work? Its positives and negatives.

13 What is Gnutella?
Gnutella uses a distributed search protocol.
Each node in a Gnutella network acts as both a client and a server.
Peer-to-peer, decentralized model for file sharing.
Any type of file can be shared.
Nodes are called “Servents”.

14 What do Servents do? Servents “know” about other Servents.
Act as interfaces through which users can issue queries and view search results. Communicate with other Servents by sending “descriptors”.

15 Descriptors
Each descriptor consists of a header and a body.
The header includes (among other things): a descriptor ID number and a Time-To-Live number.
The body includes, depending on the descriptor: port information, IP addresses, query information, etc.

16 Header Fields
Field               Num. of bytes
Descriptor ID       16
Payload Descriptor  1
TTL                 1
Hops                1
Payload Length      4
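The header layout above maps naturally onto a fixed-size binary record. The following is a minimal sketch using Python's `struct` module; the little-endian byte order for the payload length is an assumption here (it should be checked against the protocol specification), and the helper names `pack_header` and `unpack_header` are hypothetical.

```python
import struct
import uuid

# 16-byte descriptor ID, 1-byte payload descriptor, 1-byte TTL,
# 1-byte hops, 4-byte payload length. "<" (little-endian, no padding)
# is an assumption about the wire byte order.
HEADER = struct.Struct("<16sBBBI")

# Payload descriptor codes as commonly listed for Gnutella 0.4.
PING, PONG, QUERY, QUERY_HIT = 0x00, 0x01, 0x80, 0x81


def pack_header(descriptor_id, payload_type, ttl, hops, payload_length):
    """Serialize the five header fields into 23 bytes."""
    return HEADER.pack(descriptor_id, payload_type, ttl, hops, payload_length)


def unpack_header(data):
    """Parse the first 23 bytes of `data` into a dict of header fields."""
    descriptor_id, payload_type, ttl, hops, length = HEADER.unpack(data[:HEADER.size])
    return {
        "descriptor_id": descriptor_id,
        "payload_type": payload_type,
        "ttl": ttl,
        "hops": hops,
        "payload_length": length,
    }


raw = pack_header(uuid.uuid4().bytes, QUERY, ttl=7, hops=0, payload_length=12)
print(len(raw))                       # 23
print(unpack_header(raw)["ttl"])      # 7
```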

17 Body Fields
Pong:
Field                       Num. of bytes
Port                        2
IP Address                  4
Number of Files Shared      4
Number of Kilobytes Shared  4
Optional Pong Data          L-14
Query:
Field                       Byte offset
Minimum Speed               0...1
Search Criteria String      2...N
NUL (0x00) Terminator       N+1
(Optional) Query Data       N+2...L-1
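To make the Query byte offsets concrete, here is a small sketch that builds and parses a Query payload: two bytes of minimum speed, the search string, a NUL terminator, then any optional data. The little-endian encoding of the speed field and the ASCII encoding of the search string are assumptions, and `pack_query` / `unpack_query` are hypothetical helper names.

```python
import struct


def pack_query(min_speed, criteria, extra=b""):
    """Bytes 0..1: minimum speed; 2..N: search string; N+1: NUL; rest: optional data."""
    return struct.pack("<H", min_speed) + criteria.encode("ascii") + b"\x00" + extra


def unpack_query(payload):
    """Split a Query payload back into (min_speed, criteria, optional_data)."""
    (min_speed,) = struct.unpack_from("<H", payload, 0)
    end = payload.index(b"\x00", 2)          # position N+1, the NUL terminator
    criteria = payload[2:end].decode("ascii")
    return min_speed, criteria, payload[end + 1:]


payload = pack_query(56, "title mp3")
print(unpack_query(payload))  # (56, 'title mp3', b'')
```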

18 Routing
Node forwards Ping and Query descriptors to all nodes connected to it, except when:
The descriptor’s TTL has been decremented to 0.
The descriptor has already been received before.
Loop detection is done by storing descriptor IDs.
Pong and QueryHit descriptors retrace the exact path of their respective Ping and Query descriptors.
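The routing rules above can be simulated in a few lines. This is a simplified, single-process sketch, not a real servent: the `Servent` class and its methods are hypothetical, forwarding is done by recursive calls rather than network messages, and each hit is represented by the reverse path a QueryHit would retrace.

```python
class Servent:
    def __init__(self, name):
        self.name = name
        self.neighbors = []
        self.files = set()
        self.seen = {}  # descriptor ID -> neighbor the descriptor first came from

    def connect(self, other):
        self.neighbors.append(other)
        other.neighbors.append(self)

    def query(self, descriptor_id, title, ttl):
        """Originate a Query and collect the reverse paths of all hits."""
        self.seen[descriptor_id] = None  # None marks the originator
        hits = []
        for neighbor in self.neighbors:
            neighbor.receive_query(descriptor_id, title, ttl, self, hits)
        return hits

    def receive_query(self, descriptor_id, title, ttl, sender, hits):
        if descriptor_id in self.seen:
            return  # already received: drop it (loop detection)
        self.seen[descriptor_id] = sender
        if title in self.files:
            hits.append(self._reverse_path(descriptor_id))
        ttl -= 1
        if ttl <= 0:
            return  # TTL decremented to 0: do not forward
        for neighbor in self.neighbors:
            if neighbor is not sender:
                neighbor.receive_query(descriptor_id, title, ttl, self, hits)

    def _reverse_path(self, descriptor_id):
        # A QueryHit retraces the path the Query took, hop by hop.
        path, node = [self.name], self.seen[descriptor_id]
        while node is not None:
            path.append(node.name)
            node = node.seen[descriptor_id]
        return path


a, b, c, d = (Servent(n) for n in "ABCD")
a.connect(b)
b.connect(c)
a.connect(d)
c.files.add("title")
print(a.query("q1", "title", ttl=7))  # [['C', 'B', 'A']]
```

With `ttl=1` the same query stops at A's direct neighbors and never reaches C, which shows how the TTL bounds the search horizon.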

19 Routing2
[Diagram: servents A, B, C and D, with two Query hops and a QueryHit returned along the reverse path.]
Note: Ping works essentially the same way, except that a Pong is sent as the response.

20 Joining a Gnutella Network
Servent connects to the network using a TCP/IP connection to another servent.
It could connect to a friend or acquaintance, or to an address obtained from a “Host-Cache”.
It then sends a Ping descriptor to the network.
Hopefully, a number of Pongs are received.

21 Querying
Servent sends a Query descriptor to the nodes it is connected to.
Queried Servents check to see if they have the file.
If a query match is found, a QueryHit is sent back to the querying node.

22 Downloading a File
File data is never transferred over the Gnutella network; data is transferred by direct connection.
Once a servent receives a QueryHit descriptor, it may initiate the direct download of one of the files described by the descriptor’s Result Set.
The file download protocol is HTTP. Example:
GET /get/<File Index>/<File Name>/ HTTP/1.0\r\n
Connection: Keep-Alive\r\n
Range: bytes=0-\r\n
User-Agent: Gnutella\r\n
\r\n
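The download request shown on this slide is plain HTTP text, so it can be assembled with simple string formatting. A minimal sketch follows; `build_download_request` is a hypothetical helper, and the `start` parameter (used to resume from a byte offset via the Range header) is an illustrative addition.

```python
def build_download_request(file_index, file_name, start=0):
    """Return the HTTP request a servent would send to the file's host."""
    return (
        f"GET /get/{file_index}/{file_name}/ HTTP/1.0\r\n"
        "Connection: Keep-Alive\r\n"
        f"Range: bytes={start}-\r\n"
        "User-Agent: Gnutella\r\n"
        "\r\n"  # blank line terminates the HTTP header block
    )


print(build_download_request(2468, "Foobar.mp3"))
```

The file index and file name would come from the Result Set of the QueryHit; the values used here are made up.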

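23 Direct File Download
[Diagram: servents A, B and C; Query and QueryHit descriptors are routed through the network, while the download uses a direct TCP/IP connection.]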

24 Overall
Simple protocol. Not a lot of overhead for routing.
Robustness? No central point of failure. However, a file is only available as long as the file-provider is online, and the network is vulnerable to denial-of-service attacks.

25 Overall Cont.
Scales poorly: querying and pinging generate a lot of unnecessary traffic.
Example: if TTL = 10 and each site contacts six other sites, up to 6^10 (about 6 × 10^7) messages could be generated in the final hop alone.
On a slow day, a GnutellaNet would have to move 2.4 gigabytes per second in order to support numbers of users comparable to Napster. On a heavy day, 8 gigabytes per second (Ritter article).
Heavy messaging can result in poor performance.
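As a rough check on the TTL example, the worst case can be computed by summing the fan-out at every hop. This sketch assumes that every contacted site is new and forwards to six further sites, which overstates real traffic because duplicates are dropped; `flood_upper_bound` is a hypothetical helper name.

```python
def flood_upper_bound(fanout, ttl):
    """Upper bound on messages: fanout + fanout^2 + ... + fanout^ttl."""
    return sum(fanout ** hop for hop in range(1, ttl + 1))


print(6 ** 10)                    # 60466176 messages in the final hop alone
print(flood_upper_bound(6, 10))   # 72559410 messages over all ten hops
```

Under these assumptions a single query can generate tens of millions of messages, which is why flooding is described as scaling poorly.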

26 The BitTorrent protocol
A peer-to-peer file sharing protocol.

27 Philosophy
Author: Bram Cohen.
Based on tit-for-tat.
Incentive: uploading while downloading.
Pieces of files.

28 BitTorrent HTTP TCP

29 Overall Architecture
[Diagram: web server hosting a web page with a link to the .torrent, a tracker, peer A [Leech], downloader B (“US”) and peer C [Seed]. Step highlighted: .torrent]

30 Overall Architecture
[Same diagram. Step highlighted: Get-announce]

31 Overall Architecture
[Same diagram. Step highlighted: Response-peer list]

32 Overall Architecture
[Same diagram. Step highlighted: Hand-shake]

33 Overall Architecture
[Same diagram. Step highlighted: pieces]

34 Overall Architecture
[Same diagram. Step highlighted: pieces]

35 Overall Architecture
[Same diagram. Steps highlighted: Get-announce, Response-peer list, pieces]

36 Messages
Peer-Peer messages: TCP sockets.
Peer-Tracker messages: HTTP request/response.
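Because peer-tracker messages are ordinary HTTP requests, the announce can be sketched as a URL with query parameters. The parameter names below (`info_hash`, `peer_id`, `port`, `uploaded`, `downloaded`, `left`) follow the published BitTorrent protocol description; the tracker URL, the peer ID and the helper name `build_announce_url` are made-up examples.

```python
from urllib.parse import urlencode


def build_announce_url(tracker_url, info_hash, peer_id, port,
                       uploaded=0, downloaded=0, left=0):
    """Build the GET URL a peer would request from the tracker."""
    params = {
        "info_hash": info_hash,   # 20-byte SHA-1 of the torrent's info section
        "peer_id": peer_id,       # 20-byte ID chosen by the client
        "port": port,             # port this peer listens on
        "uploaded": uploaded,
        "downloaded": downloaded,
        "left": left,             # bytes still needed
    }
    # urlencode percent-escapes the raw bytes values.
    return tracker_url + "?" + urlencode(params)


url = build_announce_url(
    "http://tracker.example.com/announce",  # hypothetical tracker
    info_hash=b"\x12" * 20,
    peer_id=b"-XX0001-abcdefghijkl",
    port=6881,
    left=1048576,
)
print(url)
```

The tracker's HTTP response then carries the peer list that the downloader uses for the hand-shake step.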

37 .torrent
URL of the tracker.
Pieces <hash1, hash2, …, hashn>.
Piece length.
Name.
Length.
Files.
Path.
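The fields listed above are stored in the .torrent file using the "bencode" encoding. Below is a minimal encoder sketch together with a toy single-file metainfo dictionary; the tracker URL, file name and sizes are invented for illustration, and the `bencode` function here is a teaching sketch rather than a complete library.

```python
import hashlib


def bencode(value):
    """Encode ints, strings/bytes, lists and dicts in bencode form."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])  # keys must appear in sorted order
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


data = b"x" * 40000                      # pretend file contents
piece_length = 16384
pieces = b"".join(
    hashlib.sha1(data[i:i + piece_length]).digest()
    for i in range(0, len(data), piece_length)
)
metainfo = {
    "announce": "http://tracker.example.com/announce",  # URL of the tracker
    "info": {
        "name": "example.bin",
        "length": len(data),
        "piece length": piece_length,
        "pieces": pieces,                # concatenated 20-byte SHA-1 hashes
    },
}
torrent_bytes = bencode(metainfo)
info_hash = hashlib.sha1(bencode(metainfo["info"])).digest()
print(len(pieces) // 20, len(info_hash))  # 3 20
```

For multi-file torrents the `length` entry is replaced by a `files` list whose items each carry a `length` and a `path`, matching the last two fields on the slide.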

38 Tracker
Peer cache: IP, port, peer id.
State information: completed, downloading.
Returns a random list of peers.
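The tracker's role, as described on this slide, can be sketched as a small in-memory peer cache that hands back a random subset of the other peers. The `Tracker` class below is hypothetical, and the default of 50 peers per response is an assumption rather than something stated in the slides.

```python
import random


class Tracker:
    def __init__(self, max_peers=50):
        self.max_peers = max_peers
        self.peers = {}  # peer id -> {"ip", "port", "completed"}

    def announce(self, peer_id, ip, port, completed=False):
        """Record the caller's state and return a random list of other peers."""
        self.peers[peer_id] = {"ip": ip, "port": port, "completed": completed}
        others = [(p["ip"], p["port"])
                  for pid, p in self.peers.items() if pid != peer_id]
        return random.sample(others, min(self.max_peers, len(others)))


tracker = Tracker(max_peers=2)
tracker.announce("A", "10.0.0.1", 6881)
tracker.announce("C", "10.0.0.3", 6881, completed=True)   # a seed
print(tracker.announce("B", "10.0.0.2", 6881))            # A and C, in random order
```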

40 Peer Operation
Choking algorithm: Choke/Unchoke.
Preferred peers.
Upload to interested peers who are not choking.
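One way to picture the choke/unchoke decision is as a ranking of interested peers by how fast they have been sending data to us, with only the top few left unchoked. This is a simplified sketch of the tit-for-tat idea, not the reference client's algorithm: the four upload slots, the dictionary fields and the name `choose_unchoked` are assumptions, and periodic "optimistic" unchoking of a random peer is left out.

```python
def choose_unchoked(peers, slots=4):
    """Return the IDs of the peers to unchoke; everyone else stays choked."""
    interested = [p for p in peers if p["interested"]]
    # Prefer peers that have given us the best download rate (tit-for-tat).
    ranked = sorted(interested, key=lambda p: p["download_rate"], reverse=True)
    return [p["id"] for p in ranked[:slots]]


peers = [
    {"id": "A", "interested": True, "download_rate": 120.0},
    {"id": "B", "interested": False, "download_rate": 900.0},
    {"id": "C", "interested": True, "download_rate": 450.0},
    {"id": "D", "interested": True, "download_rate": 10.0},
]
print(choose_unchoked(peers, slots=2))  # ['C', 'A']
```

Peer B is fast but not interested, so it does not take up an upload slot.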

41 Peer Operation
Verify on receiving complete piece.
Endgame behavior.
Cancel.
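Piece verification can be sketched with `hashlib`: the .torrent's pieces field is a concatenation of 20-byte SHA-1 hashes, and a received piece is accepted only if its hash matches the entry at its index. The helper names `split_piece_hashes` and `verify_piece` are hypothetical.

```python
import hashlib


def split_piece_hashes(pieces):
    """Split the concatenated hash string into one 20-byte hash per piece."""
    return [pieces[i:i + 20] for i in range(0, len(pieces), 20)]


def verify_piece(index, data, piece_hashes):
    """True if the downloaded piece matches the hash published in the .torrent."""
    return hashlib.sha1(data).digest() == piece_hashes[index]


good = [b"first piece", b"second piece"]
hashes = split_piece_hashes(b"".join(hashlib.sha1(p).digest() for p in good))
print(verify_piece(0, b"first piece", hashes))      # True
print(verify_piece(1, b"corrupted piece", hashes))  # False
```

A piece that fails the check would be discarded and requested again, typically from a different peer.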

42 Strengths
Better bandwidth utilization.
Never-before-seen speeds: up to 7 MB/s from the Internet.
Limits free riding: tit-for-tat.
Limits leech attacks: coupling upload & download.
Ability to resume a download.

43 Drawbacks
Small files: latency, overhead.
Scalability: millions of peers; tracker behavior (uses 1/1000 of bandwidth); single point of failure.
Robustness: system progress dependent on altruistic nature of seeds (and peers); malicious attacks and leeches.

44 Code Language
BitTorrent: Python. Gnutella: C++/Java.
Both of them are open source. Easy to find the open source but impossible to find green.

