Peer-to-Peer Systems
- Application-layer architectures
- Case study: BitTorrent
- P2P search and Distributed Hash Tables (DHT)


Slide 2: Some network apps
- e-mail
- web
- instant messaging
- remote login
- P2P file sharing
- multi-user network games
- streaming stored video clips
- voice over IP
- real-time video conferencing
- video on demand (e.g., YouTube)
- Facebook
- Twitter
- cloud services
- ...

Slide 3: Application architectures
- client-server
- peer-to-peer (P2P)
- hybrid of client-server and P2P

Slide 4: Client-server architecture
Server:
- always-on host
- permanent IP address
- server farms for scaling
Clients:
- communicate with the server (and speak first)
- may have dynamic IP addresses
- do not communicate directly with each other

Slide 5: Pure P2P architecture
- no always-on server
- arbitrary end systems communicate directly
- peers are intermittently connected and change IP addresses
- highly scalable, but difficult to manage

Slide 6: Hybrid of client-server and P2P
Skype (voice-over-IP P2P application):
- centralized server: finds the address of the remote party
- client-client connection: direct (not through the server)
Instant messaging:
- chatting between two users is (or can be) P2P
- centralized service: client presence detection/location
  - a user registers its IP address with the central server when it comes online
  - a user contacts the central server to find the IP addresses of buddies

Slide 7: Internet apps: application and transport protocols

Application             | Application-layer protocol            | Underlying transport protocol
e-mail                  | SMTP [RFC 2821]                       | TCP
remote terminal access  | Telnet [RFC 854]                      | TCP
Web                     | HTTP [RFC 2616]                       | TCP
file transfer           | FTP [RFC 959]                         | TCP
streaming multimedia    | HTTP (e.g., YouTube), RTP [RFC 1889]  | TCP or UDP
Internet telephony      | SIP, RTP, proprietary (e.g., Skype)   | typically UDP

Slide 8: File distribution: server-client vs. P2P
Question: how much time does it take to distribute a file of size F from one server to N peers?
- u_s: server upload capacity (bps)
- u_i: peer i's upload capacity (bps)
- d_i: peer i's download capacity (bps)
(The server and peers are connected through a network with abundant bandwidth.)

Slide 9: File distribution time: server-client
- the server sequentially sends N copies: NF/u_s time
- client i takes F/d_i time to download
Time to distribute F to N clients using the client-server approach:
    d_cs >= max { NF/u_s, F/min_i(d_i) }
which increases linearly in N (for large N).

Slide 10: File distribution time: P2P
- the server must send one copy: F/u_s time
- client i takes F/d_i time to download
- NF bits must be downloaded in aggregate; the fastest possible aggregate upload rate is u_s + sum_i(u_i)
    d_P2P >= max { F/u_s, F/min_i(d_i), NF/(u_s + sum_i(u_i)) }

Slide 11: Client-server vs. P2P: example
Client upload rate u, F/u = 1 hour, u_s = 10u, d_min >= u_s
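The two lower bounds from the preceding slides can be evaluated numerically for this example. This is a minimal sketch; the helper names `d_cs` and `d_p2p` are made up here, and all peers are assumed to have the same upload rate u, as on the slide.

```python
def d_cs(F, us, d_min, N):
    """Client-server bound: server uploads N copies sequentially,
    and the slowest client limits the download."""
    return max(N * F / us, F / d_min)

def d_p2p(F, us, d_min, uploads):
    """P2P bound: one server copy, slowest download, and the
    aggregate upload capacity of the server plus all peers."""
    N = len(uploads)
    return max(F / us, F / d_min, N * F / (us + sum(uploads)))

# Slide's example: F/u = 1 hour, u_s = 10u, d_min >= u_s.
u, F = 1.0, 1.0                 # units chosen so F/u = 1 hour
us, d_min = 10 * u, 10 * u
for N in (1, 10, 100):
    print(N, d_cs(F, us, d_min, N), d_p2p(F, us, d_min, [u] * N))
```

For N = 100 this prints a client-server time of 10 hours versus under one hour for P2P: the client-server bound grows linearly in N, while the P2P bound approaches F/u because peers contribute upload capacity as they join.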

Slide 12: Overlay networks
- a network whose edges are application-layer links rather than physical links

Slide 13: File distribution: BitTorrent
- P2P file distribution
- torrent: group of peers exchanging chunks of a file
- tracker: tracks the peers participating in a torrent
- a joining peer obtains a list of peers from the tracker, then trades chunks with them

Slide 14: BitTorrent (1)
- the file is divided into 256 KB chunks
- a peer joining a torrent:
  - registers with the tracker to get a list of peers, and connects to a subset of them ("neighbors")
  - has no chunks at first, but accumulates them over time
- while downloading, a peer uploads chunks to other peers
- peers may come and go
- once a peer has the entire file, it may (selfishly) leave or (altruistically) remain

Slide 15: BitTorrent (2)
Pulling chunks:
- at any given time, different peers have different subsets of the file's chunks
- periodically, a peer (Alice) asks each neighbor for the list of chunks it has
- Alice requests her missing chunks, rarest first
Sending chunks: tit-for-tat
- Alice sends chunks to the four neighbors currently sending her chunks at the highest rate
  - the top four are re-evaluated every 10 seconds
- every 30 seconds, she randomly selects another peer and starts sending it chunks ("optimistic unchoke")
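The "rarest first" policy above can be sketched as a small selection function. This is an illustrative sketch, not BitTorrent's actual implementation; the function name and the tie-breaking rule are assumptions.

```python
from collections import Counter

def rarest_first(my_chunks, neighbor_chunks):
    """Pick the missing chunk held by the fewest neighbors.
    my_chunks: set of chunk ids Alice already has.
    neighbor_chunks: one set of chunk ids per neighbor."""
    counts = Counter()
    for chunks in neighbor_chunks:
        counts.update(chunks - my_chunks)   # only chunks Alice still needs
    if not counts:
        return None                          # nothing left to request
    # lowest count = rarest; break ties by chunk id for determinism
    return min(counts, key=lambda c: (counts[c], c))

# Alice has chunk 0; chunk 1 is held by two neighbors, chunk 2 by one.
print(rarest_first({0}, [{0, 1, 2}, {0, 1}]))  # chunk 2 is rarest
```

Requesting rare chunks first keeps every chunk well replicated, so the torrent survives peers leaving.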

Slide 16 [2: Application Layer, SSL (7/09)]: P2P case study: Skype
- inherently P2P: pairs of users communicate directly
- proprietary application-layer protocol (inferred via reverse engineering)
- hierarchical overlay with supernodes (SNs)
- the index mapping user names to IP addresses is distributed over the SNs
(Figure: Skype clients (SC), supernodes (SN), and the Skype login server)

Slide 17: Peers as relays
- Problem: Alice and Bob are both behind NATs
  - a NAT prevents an outside peer from initiating a connection to an inside peer
- Solution:
  - using Alice's and Bob's SNs, a relay that is not behind a NAT is chosen
  - each peer initiates a session with the relay
  - the peers can then communicate through their NATs via the relay

Slide 18: Finding a better trading partner
1. Alice "optimistically unchokes" Bob
2. Alice becomes one of Bob's top-four providers; Bob reciprocates
3. Bob becomes one of Alice's top-four providers
With a higher upload rate, a peer can find better trading partners and get the file faster!

Slide 19: P2P: searching for information
An "index" in a P2P system maps information to peer location (location = IP address and port number).
File sharing (e.g., eMule):
- the index dynamically tracks the locations of the files that peers share
- peers tell the index what they have
- peers search the index to determine where files can be found
Instant messaging:
- the index maps user names to locations
- when a user starts the IM application, it informs the index of its location
- peers search the index to determine the IP address of a user

Slide 20: P2P: centralized index (the original "Napster" design)
1. when a peer connects, it informs the central directory server of its IP address and content
2. Alice queries the server for "Hey Jude"
3. Alice requests the file directly from Bob

Slide 21: P2P: problems with a centralized directory
- single point of failure
- performance bottleneck
- copyright infringement: the "target" of a lawsuit is obvious
File transfer is decentralized, but locating content is highly centralized.

Slide 22: Query flooding
- fully distributed: no central server
- used by Gnutella
- each peer indexes the files it makes available for sharing (and no other files)
Overlay network as a graph:
- there is an edge between peers X and Y if they have a TCP connection
- all active peers and edges form the overlay network
- an edge is a virtual (not physical) link
- a given peer is typically connected to fewer than 10 overlay neighbors

Slide 23: Query flooding
- a Query message is sent over existing TCP connections
- peers forward the Query message to their neighbors
- a QueryHit is sent back over the reverse path
- the file transfer itself uses HTTP
Scalability: limited-scope flooding
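Limited-scope flooding can be illustrated with a breadth-first walk over the overlay graph that stops after a time-to-live (TTL) number of hops. The `flood` helper and the toy overlay below are illustrative assumptions, not Gnutella's wire protocol.

```python
def flood(graph, start, ttl):
    """Limited-scope flooding: forward to neighbors, decrementing
    the TTL each hop. Returns the set of peers reached."""
    reached = {start}
    frontier = [start]
    while frontier and ttl > 0:
        nxt = []
        for peer in frontier:
            for neighbor in graph.get(peer, ()):
                if neighbor not in reached:
                    reached.add(neighbor)
                    nxt.append(neighbor)
        frontier = nxt
        ttl -= 1
    return reached

# Toy overlay: edges are existing TCP connections.
overlay = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A"],
           "D": ["B", "E"], "E": ["D"]}
print(flood(overlay, "A", 1))  # only A's direct neighbors
print(flood(overlay, "A", 3))  # the whole overlay
```

The TTL bounds the number of messages per query, which is what makes flooding tolerable, at the cost of possibly missing content outside the flooded scope.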

Slide 24: Gnutella: peer joining
1. a joining peer, Alice, must find another peer already in the Gnutella network: she uses a list of candidate peers
2. Alice sequentially attempts TCP connections with candidate peers until she sets up a connection with some peer, Bob
3. flooding: Alice sends a Ping message to Bob; Bob forwards the Ping to his overlay neighbors, who forward it to their neighbors, and so on; peers receiving the Ping respond to Alice with a Pong message
4. Alice receives many Pong messages and can then set up additional TCP connections

Slide 25: Hierarchical overlay
- a hybrid of the centralized-index and query-flooding approaches
- each peer is either a supernode or assigned to a supernode
  - TCP connection between a peer and its supernode
  - TCP connections between some pairs of supernodes
- a supernode tracks the content of its children

Slide 26: Next: Distributed Hash Tables
- another search method in P2P systems (as well as many other systems)
- please read the "Chord" paper and write a paper review

Slide 27: Distributed Hash Table (DHT)
- a DHT is a distributed P2P database
- the database holds (key, value) pairs, e.g.:
  - key: social security number; value: person's name
  - key: content type; value: IP address
- peers query the database with a key, and the database returns the values that match that key
- peers can also insert (key, value) pairs into the database
- finding "needles" requires that the P2P system be structured

Slide 28: DHT identifiers
- assign each peer an integer identifier in the range [0, 2^n - 1]
  - each identifier can be represented with n bits
- require each key to be an integer in the same range
- to get integer keys, hash the original key
  - e.g., key = h("Led Zeppelin IV")
  - this is why the database is called a distributed "hash" table
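Hashing an arbitrary key into the n-bit identifier space can be sketched as follows. The slide does not specify a hash function; SHA-1 is used here only as a concrete stand-in for h(·), and `dht_id` is a hypothetical name.

```python
import hashlib

def dht_id(key, n_bits=4):
    """Hash an arbitrary string key to an integer identifier
    in [0, 2^n_bits - 1], the same range used for peer ids."""
    digest = hashlib.sha1(key.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** n_bits)

print(dht_id("Led Zeppelin IV"))  # some identifier between 0 and 15
```

Any deterministic hash works, as long as every peer uses the same function, so a key always maps to the same point on the identifier circle.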

Slide 29: How to assign keys to peers?
- central issue: assigning (key, value) pairs to peers
- rule: assign each key to the peer whose ID is closest to the key
- convention in this lecture: "closest" means the immediate successor of the key (or the peer equal to it)
- example (4 bits; peers 1, 3, 4, 5, 8, 10, 12, 14):
  - key = 13: successor peer = 14
  - key = 15: successor peer = 1 (wrapping around the circle)
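The successor rule above is easy to express directly; this sketch reproduces the slide's example, with `successor` as an assumed helper name.

```python
import bisect

def successor(peers, key):
    """Assign a key to the closest peer: its immediate successor
    (or the peer equal to it), wrapping around the circle."""
    peers = sorted(peers)
    i = bisect.bisect_left(peers, key)   # first peer id >= key
    return peers[i % len(peers)]          # wrap past the last peer

peers = [1, 3, 4, 5, 8, 10, 12, 14]
print(successor(peers, 13))  # 14
print(successor(peers, 15))  # 1 (wraps around)
```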

Slide 30: Consistent hashing
- for n = 6, the number of identifiers is 64
- (figure: a DHT ring with 10 nodes storing 5 keys)
- the successor of key 10 is node 14

Slide 31: Circular DHT (1)
- peers on the ring: 1, 3, 4, 5, 8, 10, 12, 15
- each peer is only aware of its immediate successor and predecessor

Slide 32: Circular DHT (2)
- peers: 0001, 0011, 0100, 0101, 1000, 1010, 1100, 1111
- "Who is responsible for key 1110?" (define closest as the closest successor: peer 1111 answers "I am")
- resolving a query takes O(N) messages on average when there are N peers

Slide 33: Circular DHT with shortcuts
- each peer keeps track of the IP addresses of its predecessor, its successor, and several shortcut peers
- in the example, resolving "Who is responsible for key 1110?" drops from 6 messages to 3
- shortcuts can be designed so that each peer has O(log N) neighbors and each query takes O(log N) messages

Slide 34: Scalable key location: finger tables
Example: m = 3 bits, identifier circle {0, ..., 7}, nodes 0, 1, and 3; keys 1, 2, and 6.
Finger i of node n points to successor((n + 2^i) mod 2^m), for i = 0, 1, 2.

Node 0 (holds key 6):  start 1, 2, 4  ->  successor 1, 3, 0
Node 1 (holds key 1):  start 2, 3, 5  ->  successor 3, 3, 0
Node 3 (holds key 2):  start 4, 5, 7  ->  successor 0, 0, 0
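The finger tables above can be computed directly from the definition; this sketch reproduces the slide's example, with `finger_table` as an assumed helper name.

```python
def finger_table(node, peers, m):
    """Chord-style finger table for `node`: finger i points to the
    successor of (node + 2^i) mod 2^m, so a lookup can roughly halve
    the remaining distance on the circle at every hop."""
    peers = sorted(peers)
    def succ(k):
        # first peer clockwise from k, wrapping around the circle
        return next((p for p in peers if p >= k), peers[0])
    starts = [(node + 2 ** i) % (2 ** m) for i in range(m)]
    return starts, [succ(s) for s in starts]

# Slide's example: m = 3, nodes 0, 1, 3.
for n in (0, 1, 3):
    starts, succs = finger_table(n, [0, 1, 3], m=3)
    print(n, starts, succs)
```

Because finger i covers a distance of 2^i, each node keeps only O(log N) entries, yet any key is reachable in O(log N) hops.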

Slide 35: Peer churn
- to handle peer churn, require each peer to know the IP addresses of its two successors
- each peer periodically pings its two successors to check that they are still alive
- this is a limited solution: it handles a single join or a single failure

Slide 36: Consistent hashing: node join
(Figure: a node joins the identifier circle and takes over the keys that now map to it.)

Slide 37: Consistent hashing: node departure
(Figure: a node leaves the identifier circle and its keys are taken over by its successor.)

Slide 38: Problems with DHTs
- there is no good solution for maintaining a finger table that is both scalable and consistent under churn
- BitTorrent's solution:
  - maintain the trackers (servers), which are more reliable, as a DHT
  - users query the trackers to get the locations of a file
  - the file sharing itself is not structured

Slide 39: DNS vs. DHT
DNS:
- provides a host-name-to-IP-address mapping
- relies on a set of special root servers
- names reflect administrative boundaries
- is specialized to finding named hosts or services
DHT:
- can provide the same service: name = key, value = IP address
- requires no special servers
- imposes no naming structure
- can also be used to find data objects that are not tied to particular machines

Slide 40: End of Lecture 05

