1 Peer To Peer Distributed Systems Pete Keleher

2 Why Distributed Systems?
• Aggregate resources!
  – memory
  – disk
  – CPU cycles
• Proximity to physical stuff
  – things with sensors
  – things that print
  – things that go boom
  – other people
• Fault tolerance!
  – Don’t want one tsunami to take everything down

3 Why Peer To Peer Systems?
• What’s peer to peer?

4 (Traditional) Client-Server
[diagram: a single server with many clients connecting to it]

5 Peer To Peer
• Lots of reasonable machines
  – No one machine loaded more than the others
  – No one machine is irreplaceable!

6 Peer-to-Peer (P2P)
• Where do the machines come from?
  – “found” resources: SETI@home, BOINC
  – existing resources: computing “clusters” (32, 64, …)
• What good is a peer-to-peer system?
  – all those things mentioned before, including storage: files, MP3s, leaked documents, porn …

7 The lookup problem
[diagram: nodes N1–N6 spread across the Internet; a publisher stores Key=“title”, Value=MP3 data… at one node, and a client asks Lookup(“title”): which node has it?]

8 Centralized lookup (Napster)
[diagram: the publisher (N4) registers SetLoc(“title”, N4) with a central DB; the client sends Lookup(“title”) to the DB, which returns N4, where Key=“title”, Value=MP3 data… is stored]
Simple, but O(N) state and a single point of failure.

9 Flooded queries (Gnutella)
[diagram: the client floods Lookup(“title”) to its neighbors, which forward it on through N1–N9 until it reaches the publisher holding Key=“title”, Value=MP3 data…]
Robust, but worst case O(N) messages per lookup.

10 Routed queries (Freenet, Chord, etc.)
[diagram: the client’s Lookup(“title”) is routed hop by hop through the overlay toward the publisher holding Key=“title”, Value=MP3 data…]
Bad load balance.

11 Routing challenges
• Define a useful key nearness metric.
• Keep the hop count small.
  – O(log N)
• Keep the routing tables small.
  – O(log N)
• Stay robust despite rapid changes.

12 Distributed Hash Tables to the Rescue!
• Load balance: a distributed hash function spreads keys evenly over the nodes (consistent hashing).
• Decentralization: fully distributed (robustness).
• Scalability: lookup cost grows as the log of the number of nodes.
• Availability: internal tables adjust automatically to reflect changes.
• Flexible naming: no constraints on key structure.

13 What’s a Hash?
• Wikipedia: any well-defined procedure or mathematical function that converts a large, possibly variable-sized amount of data into a small datum, usually a single integer.
• Example. Assume N is a large prime and ‘a’ means the ASCII code for the letter ‘a’ (it’s 97):
  H(“pete”) = H(“pet”) x N + ‘e’
            = (H(“pe”) x N + ‘t’) x N + ‘e’
            = 451845518507
  H(“pete”)  mod 1000 = 507
  H(“peter”) mod 1000 = 131
  H(“petf”)  mod 1000 = 986
• It’s a deterministic random number generator!
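
To make the example concrete, here is a minimal Python sketch of this rolling hash. The slide does not say which large prime N it uses, so the value of N below is an assumption and the raw hash will not reproduce 451845518507; only the behavior (deterministic, small bucket numbers spread evenly) carries over.

```python
N = 1_000_003  # assumed large prime; the slide does not specify which one it uses


def H(s: str) -> int:
    """Rolling hash from the slide: H(s + c) = H(s) * N + ord(c), with H("") = 0."""
    h = 0
    for c in s:
        h = h * N + ord(c)
    return h


if __name__ == "__main__":
    for word in ("pete", "peter", "petf"):
        print(word, H(word) % 1000)   # small, evenly spread bucket numbers
    print(H("pete") == H("pete"))     # True: same input, same output, deterministic
```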

14 Chord (a DHT)
• m-bit identifier space for both keys and nodes.
• Key identifier = SHA-1(key).
• Node identifier = SHA-1(IP address).
• Both are uniformly distributed.
• How to map key IDs to node IDs?
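
A small Python sketch of how those identifiers could be computed, assuming m = 7 to match the 7-bit ring on the next slide; the IP address used here is an arbitrary example, and chord_id is an illustrative helper name.

```python
import hashlib

M = 7  # identifier bits; the next slide draws a 7-bit ring


def chord_id(data: str, m: int = M) -> int:
    """SHA-1 digest truncated to an m-bit identifier in [0, 2^m)."""
    digest = hashlib.sha1(data.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


node_id = chord_id("128.8.130.3")   # node identifier = SHA-1(IP address), example address
key_id = chord_id("title")          # key identifier  = SHA-1(key)
print(node_id, key_id)
```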

15 Consistent hashing [Karger 97]
[diagram: circular 7-bit ID space with nodes N32, N90, N105 and keys K5, K20, K80]
A key is stored at its successor: the node with the next-higher ID.
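
A minimal sketch of the successor rule in Python, using the node and key IDs from the figure; the successor helper is illustrative, not Chord’s actual interface.

```python
import bisect


def successor(key_id: int, node_ids: list[int]) -> int:
    """The node responsible for key_id: the first node clockwise at or after it."""
    ring = sorted(node_ids)
    i = bisect.bisect_left(ring, key_id)
    return ring[i % len(ring)]       # falling off the end wraps to the smallest ID


nodes = [32, 90, 105]                # N32, N90, N105 on the 7-bit circle
print(successor(5, nodes))           # 32  -> K5  is stored at N32
print(successor(20, nodes))          # 32  -> K20 is stored at N32
print(successor(80, nodes))          # 90  -> K80 is stored at N90
print(successor(110, nodes))         # 32  -> wraps past zero back to N32
```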

16–20 Basic lookup (animated over five slides)
[diagram: ring with N10, N32, N60, N90, N105, N120 and key K80; the query “Where is key 80?” is forwarded from node to node around the ring until N90 answers “N90 has K80”]
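
A sketch of this basic lookup in Python, assuming each node knows only its successor; the Node class and helper names are illustrative, not Chord’s actual interface.

```python
class Node:
    def __init__(self, node_id: int):
        self.id = node_id
        self.successor: "Node" = self      # wired up once the ring is built

    def find_successor(self, key_id: int) -> "Node":
        """Forward the query one successor at a time until the key's owner is found."""
        n = self
        while not between_right_incl(key_id, n.id, n.successor.id):
            n = n.successor
        return n.successor


def between_right_incl(x: int, a: int, b: int) -> bool:
    """True if x lies in the half-open interval (a, b] on the circular ID space."""
    if a < b:
        return a < x <= b
    return x > a or x <= b                 # the interval wraps past zero


# ring from the figure: N10 -> N32 -> N60 -> N90 -> N105 -> N120 -> back to N10
ids = [10, 32, 60, 90, 105, 120]
nodes = [Node(i) for i in ids]
for a, b in zip(nodes, nodes[1:] + nodes[:1]):
    a.successor = b
print(nodes[0].find_successor(80).id)      # 90, i.e. "N90 has K80"
```

With only successor pointers, a lookup may have to visit O(N) nodes; the finger tables on the next slides are what bring this down to O(log N).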

21 “Finger table” allows log(N)-time lookups
[diagram: N80 keeps fingers pointing ½, ¼, 1/8, 1/16, 1/32, 1/64, and 1/128 of the way around the ring]
Every node knows m other nodes in the ring.

22 Finger i points to successor of n + 2^(i-1)
[diagram: N80’s fingers again at ½, ¼, …, 1/128 of the ring; the finger aimed at 112 (= 80 + 2^5) lands on its successor, N120]
Each node knows more about the portion of the circle close to it.
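
In code, that rule might look like the following Python sketch. The ring reuses node IDs from the earlier figures (slide 22 itself only shows N80 and N120), so the concrete numbers are illustrative.

```python
import bisect

M = 7  # 7-bit identifier ring, as in the figures


def successor(x: int, node_ids: list[int]) -> int:
    """First node ID clockwise at or after x (wrapping past zero)."""
    ring = sorted(node_ids)
    return ring[bisect.bisect_left(ring, x) % len(ring)]


def build_finger_table(n: int, node_ids: list[int], m: int = M) -> list[int]:
    """finger[i] (1-based) points to successor((n + 2^(i-1)) mod 2^m)."""
    return [successor((n + 2 ** (i - 1)) % 2 ** m, node_ids)
            for i in range(1, m + 1)]


# N80 on a ring that reuses node IDs from the earlier figures; the nodes other
# than N80 and N120 are assumed here for illustration
nodes = [10, 32, 60, 80, 90, 105, 120]
print(build_finger_table(80, nodes))
# targets 81, 82, 84, 88, 96, 112, 16  ->  [90, 90, 90, 90, 105, 120, 32]
```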

23–27 Lookups take O(log(N)) hops (animated over five slides)
[diagram: ring with N5, N10, N20, N32, N60, N80, N99, N110; Lookup(K19) hops along finger pointers, cutting the remaining distance roughly in half each time, until it reaches N20, the successor of K19]
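
A sketch of the finger-based lookup that produces these O(log N) hops: each node forwards the query to its closest finger preceding the key. This is a simplified, single-process rendering of Chord’s find_successor / closest_preceding_finger pair, not the paper’s exact pseudocode; all names are illustrative.

```python
class Node:
    def __init__(self, node_id: int):
        self.id = node_id
        self.successor: "Node" = self
        self.fingers: list["Node"] = []    # finger k points at successor(id + 2^k)

    def find_successor(self, key_id: int) -> "Node":
        n = self
        while not between(key_id, n.id, n.successor.id, right_incl=True):
            n = n.closest_preceding_finger(key_id)
        return n.successor

    def closest_preceding_finger(self, key_id: int) -> "Node":
        """Farthest finger that still lies strictly between this node and the key."""
        for f in reversed(self.fingers):
            if between(f.id, self.id, key_id):
                return f
        return self.successor              # no usable finger: fall back to the successor


def between(x: int, a: int, b: int, right_incl: bool = False) -> bool:
    """x in (a, b), or (a, b] if right_incl, on the circular ID space."""
    if x == b:
        return right_incl
    if a < b:
        return a < x < b
    return x > a or x < b                  # the interval wraps past zero


def succ_id(x: int, ids: list[int]) -> int:
    """Smallest node ID at or after x (wrapping around the 7-bit ring)."""
    return min((i for i in ids if i >= x), default=min(ids))


# ring from slide 23 (7-bit ID space): N5 N10 N20 N32 N60 N80 N99 N110
ids = [5, 10, 20, 32, 60, 80, 99, 110]
ring = {i: Node(i) for i in ids}
for a, b in zip(ids, ids[1:] + ids[:1]):
    ring[a].successor = ring[b]
for n in ring.values():
    n.fingers = [ring[succ_id((n.id + 2 ** k) % 128, ids)] for k in range(7)]

print(ring[32].find_successor(19).id)      # 20, i.e. N20 is the successor of K19
```

With the ring wired this way, the query issued at N32 hops N32, N99, N5, N10 and stops at N20, each hop skipping a large chunk of the remaining distance.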

28 Joining: linked list insert
[diagram: N36 joins the ring between N25 and N40, which holds K30 and K38; step 1: N36 runs Lookup(36) to find where it belongs]
Invariants to preserve:
1. Each node’s successor is correctly maintained.
2. For every key k, node successor(k) is responsible for k.

29 Join (2)
[diagram: step 2: N36 sets its own successor pointer, to N40]
Initialize the new node’s finger table.

30 Join (3)
[diagram: step 3: N25’s successor pointer is set to N36]
Update the finger pointers of existing nodes.

31 Join (4)
[diagram: step 4: copy keys 26..36 from N40 to N36, so K30 moves to N36 while K38 stays at N40]
Transferring keys.
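
A toy, single-threaded sketch of those four steps, building on the Node class and between helper from the lookup sketch above and assuming each Node also carries a .predecessor pointer and a .keys dict (key_id -> value); real Chord performs these steps through RPCs plus the stabilization protocol on the next slide.

```python
def join(new: "Node", bootstrap: "Node") -> None:
    """Slides 28-31: insert `new` into the ring via any existing node `bootstrap`."""
    # 1. Lookup(new.id): the node currently responsible for new.id becomes our successor
    succ = bootstrap.find_successor(new.id)
    pred = succ.predecessor

    # 2. The new node sets its own successor (and predecessor) pointer
    new.successor = succ
    new.predecessor = pred

    # 3. Splice into the ring: the old predecessor now points at the new node
    pred.successor = new
    succ.predecessor = new

    # 4. Copy the keys the new node is now responsible for, i.e. those in (pred.id, new.id]
    for k in list(succ.keys):
        if between(k, pred.id, new.id, right_incl=True):
            new.keys[k] = succ.keys.pop(k)
    # Updating existing nodes' finger tables (slide 30) is omitted here; in the
    # real protocol it happens lazily via periodic stabilization / finger refresh.
```

With the slide’s numbers (N36 joining between N25 and N40), step 4 moves K30 to N36 and leaves K38 at N40.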

32 Stabilization Protocol
• Handles concurrent node joins/fails/leaves.
• Keep successor pointers up to date, then verify and correct finger table entries.
• Incorrect finger pointers may only increase latency, but incorrect successor pointers may cause lookup failure.
• Nodes periodically run the stabilization protocol.
• Won’t correct a Chord system that has split into multiple disjoint cycles, or a single cycle that loops multiple times around the identifier space.
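
For concreteness, a sketch of the periodic stabilize/notify pair from the Chord paper, expressed against the toy Node class above (assumed to carry a .predecessor pointer); failure handling and finger refreshing are not shown.

```python
def stabilize(n: "Node") -> None:
    """Run periodically: ask our successor for its predecessor; if some node has
    slipped in between us and our successor, adopt it as the new successor."""
    x = n.successor.predecessor
    if x is not None and between(x.id, n.id, n.successor.id):
        n.successor = x
    notify(n.successor, n)


def notify(succ: "Node", n: "Node") -> None:
    """n tells succ: 'I think I might be your predecessor.'"""
    if succ.predecessor is None or between(n.id, succ.predecessor.id, succ.id):
        succ.predecessor = n
```

Only the successor pointers are needed for correctness; finger tables can then be repaired by a similar periodic refresh, which is why incorrect fingers merely slow lookups down.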

33 Take Home Points
• A hash is used to distribute data and nodes uniformly across a range.
• Random distribution balances load.
• The recipe for an awesome systems paper:
  – identify commonality across algorithms
  – restrict the work to implementing that one simple abstraction
  – use it as a building block

