Kademlia: A Peer-to-peer Information System Based on the XOR Metric
Lookup Problem Given a data item X stored at some dynamic set of nodes in the system, find it. This problem is an important one in many distributed systems, and is the critical common problem in P2P systems.
DHT Distributed hash table. Skip-list-like routing: Chord. Multi-dimensional routing: CAN. Tree-like routing: Kademlia.
Kademlia Nodes, files, and keywords are each hashed with SHA-1 into a 160-bit ID space. – eMule uses an MD4 digest (128 bits) instead. Every node maintains information about the files and keywords “close to itself”. The closeness between two objects is measured as their bitwise XOR interpreted as an integer: D(a, b) = a XOR b
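The XOR metric above can be sketched in a few lines of Python. The function names `node_id` and `distance` are hypothetical and chosen only for illustration; the sketch assumes IDs are plain integers derived from a SHA-1 digest.

```python
import hashlib


def node_id(name: bytes) -> int:
    """Hash arbitrary bytes into the 160-bit ID space using SHA-1."""
    return int.from_bytes(hashlib.sha1(name).digest(), "big")


def distance(a: int, b: int) -> int:
    """D(a, b) = a XOR b, interpreted as an integer."""
    return a ^ b


# Illustrative inputs; any byte strings would do.
a = node_id(b"node-a")
b = node_id(b"node-b")
c = node_id(b"some-file")

# XOR behaves like a metric: zero to itself, symmetric,
# and it satisfies the triangle inequality.
assert distance(a, a) == 0
assert distance(a, b) == distance(b, a)
assert distance(a, c) <= distance(a, b) + distance(b, c)
```

Because the distance is symmetric, a node that appears close to a key from one side is equally close when viewed from the other side.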
Nodes in Kademlia
Files in Kademlia
Key words in Kademlia Hash of “break” only!
Kademlia Binary Tree Treat nodes as leaves in a binary tree. Starting from the root, for any given node, divide the binary tree into a series of successively lower subtrees that don’t contain the node.
Kademlia Binary Tree Subtrees of a node 0011……
Kademlia Binary Tree Each node keeps in touch with at least one node (up to k) in each of its subtrees, provided that subtree contains a node. Each subtree has its own k-bucket. Every node keeps a list of (IP, Port, Node ID) triples, and (key, value) tuples, for exchanging information with others.
Kademlia Search When node 0011…… wants to search for 1110……
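A minimal sketch of the subtree-to-bucket mapping described above, assuming integer IDs. The subtree holding another node is identified by the highest bit in which the two IDs differ. `bucket_index`, `RoutingTable`, and the default `k=20` are illustrative assumptions, and the handling of a full bucket (the newcomer is simply dropped) is a simplification.

```python
def bucket_index(own_id: int, other_id: int) -> int:
    """Index of the k-bucket (subtree) that other_id falls into.

    This is the position of the highest bit in which the IDs differ.
    """
    d = own_id ^ other_id
    if d == 0:
        raise ValueError("a node does not keep itself in a k-bucket")
    return d.bit_length() - 1


class RoutingTable:
    """One list of (ip, port, node_id) triples per subtree."""

    def __init__(self, own_id: int, k: int = 20, id_bits: int = 160):
        self.own_id = own_id
        self.k = k
        self.buckets = [[] for _ in range(id_bits)]

    def add(self, contact) -> bool:
        """Record a contact; return False if its bucket is already full."""
        bucket = self.buckets[bucket_index(self.own_id, contact[2])]
        if contact in bucket:
            # Known contact: move it to the tail (most recently seen).
            bucket.remove(contact)
            bucket.append(contact)
            return True
        if len(bucket) < self.k:
            bucket.append(contact)
            return True
        return False  # simplification: drop the newcomer


# Toy 4-bit example: node 0011 files node 1110 under the top-level subtree.
table = RoutingTable(0b0011, k=1, id_bits=4)
table.add(("10.0.0.1", 4000, 0b1110))
```

With 4-bit IDs, node 0011 places 1110 in bucket 3 (the IDs differ in the top bit) and 0010 in bucket 0 (they differ only in the lowest bit).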
Kademlia Protocol PING: tests whether a node is online. STORE: instructs a node to store a (key, value) pair. FIND_NODE: takes an ID as an argument; the recipient returns the k closest nodes from its current knowledge. FIND_VALUE: behaves like FIND_NODE, except that if the recipient has received a STORE for that key, it just returns the stored value.
Kademlia Lookup The most important procedure is locating the k closest nodes to some given node ID. Kademlia employs a recursive algorithm for node lookups. The lookup initiator starts by picking α nodes from its closest non-empty k-bucket. The initiator then sends parallel, asynchronous FIND_NODE RPCs to the α nodes it has chosen. α is a system-wide concurrency parameter, such as 3.
Kademlia Lookup In the recursive step, the initiator resends the FIND_NODE to nodes it has learned about from previous RPCs. If a round of FIND_NODE RPCs fails to return a node any closer than the closest already seen, the initiator resends the FIND_NODE to all of the k closest nodes it has not already queried. The lookup terminates when the initiator has queried and gotten responses from the k closest nodes it has seen.
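The four RPCs can be sketched as methods on an in-memory node. This is an illustration rather than a wire protocol: the `Node` class, its flat `contacts` set (used here in place of real k-buckets), and the tagged return value of `find_value` are all assumptions made for the example.

```python
class Node:
    def __init__(self, node_id: int, k: int = 20):
        self.node_id = node_id
        self.k = k
        self.contacts = set()  # known node IDs (assumption: no buckets)
        self.values = {}       # key -> value received through STORE

    def ping(self) -> bool:
        """PING: answering at all shows the node is online."""
        return True

    def store(self, key: int, value) -> None:
        """STORE: keep a (key, value) pair."""
        self.values[key] = value

    def find_node(self, target: int) -> list:
        """FIND_NODE: the k known IDs closest to target by XOR distance."""
        return sorted(self.contacts, key=lambda c: c ^ target)[: self.k]

    def find_value(self, key: int):
        """FIND_VALUE: the stored value if present, else like FIND_NODE."""
        if key in self.values:
            return ("value", self.values[key])
        return ("nodes", self.find_node(key))


n = Node(0b0011, k=2)
n.contacts = {0b1110, 0b1100, 0b0001}
print(n.find_value(0b1111))  # no STORE seen yet, so closest nodes come back
n.store(0b1111, "payload")
print(n.find_value(0b1111))  # now the stored value comes back
```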
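The lookup procedure described above can be simulated over a toy in-memory network. The sketch makes several assumptions: `network` is a hypothetical dict mapping each node ID to the set of IDs it knows, every node always responds, the α parallel RPCs are issued one after another, and the initiator seeds the search with its α closest contacts rather than reading a real k-bucket.

```python
def lookup(network, start, target, k=3, alpha=2):
    """Return the k closest node IDs to target that the initiator can find."""

    def by_distance(ids):
        return sorted(ids, key=lambda n: n ^ target)

    def find_node(node):
        # FIND_NODE RPC: the recipient returns the k closest IDs it knows.
        return by_distance(network[node])[:k]

    seen = set(by_distance(network[start])[:alpha])
    queried = set()
    width = alpha  # how many nodes to query in the next round
    while True:
        closest = by_distance(seen)[:k]
        pending = [n for n in closest if n not in queried]
        if not pending:
            # Every one of the k closest nodes seen has been queried.
            return closest
        best = closest[0] ^ target
        for n in pending[:width]:
            queried.add(n)
            seen.update(c for c in find_node(n) if c != start)
        improved = min(n ^ target for n in seen) < best
        # No closer node found: next round, query all unqueried
        # nodes among the k closest instead of only alpha of them.
        width = alpha if improved else k


# Toy network with 4-bit IDs.
demo_network = {
    0b0011: {0b0101, 0b1000},
    0b0101: {0b0011, 0b1000},
    0b1000: {0b0011, 0b1100},
    0b1100: {0b1000, 0b1110, 0b1111},
    0b1110: {0b1100, 0b1111},
    0b1111: {0b1100, 0b1110},
}
result = lookup(demo_network, 0b0011, 0b1110, k=2, alpha=1)
```

In this toy network the initiator 0011 initially knows nobody in the far half of the ID space except 1000, yet each round returns contacts closer to the target until no further progress is possible.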
Kademlia Key Storage To store a (key, value) pair, a participant locates the k closest nodes to the key and sends them STORE RPCs. Additionally, each node republishes (key, value) pairs as necessary to keep them alive. For Kademlia’s current application (file sharing), the original publisher of a (key, value) pair is also required to republish it every 24 hours. Otherwise, (key, value) pairs expire 24 hours after publication.
P2P Open Questions Operation cost – As low as other popular protocols – Lookup: O(log N) – Join or leave: O(log² N) Fault tolerance and concurrent change – Handled well, thanks to the use of k-buckets Proximity routing – Kademlia can choose, among α candidate nodes, the one with the lowest latency Indexing and keyword search
Reference
Hari Balakrishnan, M. Frans Kaashoek, David Karger, Robert Morris, Ion Stoica. Looking up data in P2P systems. Communications of the ACM, Volume 46, Issue 2 (February 2003), Special Issue: Technical and social components of peer-to-peer computing, Pages 43–48.
Maymounkov, P., and Mazieres, D. Kademlia: A peer-to-peer information system based on the XOR metric. In Proceedings of the 1st International Workshop on Peer-to-Peer Systems, Springer-Verlag version, Cambridge, MA (Mar. 2002); kademia.scs.cs.nyu.edu.
Stoica, I., et al. Chord: A scalable peer-to-peer lookup service for Internet applications. In Proceedings of ACM SIGCOMM, San Diego (August 2001).
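The 24-hour expiry rule above can be sketched with a small store that timestamps each pair. `ValueStore` and its injectable `clock` parameter are hypothetical names introduced so the behaviour can be checked without waiting a day; republishing is modelled simply as calling `store` again.

```python
import time

EXPIRY_SECONDS = 24 * 60 * 60  # pairs expire 24 hours after publication


class ValueStore:
    def __init__(self, clock=time.time):
        self._clock = clock
        self._items = {}  # key -> (value, publication time)

    def store(self, key, value) -> None:
        """Handle a STORE; republishing resets the publication time."""
        self._items[key] = (value, self._clock())

    def get(self, key):
        """Return the value, or None if it is missing or has expired."""
        entry = self._items.get(key)
        if entry is None:
            return None
        value, published = entry
        if self._clock() - published >= EXPIRY_SECONDS:
            del self._items[key]
            return None
        return value


# Usage with a fake clock (seconds since an arbitrary start).
now = [0.0]
values = ValueStore(clock=lambda: now[0])
values.store("key", "value")
now[0] = EXPIRY_SECONDS - 1
still_there = values.get("key")  # just under 24 hours: still available
now[0] = EXPIRY_SECONDS
gone = values.get("key")         # 24 hours without republishing: expired
```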