Topics in Reliable Distributed Systems 048961 Fall 2003-2004 Dr. Idit Keidar.
Published byModified over 5 years ago
Presentation on theme: "Topics in Reliable Distributed Systems 048961 Fall 2003-2004 Dr. Idit Keidar."— Presentation transcript:
Topics in Reliable Distributed Systems 048961 Fall 2003-2004 Dr. Idit Keidar
Lecture 2: Introduction to Peer-to-Peer (P2P) Lookup Systems with Chord as an Example
What Does Peer-to-Peer Mean? Characterization from CFP of IPTPS 2004: –decentralized, –self-organizing –distributed systems, –in which all or most communication is symmetric.
Typical Characteristics Lots of nodes (e.g., millions). Dynamic: frequent join, leave, failure. Little or no infrastructure. –No central server. Communication possible between every pair of nodes (cf. the Internet). All nodes are “peers” – have same role; don’t have lots of resources.
The Main Challenge To design and implement a robust and scalable distributed system composed of inexpensive, individually unreliable computers in unrelated administrative domains. “Looking up Data in P2P Systems”, Balakrishnan et al., CACM Feb 2003.
It All Started with Lookup Goal: Make billions of objects available to millions of concurrent users. –E.g., music files. Need a mechanism to keep track of them. –Map files to their locations. First There was Napster. –Centralized server/database. –Pros and cons?
Traditional Scalability Solution Hierarchy. –Tree overlay. –Structured lookup. –E.g., DNS. Pros and cons?
Symmetric Lookup Algorithms All nodes were created equal. No hierarchy. –Overlay not a tree. So how does the search go?
Searching in Symmetric Overlays Take I: The Gnutella* Approach Build a symmetric unstructured overlay. –Each node has several neighbors; –has several keys in its local database. When asked to find a key X –Check local database if X is known. –If yes, return, if not, ask your neighbors. What is the communication pattern? Problems? *More or less.
Take II: FastTrack / KaZaA Improve scalability by re-introducing a hierarchy. –Though not a tree. –Super-peers have more resources, more neighbors, know more keys. –Search goes through super-peers. Pros and Cons?
Structured Lookup Overlays Many recent academic systems – –CAN, Chord, D2B, Kademlia, Koorde, Pastry, Tapestry, Viceroy, … Symmetric, no hierarchy. Decentralized self management. Structured overlay – data stored in a defined place, search goes on a defined path. Implement Distributed Hash Table (DHT) abstraction.
Distributed Hash Tables (DHTs) Nodes store table entries. Good abstraction for lookup? Why? Requirements for being able to use DHTs? –Data identified with unique keys. –Nodes can store keys for each other. Location of object or actual object.
The DHT Service Interface lookup( key ) returns the location of the node currently responsible for this key. key usually numeric (in some range).
Using the DHT Interface How do you publish a file? How do you find a file?
What Does a DHT Implementation Need to Do? Map keys to nodes. Route a request to the appropriate node.
Lookup Example K V insert (K 1,V 1 ) K V (K 1,V 1 ) lookup(K 1 )
Mapping Keys to Nodes Goal: load balancing. Typical approach: –Give an m-bit identifier to each node and each key (e.g., using SHA-1 on the key, IP address) –Map key to node whose id is “close” to the key (need distance function). –How is load balancing achieved?
Routing Issues Each node must be able to forward each lookup query to a node closer to the destination. Maintain routing tables adaptively. –Each node knows some other nodes. –Must adapt to changes (joins, leaves, failures). –Goals?
Chord Stoica, Morris, Karger, Kaashoek, and Balakrishnan
Chord Logical Structure m-bit ID space, numerical distance mod m. Think of nodes as organized in a logical ring according to their IDs. N1 N8 N10 N14 N21 N30 N38 N42 N48 N51 N56 K54
Assigning Keys to Nodes Key k assigned to first node whose ID equals or follows k – successor(k). N1 N8 N10 N14 N21 N30 N38 N42 N48 N51 N56 K54
Moving Keys upon Join/Leave When a node joins, it becomes responsible for some keys previously assigned to its successor –Local change. –Assuming load balance, should move O(1/N) keys. And what happens when a node leaves?
Simple Routing Solutions Each node knows only its successor. –Routing around the circle. Each node knows all other nodes. –O(1) routing.
Chord Skiplist Routing Each node has “fingers” to nodes ½ way around the ID space from it, ¼ the way… Entry i at n contains successor(n+2 i-1 ). N0 N8 N10 N14 N21 N30 N38 N42 N48 N51 N56 How many fingers in the finger table?
Forwarding Queries Query for key k is forwarded to finger with highest ID not exceeding k. K54 Lookup(K54) N0 N8 N10 N14 N21 N30 N38 N42 N48 N51 N56 How long does lookup take?
Chord Data Structures Finger table. Predecessor. Interval – list of (O(log N)) successors.
Koorde O(log N) routing with 2 fingers only. –Will be presented.
Joining Chord Find your successor. Learn the keys that you are responsible for. Initialize finger table Notify other nodes that need to change their finger table and predecessor pointer.
Failure Handling Periodically probing predecessors, successors. Interval in addition to fingers. Redundant paths.