1 Evaluation of a Scalable P2P Lookup Protocol for Internet Applications
Master Thesis Presentation
by Samer Al-Kassimi
Supervisors: Per Brand & Sameh El-Ansary
Examiner: Vladimir Vlassov

Hi, my name is Samer Al-Kassimi, and I’m going to present my Master’s thesis work. This thesis was developed at SICS in collaboration with KTH, under the supervision of Per Brand and Sameh El-Ansary, and with the consent of Seif Haridi. The title of the thesis is “Evaluation of Chord”, and it is a research work on a scalable peer-to-peer lookup protocol for Internet applications called Chord.

2 Contents
1 P2P & the Chord protocol
2 Goals, Tools and Methodology: Traffic generator, Simulator
3 Experiments & Data Analysis
4 Conclusions & Future Work

First of all I’d like to outline the contents of this slide show. The presentation is divided into 4 parts. Each of the following slides has a number in its top-left corner indicating which part of the presentation we are in. First I will introduce peer-to-peer systems and the choice of Chord as a case study. I will follow by describing the goals and the methodology used to fulfill the purposes of the research. Later on I will explain the experiments that were run and their results. And last come the conclusions, as well as a brief commentary about possible future work.

3 Peer to Peer (P2P) (part 1)
[Figure: a server-client network and a peer-to-peer network, each connected through the Internet]
Reliability: no central point of failure, many replicas, geographic distribution
High capacity through parallelism: many disks, many network connections, many CPUs
Automatic configuration
Useful in public and proprietary settings

Peer-to-peer systems can be defined, in general and simple terms, as networks in which the actors have equivalent capabilities and responsibilities. This differs from client/server architectures, in which some computers are dedicated to serving the others.

4 Weaknesses of previous P2P systems (part 1)
[Figure: a Gnutella network of nodes N1, N2, N3, N4, N6, N7, N8 and N9 with a client issuing Lookup(“title”); a Napster network of the same nodes around a central DB, with SetLoc(“title”, N4) and a client issuing Lookup(“title”); Key=“title”, Value=MP3 data…]
Gnutella: unreliability, high cost O(N), flooding network
Napster: single point of failure

Here I show two examples of P2P systems that were designed before Chord, to present their weaknesses and justify the introduction of Chord. These are two typical networks.

In Gnutella, a client makes a lookup request, and the request is forwarded through the network by every node sending it to each of its known neighbors. The request travels all over the network, flooding in depth, until a node that has the requested information is found. When such a node receives the request, it replies to the original requester. This has a few drawbacks: it floods the network with messages, the number of messages needed to find something is potentially in the order of the size of the network, and there is no guarantee of a reply. To limit this, the protocol states that a message is forwarded only up to a maximum depth and that the reply is expected within a certain amount of time. That ultimately causes unreliability, because a piece of information could be in the network and still not be reached because of the depth limit.

In the case of Napster, when a node enters the network it connects to a server and hands over a list of the data it has, so the database knows about all the data that all the nodes offer in the share. When a client makes a lookup request, it is sent to that very database, and the database replies with the location of the data item (if any). This, far from being a pure peer-to-peer system, relies on a central server, which is a single point of failure.

5 Chord (part 1)
Belongs to the family of “routing” algorithms
Distributed hash tables
Scalable: most operations cost O(log_2 N)
Robust: handles node failures well
Reliable: correct results in the face of network changes

So, how do we solve these problems? Chord is a protocol that belongs to the family of “routing” algorithms. That means that the node keeping the data a requester is interested in is found by following a route or path. It does so by means of a technical solution called “distributed hash tables”. It provides scalability: the cost of its operations grows only logarithmically with the size of the network. It’s robust: it doesn’t depend on a central server and has no single point of failure. Furthermore, it handles massive node failures well. And it’s reliable: if a data item is in the share, it will be found, and if it’s not, the requester will know.

6 Basic Topology: responsibilities and path following successors (part 1)
The successor of a key is responsible for that key
Lookups follow successors
This topology offers O(N) performance (not scalable)
[Figure: a ring of nodes N1, N8, N14, N21, N32, N38, N42, N48, N51 and N56 holding doc 10, doc 24, doc 30, doc 38 and doc 54, with the operations INSERT(doc 24) and LOOKUP(54)]

How does Chord work? For a start, let’s focus on the drawing on the right. Basically, nodes and data items have numeric identifiers belonging to a certain range; the identifier range is the same for nodes and items. In the following examples, the identifier space goes from 0 to 63. That’s 2 to the power of 6, 64 values. The basic topology is that nodes know about their successor and their predecessor, that is, the node with the next higher identifier and the node with the next lower identifier. Here we have an example of a network with 10 nodes with randomly chosen identifiers. When a node wants to publish a data item, it INSERTS the data item in the network. The node responsible for that item is then the successor of the item’s identifier. Let’s think of this same network with some more data inserted. As you can see, each data item is associated with a node whose identifier is equal to or bigger than the item’s identifier, that is, the successor of the data item’s identifier. In this basic view of the network, if a node makes a lookup request, like here, the request is forwarded all along the network, following the successors path, until the node responsible for that data item is found. If the responsible node has the data item, it will reply with a positive answer, and possibly with the data item. If not, the requester will know so too. Now, this has a little problem: the number of messages needed to locate an item is potentially in the order of the size of the network, as happened with Gnutella, shown before. Note that the path length of this example has been 1, 2, 3, ...

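The successor rule and the successor-only lookup described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the ring is held in one sorted list rather than distributed over nodes, and the lookup is assumed to start at node 8 (the slide does not say which node issues it).

```python
from bisect import bisect_left

ID_SPACE = 64  # identifiers 0..63, as in the slides
NODES = sorted([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])


def successor(ident):
    """First node whose identifier is >= ident, wrapping around the ring."""
    i = bisect_left(NODES, ident % ID_SPACE)
    return NODES[i % len(NODES)]


def next_node(node):
    """The node that follows `node` on the ring."""
    return successor(node + 1)


def linear_lookup(start, key):
    """Follow successor pointers until the node responsible for `key` is reached.

    Returns (responsible node, number of hops).
    """
    target = successor(key)
    node, hops = start, 0
    while node != target:
        node = next_node(node)
        hops += 1
    return node, hops


print(successor(24))         # node that becomes responsible for doc 24
print(linear_lookup(8, 54))  # lookup for key 54, assumed to start at node 8
```

Because every hop only advances to the next node on the ring, the hop count grows with the number of nodes between the requester and the responsible node, which is the O(N) behaviour the slide points out.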
7 Performance and scalability: fingers (part 1)
[Figure: the ring of nodes N1, N8, N14, N21, N32, N38, N42, N48, N51 and N56]

N14’s finger table
i   candidate         node
0   14 + 2^0 = 15     21
1   14 + 2^1 = 16     21
2   14 + 2^2 = 18     21
3   14 + 2^3 = 22     32
4   14 + 2^4 = 30     32
5   14 + 2^5 = 46     48

Solution: enhance the basic idea with data structures that provide shorter paths to the desired goal. Add information about the network to each node, so that queries are answered with fewer intermediate steps. Ideally, if every node knew about the whole network, all queries would be answered with constant cost. But for big networks this would mean massive amounts of information in each node. What Chord proposes is that, for a network with maximum size (that’s the identifier space) N = 2^6 = 64 nodes, each node should know about 6 other specific nodes in order to provide good performance; the pointers to these other nodes are called fingers. Here’s an example of the fingers of node 14. In this network example, node 14 would know about nodes 21, 32 and 48, and that’s all that node 14 needs to know for the network to behave the way we expect it to.

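As a rough check of the finger table on this slide, the candidates and the nodes they resolve to can be recomputed. The sketch assumes the exponent i runs from 0 to 5 and that each finger is the successor of 14 + 2^i taken modulo 64; the helper names are made up for the example.

```python
from bisect import bisect_left

M = 6  # identifier bits, so the identifier space is 0..63
NODES = sorted([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])


def successor(ident):
    """First node whose identifier is >= ident, wrapping around the ring."""
    i = bisect_left(NODES, ident % (2 ** M))
    return NODES[i % len(NODES)]


def finger_table(n):
    """Rows (i, candidate, node) with candidate = (n + 2^i) mod 2^M."""
    rows = []
    for i in range(M):
        candidate = (n + 2 ** i) % (2 ** M)
        rows.append((i, candidate, successor(candidate)))
    return rows


for row in finger_table(14):
    print(row)

# The distinct nodes that node 14 ends up knowing about through its fingers.
print(sorted({node for _, _, node in finger_table(14)}))
```

Several candidates resolve to the same node, which is why six fingers yield only three distinct neighbours for node 14 in this small ring.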
8 Performance and scalability: fingers (2) (part 1)
[Figure: the ring of nodes N1, N8, N14, N21, N32, N38, N42, N48, N51 and N56 holding doc 10, doc 24, doc 30, doc 38 and doc 54, with LOOKUP(54) routed through fingers]

For a lookup request in a network similar to the one in the previous example, this new information lets lookups take shorter paths, measured in the number of intermediate hops. The path length of this very lookup was 8 in the previous example; now it’s 3. This may not look like a dramatic improvement, but the gain is fully noticed in big networks, because the path length of lookup requests is expected to grow only logarithmically with the network size. Insertion of a node in the network is based on the lookup operation, so it costs about the same.

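A finger-based lookup on the example ring can be sketched as follows. This is a simplified, centralized illustration, not the simulator’s code: the function names are hypothetical, fingers are recomputed on the fly instead of being stored per node, and the lookup is assumed to be issued at node 8.

```python
from bisect import bisect_left

M = 6
SPACE = 2 ** M
NODES = sorted([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])


def successor(ident):
    """First node whose identifier is >= ident, wrapping around the ring."""
    i = bisect_left(NODES, ident % SPACE)
    return NODES[i % len(NODES)]


def in_open(x, a, b):
    """True if x lies strictly between a and b, going clockwise from a."""
    return 0 < (x - a) % SPACE < (b - a) % SPACE


def in_half_open(x, a, b):
    """True if x lies in the interval (a, b], going clockwise from a."""
    return 0 < (x - a) % SPACE <= (b - a) % SPACE


def fingers(n):
    """Finger i of node n is the successor of n + 2^i."""
    return [successor(n + 2 ** i) for i in range(M)]


def finger_lookup(start, key):
    """Route towards `key`; returns (responsible node, number of hops)."""
    node, hops = start, 0
    while True:
        succ = successor(node + 1)
        if in_half_open(key, node, succ):
            return succ, hops + 1  # the last hop reaches the responsible node
        nxt = succ  # fall back to the plain successor
        for f in reversed(fingers(node)):
            if in_open(f, node, key):  # farthest finger that does not pass the key
                nxt = f
                break
        node, hops = nxt, hops + 1


print(finger_lookup(8, 54))
```

Each hop jumps to the farthest known node that still precedes the key, so the remaining distance shrinks quickly instead of one node at a time.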
9 Robustness: successors list and referrers list (part 1)
When a node fails, the network reorganizes itself
Further data structures are needed
The Chord ring holds its properties
[Figure: the ring of nodes N1, N8, N14, N21, N32, N38, N42, N48, N51 and N56 holding doc 10, doc 24, doc 30, doc 38 and doc 54, with node N32 failing]

The second issue to take into consideration is robustness: what happens when nodes fail or leave the network without notice? Chord reacts to those events by reorganizing successor and predecessor pointers as soon as the fault is found. A set of routines is called periodically to check whether the predecessor is alive, and to correct it if necessary. This works both for finding that the predecessor has failed and for finding that a new node has been inserted as our predecessor. Incidentally, fingers are also kept up to date by means of a routine that is called periodically. This is an example on the same network that we have seen so far. Every node knows about its successor and predecessor as well as some fingers. Here we have highlighted the successor and predecessor pointers going from and to node 32. If node 32 fails, or leaves the network without telling anyone else, 21 points to a successor that no longer exists. The same holds for the predecessor of 38. There are some data structures in the nodes that make it possible for node 21 to update its successor so that it points to 38, and to let 38 know that its current predecessor is 21. One of these data structures is the successors list. In general terms, a voluntary leave operation of a node doesn’t differ much from a node dropping out of the network, except that the update is made immediately. An additional data structure that corrects fingers has been added to the basic Chord node to improve the performance.

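How a successors list lets node 21 recover when node 32 disappears can be illustrated with a small sketch. The list length r = 3 and the function names are assumptions chosen for the example; the real protocol discovers failures through timeouts rather than by consulting a set of live nodes.

```python
from bisect import bisect_right

NODES = [1, 8, 14, 21, 32, 38, 42, 48, 51, 56]


def successor_list(node, ring, r):
    """The r nodes that follow `node` clockwise on the sorted ring."""
    i = bisect_right(ring, node)
    return [ring[(i + k) % len(ring)] for k in range(r)]


def first_live_successor(succ_list, alive):
    """Skip entries that have failed; None means every entry is gone."""
    for s in succ_list:
        if s in alive:
            return s
    return None


succs_of_21 = successor_list(21, NODES, r=3)
alive = set(NODES) - {32}  # node 32 fails without notice

print(succs_of_21)
print(first_live_successor(succs_of_21, alive))
```

With the list in hand, node 21 can move on to the next live entry and notify it, instead of being cut off from the rest of the ring.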
10 Objectives and methodology (part 2)
Goal: case study of Chord
Methodology: simulations that test scalability, robustness and reliability
Development: simulator and changes to a traffic generator
Design of test cases and run of the simulations

Now, after this introduction to how Chord works, I am going to describe the work that has been done to assess Chord’s behaviour under a set of scenarios. I have programmed a simulator that helps in studying a broad range of P2P systems. I have used and changed a tool that generates traffic to feed the simulator, and I have designed a set of test cases to be run in the simulator in order to study Chord’s scalability, robustness and reliability.

11 The traffic generator (part 2)
It’s a tool that generates network activity: it inserts nodes into the network, makes nodes leave, and makes nodes request lookups
Small changes were made to the way the generator works

The traffic generator is a tool that writes network activity to a text file, which later has to be parsed by the simulator. It inserts nodes in the network, makes nodes leave and, most importantly, makes nodes request lookups. The output list of network activity depends on the parameters fed to it through an input text file. I needed to change the traffic generator slightly to suit my purposes, because the identifier range of the nodes was variable.

12 The simulator (part 2)
Programmed in Java
Internals: loop traversing all nodes at each time unit to execute events
Messages are sent from node to node
Adjustable verbosity
Extendable properties
Customizable options (with XML)

The simulator was originally programmed in Oz by Peep Kungas at SICS, and Sameh El-Ansary helped me change it. Later I reprogrammed it in Java. The basic idea is that there is a clock ticking, and at each time unit a loop checks all nodes to see whether they have pending events to execute at that time; messages are sent from node to node to trigger events. The simulator is very customizable and generic, it is easy to extend, and I prepared it so that the log verbosity is easy to change.

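The clock-driven loop described here might look roughly like the sketch below. It is written in Python for brevity although the thesis simulator is in Java, and everything in it (the class, the method names, the ping/pong messages) is a hypothetical stand-in rather than the simulator’s actual design.

```python
from collections import defaultdict


class Node:
    def __init__(self, ident):
        self.ident = ident
        self.pending = defaultdict(list)  # time unit -> messages due at that time
        self.log = []

    def deliver(self, time, message):
        """Schedule a message to be handled by this node at the given time unit."""
        self.pending[time].append(message)

    def run_events(self, now, network):
        """Handle every message that is due at time `now`."""
        for message in self.pending.pop(now, []):
            self.log.append((now, message))
            kind, sender = message
            # Hypothetical rule: a "ping" is answered with a "pong" one time unit later.
            if kind == "ping":
                network[sender].deliver(now + 1, ("pong", self.ident))


def simulate(network, ticks):
    """At each time unit, visit all nodes and let them execute their pending events."""
    for now in range(ticks):
        for node in network.values():
            node.run_events(now, network)


network = {ident: Node(ident) for ident in (8, 14)}
network[14].deliver(0, ("ping", 8))
simulate(network, ticks=3)
print(network[14].log, network[8].log)
```

Scheduling replies for a later time unit keeps the outcome independent of the order in which nodes are visited within one tick.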
13 The experiments (part 3)
Changes in the network size: is Chord scalable? Network growth
Massive simultaneous node failures: is it robust? Make nodes drop at once
Constant node joins and departures: test reliability. Are requests correctly replied?

With this, we can now explain the set of experiments that were run to collect the data. There were three sets of experiments, each of which tests a different characteristic of Chord’s behaviour. The first set, “changes in the network size”, tests Chord’s scalability in networks of growing size. The second set, “massive simultaneous node failures”, is designed to give insight into Chord’s robustness by making a randomly chosen subset of nodes drop at once. And the last set, “constant node joins and departures”, checks the protocol’s reliability.

14 Changes in the network size (1) (part 3)
One common changing value: networks with size N = 2^K, K = 3..14
N = 8, 16, 32, …, 8192 nodes
Four subsets of experiments, combinations of two variables: with/without successors list, constant/proportional identifier space
Path length is logged

Let’s look a little deeper into the first set of experiments. This is the layout: there are 12 network sizes in total, each 2 to the power of k, with k in the range 3 to 14. That means one network of 8 nodes, another of 16 nodes, another of 32, and so on. For each of these network sizes, four experiments were run, corresponding to the combinations of two factors: whether or not the successors list is used, and whether the identifier space is constant across all experiments or proportional to the network size. The performance parameter that is logged to draw conclusions is the path length.

15 Scalability Data Analysis (Changes in network size) (part 3)

This is the Probability Density Function (PDF) of path lengths for one of the experiments, with 8192 nodes. It has a typical bell shape, with the maximum near the average. This means that, of the 10,000 lookups, most were replied with a path length of around 7, and very few were replied with lengths near 0 or near 14. Now, to introduce the next slide, I want to compare many experiments at once. Imagine that for each of the experiments I chop 1% of the values off the left and 1% off the right, then flatten the bell shape, mark the average, and place the result in another plot...

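The reduction of each distribution to a bar with a marked average can be sketched as below. The sample is synthetic and only roughly bell-shaped (it is not thesis data), the function name is made up, and computing the average over all values rather than over the trimmed ones is an assumption, since the talk does not say which was used.

```python
import random
import statistics


def summarize(path_lengths):
    """Return (low, mean, high) after chopping 1% of the values off each side."""
    ordered = sorted(path_lengths)
    n = len(ordered)
    cut = n // 100                 # number of values dropped on each side
    low = ordered[cut]             # smallest value that survives the cut
    high = ordered[n - 1 - cut]    # largest value that survives the cut
    return low, statistics.mean(ordered), high


# Synthetic, bell-shaped sample of 10,000 path lengths between 0 and 13.
rng = random.Random(1)
sample = [sum(rng.random() < 0.5 for _ in range(13)) for _ in range(10_000)]

print(summarize(sample))
```

One such triple per experiment is enough to draw a vertical bar from `low` to `high` with a mark at the mean, which allows many experiments to share one plot.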
16 Scalability Data Analysis (Changes in network size) (part 3)

These plots are interpreted as follows: the x axis is the size of the network, and it’s a logarithmic axis. The y axis is the path length, and it’s linear, from 0 to 16. You can see that for exponential growth of the network size, the path lengths of lookups grow only linearly. Another way to say the same thing is that path length grows logarithmically with the network size. If I make a network twice as big, the path length of lookup requests grows only logarithmically, that is, much less. An example: for a network with 2 to the power of 6 nodes, that is 64 nodes, the average path length is about 2. If I take a network of 2 to the power of 13, that’s 8192 nodes (more than 100 times bigger), the path length is on average around 4.5. Less than 3 times bigger.

17 Scalability Data Analysis (Changes in network size) (part 3)

The most interesting way to interpret the results of this set of experiments is to compare each of the subsets with the others. Blue plots represent experiments in which the successors list was used as a forwarding data structure; red ones did not use this data structure to look for information that might provide better paths to the result. The upper plots are experiments in which the maximum ID that a node or item can have depends on the designer’s prior knowledge of the size of the network. Following the instructions of the paper, this maximum ID is at least about 100 times the size of the network. The lower plots correspond to networks in which the IDs go from 0 to 2 to the power of 21, a little more than 2 million, although the biggest network is of size 2 to the power of 14, that is, 16,384 nodes.

Now, the interesting part regarding the four subsets of experiments: how do the networks compare in pairs? Let’s first compare the networks that have in common the fact that the identifier space is proportional. They differ in that the red plot corresponds to experiments in which the successors list was not used for forwarding purposes. Note that the average path length is slightly higher when this data structure is not used. What about the same comparison, but with a common identifier space for all the experiments? Again, the average path length is slightly bigger when the successors list is not used.

Now, let’s see what happens when we compare subsets of experiments that make the same use of the successors list. First of all, the blue plots, in which the successors list is used. What we have here is that, regardless of whether the identifier space is proportional or identical for all the experiments, the network behaves in a very similar way, with just a few differences here and there that are not very significant. The last comparison, the red plots, follows the same idea of looking at the impact of the identifier space, but now in the experiments that didn’t use the successors list as a forwarding improvement. Here we can see that the experiments were identical. So... what conclusions can be drawn from this study?

18 Scalability Data Analysis (Changes in network size) (part 3)
The successors list data structure provides a slight improvement in query path length
Whether the identifier space is constant or proportional to the size of the network does not affect the results, although each node has more absolute information (but not relative)

Well, first, that the successors list data structure provides only a slight improvement in query path length. And second, that whether the identifier space is constant or proportional to the size of the network does not affect the path length measurements, although each node has more absolute information (but not relative).

19 Scalability Data Analysis (Changes in network size) (part 3)

The last result I want to show about this set of experiments is a comparison of the data I gathered with a figure in the paper from which I took most of the information to complete my evaluation. The upper plot comes from that paper, and the lower one corresponds to the two equivalent graphs from my data. I just wanted to check that the two sets of data are reasonably similar. The only difference at first sight is that the x axis of the figure on the left is expressed in powers of 10, and mine in powers of 2.

20 Changes in the network size (2) (part 3)
One common changing value: networks with size N = 2^K, K = 3..12
N = 8, 16, 32, …, 2048, 4096 nodes
Three subsets of experiments, in which each node issues 10, 20 and 25 document search requests respectively
Path length and work load are logged

21 Scalability Data Analysis (Process load per node) (part 3)

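22 Scalability Data Analysis (Process load per node) (part 3)
#lookups = 10; cells marked "-" carry no value in the transcript

k    N=2^k   Lpath   #Calls    L·#lookups   (L+1)·#lookups
3    8       1.01    20.5      10.1         20.1
4    16      1.21    21.75     12.1         22.1
5    32      1.70    -         17.0         27
6    64      2.08    -         20.8         30.8
7    128     2.46    35.125    24.6         34.6
8    256     2.92    -         29.2         39.2
9    512     3.34    -         33.4         43.4
10   1024    3.79    -         37.9         47.9
11   2048    4.23    -         42.3         52.3
12   4096    4.67    -         46.7         56.7
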
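The two rightmost columns of this table appear to be simple functions of the average path length: L times the number of lookups per node, and (L + 1) times the same number. A short sketch, assuming 10 lookups per node and using a made-up function name, reproduces them from the Lpath column.

```python
LOOKUPS_PER_NODE = 10  # assumed from the "10" in the table header

# Average path lengths copied from the Lpath column, keyed by network size.
AVG_PATH = {
    8: 1.01, 16: 1.21, 32: 1.70, 64: 2.08, 128: 2.46,
    256: 2.92, 512: 3.34, 1024: 3.79, 2048: 4.23, 4096: 4.67,
}


def estimates(avg_path, lookups=LOOKUPS_PER_NODE):
    """Return (L * lookups, (L + 1) * lookups), rounded to one decimal place."""
    return round(avg_path * lookups, 1), round((avg_path + 1) * lookups, 1)


for size, avg in AVG_PATH.items():
    print(size, *estimates(avg))
```

The measured #Calls values that survive in the transcript (20.5, 21.75 and 35.125) can then be set against these two estimates row by row.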
23 Massive simultaneous node failures (part 3)
1,000 nodes
0, 10, 20, 30, 40 and 50% of the nodes are made to disappear at once, respectively, in each of the 6 experiments
Successors list is of size r = 2·log_2 N
Path lengths, timeouts and reply successes are logged

So, the second set of experiments: 6 runs of the simulator with networks that initially have 1,000 nodes. In each of them, at a certain point in time, a subset of randomly chosen nodes is made to fail all at once. From that point on, 10,000 lookups are requested, and while the network reorganizes itself, the path lengths of those lookups are logged, as well as the number of timeouts (messages that don’t reach their destination and have to be resent) and the correctness of the reply. Note that the length of the successors list is about double the size of the fingers table. That value is extensively justified in the report.

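For a sense of scale, the successors list size r = 2·log_2 N can be evaluated for N = 1,000, together with the chance that every entry of such a list fails at once. The second function rests on the assumption that nodes fail independently with the same probability; that is one common line of reasoning and not necessarily the justification given in the report. Rounding r to the nearest integer is also an assumption.

```python
import math


def successor_list_size(n_nodes):
    """r = 2 * log2(N), rounded to the nearest integer (rounding is an assumption)."""
    return round(2 * math.log2(n_nodes))


def prob_all_successors_fail(p_fail, r):
    """Probability that all r entries fail, assuming independent failures."""
    return p_fail ** r


r = successor_list_size(1000)
print(r)
for fraction in (0.1, 0.2, 0.3, 0.4, 0.5):
    print(fraction, prob_all_successors_fail(fraction, r))
```

Under that independence assumption, even when half of the nodes fail, a node is very unlikely to lose its whole successors list.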
24 Robustness Data Analysis (Massive node failures) (part 3)

What happens, in general terms, when a bunch of nodes fail at once? If we compare the Probability Density Function of path lengths in a network in which there are no failures with the one in a network in which 30% of the nodes have failed, it can be seen that the right tail of the bell is elongated. There’s a set of requests that have taken considerably more hops to reach a reply. That’s because there was some incorrect data in many nodes of the network, and a retry was needed whenever a lookup tried to contact a node that was no longer in the network. Several retries result in a longer path length.

25 Robustness Data Analysis (Massive node failures) (part 3)
When a node fails and it’s in the path of a lookup, a timeout is counted, and the lookup is retried
Differences are minor
No undershooting, there’s always a reply
[Table: Path lengths and timeouts (experiments)]
[Table: Path lengths and timeouts (paper data)]

Let’s see the data of my experiments compared to that provided by the paper mentioned previously. We can see that the average path length is more or less the same in my data as in the data found in the paper. Slightly worse, if anything, in the 99th percentiles. Regarding the number of timeouts, my data shows fewer timeouts, both on average and in the percentiles, but again, not dramatically. An important fact to notice is that all the requests received a reply. In a network that has lost up to 50% of its nodes at once, that is a remarkable feature, as is maintaining the performance without flooding the network.

26 Constant node joins and departures (part 3)
1,000 nodes, size is kept “constant”
Nodes continuously leave and join
frequencies f = 1/(0.5i) per second, i = 1..8
rates R = 0.05 to 0.40 in steps of 0.05
Successors list is of size r = 2·log_2 N
Referrers list (fingers that point to the node)
Path lengths, timeouts, reply successes and accuracy are logged

The last set of experiments tests the network’s reliability under changing conditions. The size of a network that initially has 1,000 nodes is kept constant while nodes leave and join the network at a rate that changes from one experiment to the next. Another data structure that plays an important role in these experiments is the referrers list.
R = 0.05 means a node joining and a node leaving every 20 seconds on average
R = 0.40 means a node joining and a node leaving every 2.5 seconds on average

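Taking the spoken notes at face value, a rate of R events per second corresponds to one event every 1/R seconds on average. The sketch below tabulates that for the eight rates and also draws a hypothetical schedule of event times; the exponential inter-arrival times are an assumption made for illustration and not a description of how the traffic generator works.

```python
import random


def mean_interval(rate):
    """Average number of seconds between events for a rate in events per second."""
    return 1.0 / rate


def churn_times(rate, duration, seed=0):
    """Hypothetical event times in [0, duration), with exponential gaps between them."""
    rng = random.Random(seed)
    t, times = 0.0, []
    while True:
        t += rng.expovariate(rate)
        if t >= duration:
            return times
        times.append(t)


for i in range(1, 9):
    rate = 0.05 * i
    print(f"R={rate:.2f}/s -> one event every {mean_interval(rate):.2f} s on average")

print(len(churn_times(0.40, duration=100)))
```

The first and last rows of the printed table correspond to the 20-second and 2.5-second intervals quoted in the notes.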
27 Reliability Data Analysis (Constant network changes) (part 3)
[Table: Data extracted from the experiments regarding reliability]
[Table: Data extracted from the paper]

Again, the results for these experiments are shown in two figures: the upper one is my data, and the lower one is the data from the paper. Again, it can be seen that the path length data is similar, if only slightly better in my experiments, and the number of timeouts is definitely better in my data. The reason is that I use the referrers list, which is not used in the experiments from which the paper’s data came. The number of lookup failures is of the same order of magnitude, but lower too, which I also think is due to the use of two data structures holding network information rather than one. In my experiments I also logged a last column of data with information about how many requests were not replied. As can be seen, only 0.01% of the lookup requests didn’t receive a reply in a highly changing environment.

28 Reliability Data Analysis (Constant network changes) (part 3)
The traffic generator adds a source of error
It’s difficult to design the experiments with the way the traffic generator provides output
The referrers list provides better behaviour of the network under changing conditions
Chord behaves well in these scenarios

Anyway, it needs to be remarked that the traffic generator proved to be a source of error for this set of experiments. Furthermore, it’s difficult to design the experiments given the way the traffic generator provides its output. As we saw in the previous tables, the referrers list provides better behaviour of the network under changing conditions. As a general conclusion, we can say that Chord behaves well in these changing scenarios.

29 Conclusions (part 4)
Chord is a simple, and yet robust, reliable and scalable P2P protocol
Chord might not be the best solution for typical P2P uses
The simulator is a useful, easy-to-use, highly customizable tool
The traffic generator can be improved; some outputs are not accurate

Finally: the conclusions of the thesis. (read the points)

30 Conclusions (2) (part 4)
A few points of the paper were not easy to interpret
The results of the experiments reasonably match previous research
Some experiments were difficult to reproduce

31 Future work (part 4)
Broaden the set of uses to which P2P technology can be put
Improve scalability and efficiency, offering O(log_K N) instead of O(log_2 N)
Devise more experiments to get a deeper understanding of the protocol
Further development of the simulator (GUI? Interactivity?)

After the conclusions, I’d like to enumerate a few points of the thesis that are open to future work. P2P is being used nowadays for VoIP purposes. DKS is offering this kind of performance. A GUI and a means to set breakpoints in the simulation, to go through it step by step, would be very desirable additions.

32 Questions? (part 4)
Okay, then… Thank you for coming!

Well, thank you all for coming, and this is the moment that I’ve been fearing lately... Questions?