UNIVERSITY OF JYVÄSKYLÄ Building NeuroSearch – Intelligent Evolutionary Search Algorithm For Peer-to-Peer Environment Master’s Thesis by Joni Töyrylä 3.9.2004.

Mikko Vapa, researcher student
InBCT 3.2 Cheese Factory / P2P Communication, Agora Center
http://tisu.it.jyu.fi/cheesefactory


Resource Discovery Problem
In the peer-to-peer (P2P) resource discovery problem, a P2P node decides, based on local knowledge, which neighbors (if any) are the best targets for forwarding a query in order to find the needed resource. A good solution locates the predetermined number of resources using a minimal number of packets.

NeuroSearch
The NeuroSearch resource discovery algorithm uses neural networks and evolution to adapt its behavior to a given environment:
–a neural network for deciding whether to pass the query further down the link or not
–evolution for breeding and finding the best neural network in a large class of local search algorithms
[Figure labels: Query, Forward the query, Neighbor Node]

NeuroSearch’s Inputs
[Figure: the internal structure of the NeuroSearch algorithm]
Multiple layers enable the algorithm to express non-linear behavior. With enough neurons the algorithm can universally approximate any decision function.

NeuroSearch’s Inputs
–Bias is always 1 and provides a means for a neuron to produce non-zero output with zero inputs
–Hops is the number of links the message has traversed so far
–Neighbors (also known as currentNeighbors or MyNeighbors) is the number of neighbor nodes this node has
–Target’s neighbors (also known as toNeighbors) is the number of neighbor nodes the message’s target has
–Neighbor rank (also known as NeighborsOrder) tells how the target’s neighbor count ranks relative to the current node’s other neighbors
–Sent is a flag telling whether this message has already been forwarded to the target node by this node
–Received (also known as currentVisited) is a flag describing whether the current node has received this message earlier
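The seven inputs above can be fed to a small feed-forward network that answers "forward or not" for one neighbor. The following is only a minimal sketch under stated assumptions: the tanh activation, the unscaled inputs, the zero threshold on the output, and all function names are illustrative choices, not details given in the slides. The 25:10 hidden-layer sizes are taken from the brain-size results later in the presentation.

```python
import math
import random


def make_network(layer_sizes, rng):
    """Create random weights: one list of neurons per layer, one weight per input."""
    return [[[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
            for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]


def forward(network, inputs):
    """Propagate inputs through every layer; tanh is an assumed activation."""
    values = list(inputs)
    for layer in network:
        values = [math.tanh(sum(w * v for w, v in zip(neuron, values)))
                  for neuron in layer]
    return values[0]  # single output neuron


def should_forward(network, hops, neighbors, target_neighbors,
                   neighbor_rank, sent, received):
    """Build the 7-element input vector (bias first) and threshold the output."""
    inputs = [1.0, hops, neighbors, target_neighbors, neighbor_rank,
              1.0 if sent else 0.0, 1.0 if received else 0.0]
    return forward(network, inputs) > 0.0  # assumed threshold at zero


rng = random.Random(0)
net = make_network([7, 25, 10, 1], rng)  # 7 inputs, 25:10 hidden neurons, 1 output
decision = should_forward(net, hops=2, neighbors=5, target_neighbors=10,
                          neighbor_rank=1.0, sent=False, received=False)
```

With random weights the decision is arbitrary; the training program is what makes it useful.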

NeuroSearch’s Training Program
The neural network weights define how the neural network behaves, so they must be adjusted to the right values. This is done using an iterative optimization process based on evolution and Gaussian mutation:
–Define the network conditions
–Define the quality requirements for the algorithm
–Create candidate algorithms randomly
–Select the best ones for the next generation
–Breed a new population
–Iterate for thousands of generations
–Finally, select the best algorithm for these conditions
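The training steps above can be sketched as a simple evolutionary loop. This is an illustration under assumptions, not the thesis implementation: keeping the better half of the population as parents, the mutation width `sigma`, and the stand-in `toy_fitness` function are all hypothetical. Only the population size of 24 and the use of Gaussian mutation come from the slides.

```python
import random


def evolve(fitness, n_weights, population_size=24, generations=200,
           sigma=0.1, seed=0):
    """Evolve weight vectors with truncation selection and Gaussian mutation."""
    rng = random.Random(seed)
    # Create candidate algorithms (weight vectors) randomly.
    population = [[rng.gauss(0.0, 1.0) for _ in range(n_weights)]
                  for _ in range(population_size)]
    history = []
    for _ in range(generations):
        # Select the best ones for the next generation (assumed: best half).
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:population_size // 2]
        history.append(fitness(parents[0]))
        # Breed a new population: each parent gets one Gaussian-mutated child.
        children = [[w + rng.gauss(0.0, sigma) for w in parent]
                    for parent in parents]
        population = parents + children
    # Finally select the best candidate found.
    best = max(population, key=fitness)
    return best, history


def toy_fitness(weights):
    """Hypothetical stand-in for running queries in a simulated P2P network."""
    return -sum((w - 0.5) ** 2 for w in weights)


best, history = evolve(toy_fitness, n_weights=5)
```

Because the parents survive unchanged, the best fitness recorded in `history` never decreases from one generation to the next. In the real setting the fitness call would be far more expensive, since it requires simulating queries.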

Research Environment
The peer-to-peer network being tested contained:
–100 power-law distributed P2P nodes with 394 links and 788 resources
–Resources were distributed based on the number of connections a node has, meaning that high-connectivity nodes were more likely to answer queries
–Topology was static, so nodes were not disappearing or moving
–Querier and the queried resource were selected randomly, and 10 different queries were used in each generation (this was found to be enough to determine the overall performance of the neural network)
Requirements for the fitness function were:
–The algorithm should locate half of the available resources for every query (each obtained resource increased fitness by 50 points)
–The algorithm should use as few packets as possible (each used packet decreased fitness by 1 point)
–The algorithm should always stop (the stop limit for the number of packets was set to 300)


Research Cases - Fitness
The fitness value determines how good a neural network is compared to others. Even the smallest and simplest neural networks manage to reach a fitness value over 10000. For a poor NeuroSearch the fitness value is calculated as follows:
Fitness = 50 * replies - packets = 50 * 239 - 1290 = 10660
Note: Because of a bug, the Steiner tree does not locate half of the replies and thus gets a lower fitness than HDS.
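The fitness formula above is easy to check in code. The sketch below reproduces the slide's worked example. The helper `generation_fitness`, which sums the per-query results of one generation, is an assumption about how the per-query values are combined, and its sample numbers are made up for illustration.

```python
REPLY_REWARD = 50   # points gained per obtained resource (from the slides)
PACKET_COST = 1     # points lost per used packet (from the slides)


def fitness(replies, packets):
    """Fitness = 50 * replies - packets."""
    return REPLY_REWARD * replies - PACKET_COST * packets


def generation_fitness(query_results):
    """Assumed aggregation: sum fitness over (replies, packets) pairs."""
    return sum(fitness(replies, packets) for replies, packets in query_results)


print(fitness(239, 1290))                          # 10660, as on the slide
print(generation_fitness([(24, 130), (23, 128)]))  # 2092 for these made-up queries
```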

Research Cases – Random Weights
10 million new neural networks were randomly generated. It seems that fitness values over 16000 cannot be obtained purely by guessing, and therefore an optimization method is needed.

Research Cases - Inputs
Different inputs were tested individually and together to get a feeling for which inputs are important. Using Hops we can, for example, design rules such as:
“I have travelled 4 hops, I will not send further”

“Target node contains 10 neighbors, I will send further”
“Target node contains the most neighbors compared to all my neighbors, I will not send further”

“I have received this query earlier, I will not send further”
“I have 7 neighbors, I will send further”
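The single-input rules quoted on these slides can be written as plain predicates, which gives a feel for the kind of behavior one input can express. This is a hand-written sketch: the function names are hypothetical, and reading "contains 10 neighbors" as "at least 10" is an interpretation. In NeuroSearch such rules are not coded by hand; they would have to emerge from the evolved weights.

```python
def hop_limit_rule(hops, limit=4):
    """'I have travelled 4 hops, I will not send further.'"""
    return hops < limit


def target_degree_rule(target_neighbors, threshold=10):
    """'Target node contains 10 neighbors, I will send further.'
    Read here as: forward when the target has at least `threshold` neighbors."""
    return target_neighbors >= threshold


def received_rule(received):
    """'I have received this query earlier, I will not send further.'"""
    return not received


print(hop_limit_rule(3), hop_limit_rule(4))            # True False
print(target_degree_rule(10), target_degree_rule(2))   # True False
print(received_rule(True))                             # False
```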

The results indicate that using only one piece of topological information is more efficient than combining it with other topological information (the explanation for this behavior is still unclear).

The results also indicate that using only one piece of query-related information is more efficient than combining it with other query-related information (the explanation for this behavior is also unclear).

Research Cases - Resources
The needed percentage of resources was varied, and the results were compared to other local search algorithms (Highest Degree Search and Breadth-First Search) and to near-optimal search trees (Steiner).
Note: The Breadth-First Search curve needs to be halved because the percentage was calculated relative to half of the resources and not all available resources.
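For reference, the Breadth-First Search baseline can be sketched as hop-limited flooding that counts packets and replies. This is a minimal sketch under assumptions: the tiny example graph, treating each resource-holding node as one reply, and not sending a packet back to the node the query came from are all illustrative choices rather than details from the slides.

```python
from collections import deque


def breadth_first_search(graph, resource_nodes, start, max_hops):
    """Flood a query from `start`; return (replies, packets)."""
    visited = {start}
    queue = deque([(start, None, 0)])  # (node, node it came from, hops so far)
    replies = 0
    packets = 0
    while queue:
        node, parent, hops = queue.popleft()
        if hops == max_hops:
            continue  # hop limit reached, do not forward further
        for neighbor in graph[node]:
            if neighbor == parent:
                continue  # assumed: never send the query back to the sender
            packets += 1  # every forwarded copy costs one packet
            if neighbor not in visited:
                visited.add(neighbor)
                if neighbor in resource_nodes:
                    replies += 1
                queue.append((neighbor, node, hops + 1))
    return replies, packets


# Hypothetical 4-node network with resources on nodes "c" and "d".
graph = {"a": ["b", "c"], "b": ["a", "c", "d"], "c": ["a", "b"], "d": ["b"]}
replies, packets = breadth_first_search(graph, {"c", "d"}, "a", max_hops=2)
print(replies, packets)  # 2 5
```

Flooding finds both resources here but spends packets on already-visited nodes, which is the cost a learned forwarding rule tries to avoid.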

Research Cases - Queriers
The effect of lowering the number of queriers per generation used to calculate the fitness value of a neural network was examined. It was found that the number of queriers can be dropped from 50 to 10 while still giving reliable fitness values. This speeds up the optimization process significantly.

Research Cases – Brain Size
The number of neurons on the first and second hidden layers was varied. It was found that many different kinds of NeuroSearch algorithms exist.

Also, optimization of larger neural networks takes more time.

There is also an interesting breadth-first search vs. depth-first search dilemma, where:
–smaller networks obtain the best fitness values with a breadth-first search strategy,
–medium-sized networks obtain the best fitness values with a depth-first search strategy and
–large networks obtain the best fitness values with a breadth-first search strategy
Overall, it seems that the best fitness, 18091.0, can be obtained with a breadth-first strategy using 5 hops and a network size of 25:10 (25 neurons on the first hidden layer and 10 on the second hidden layer).

25:10 had the greatest fitness value. Would more than 100,000 generations increase the fitness when the 1st hidden layer contains more than 25 neurons?
20:10 had the greatest average hops value. What happens if the number of neurons on the 2nd hidden layer is increased? Will the average number of hops decrease?

Summary and Future
The main findings of the thesis were that:
–A population size of 24 and a query amount of 10 are sufficient
–An optimization algorithm needs to be used, because randomly guessing neural network weights does not give good results
–Individual inputs give better results than a combination of two inputs (however, the best fitness values can be obtained by using all 7 inputs)
–By choosing a specific set of inputs NeuroSearch may imitate any existing search algorithm, or it may behave as a combination of any of those
–The optimal algorithm (Steiner) has an efficiency of 99%, whereas the best known local search algorithm (HDS) achieves 33% and NeuroSearch 25%
–A breadth-first search vs. depth-first search dilemma exists, but no good explanation can be given yet

In addition to the problems shown so far, the following is suggested for the future work on NeuroSearch:
–More inputs could be designed that provide useful information, e.g. the number of received replies, the inputs used by the Highest-Degree Search algorithm, and inputs that define how many forwarding decisions have already been made in the current decision round and how many are still left
–A probability-based output could be tested instead of the threshold function
–The neural network architecture and the size of the population could be adjusted dynamically during evolution to find an optimal structure more easily
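The difference between the threshold output and the suggested probability-based output can be illustrated as follows. This is only one possible reading of the suggestion: mapping the output activation to a forwarding probability with a logistic function, the clamping range, and the function names are all assumptions.

```python
import math
import random


def threshold_decision(activation):
    """Deterministic output: forward exactly when the activation is positive."""
    return activation > 0.0


def probabilistic_decision(activation, rng):
    """Assumed alternative: forward with probability sigmoid(activation)."""
    clamped = max(-60.0, min(60.0, activation))  # avoid overflow in exp
    probability = 1.0 / (1.0 + math.exp(-clamped))
    return rng.random() < probability


rng = random.Random(0)
samples = [probabilistic_decision(0.0, rng) for _ in range(10000)]
forward_fraction = sum(samples) / len(samples)  # close to 0.5 at activation 0
```

With the probabilistic output, a node whose activation is near zero forwards about half the time instead of flipping abruptly between always and never.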
