Multi-index Evaluation Algorithm Based on Locally Linear Embedding for the Node importance in Complex Networks Fang Hu

1 Multi-index Evaluation Algorithm Based on Locally Linear Embedding for the Node importance in Complex Networks Fang Hu Email: naomifang@mails.ccnu.edu.cn Fang Hu, Yuhua Liu, Jianzhi Jin Website: www.ccnulyh.com Department of Computer Science, Central China Normal University, Wuhan 430079, Hubei, China

2 Outline: Introduction; LLE introduction and index concept definitions; Multi-index evaluation algorithm based on LLE for the node importance; Simulation and Analysis; Conclusions 2/39

3 Introduction The single-index analysis method evaluates node importance in complex networks by analyzing one characteristic index of the nodes, such as degree centrality, betweenness centrality, closeness centrality, eigenvector centrality, or mutual-information. Researchers have also proposed many multi-index evaluation algorithms to identify important nodes in complex networks. 3/39

4 LLE Introduction (1/5) Scientists interested in exploratory analysis or visualization of multivariate data face the problem of dimensionality reduction. Locally linear embedding (LLE) is a nonlinear dimensionality reduction technique introduced by Roweis and Saul. 4/39

5 LLE Introduction (2/5) LLE maps its inputs into a single global coordinate system of lower dimensionality, and its optimizations do not involve local minima. LLE is a manifold learning technique that maps high-dimensional data into a low-dimensional manifold space while preserving neighborhoods. By exploiting the local symmetries of linear reconstructions, LLE is able to learn the global structure of nonlinear manifolds, such as those arising in facial recognition, image processing, and fault diagnosis. 5/39

6 LLE Introduction (3/5) The problem involves mapping high-dimensional inputs into a low-dimensional “description” space with as many coordinates as observed modes of variability. The LLE algorithm eliminates the need to estimate pairwise distances between widely separated data points, and recovers global nonlinear structure from locally linear fits. 6/39

7 LLE Introduction (4/5) Fig. 1. Example of Locally Linear Embedding 7/39

8 LLE Introduction (5/5) The problem of nonlinear dimensionality reduction, as illustrated for three-dimensional data (B) sampled from a two-dimensional manifold (A). An unsupervised learning algorithm must discover the global internal coordinates of the manifold without signals that explicitly indicate how the data should be embedded in two dimensions. The color coding illustrates the neighborhood-preserving mapping discovered by LLE; black outlines in (B) and (C) show the neighborhood of a single point. 8/39

9 Index Concept Definitions A large number of centrality measures have been proposed to identify important nodes within a graph or a complex network. Typical examples are degree centrality, closeness centrality, betweenness centrality, eigenvector centrality, and mutual-information. Many studies and experiments show that these indices can efficiently reflect node importance from different perspectives. These indices are chosen as the parameters for multi-index evaluation in this paper. 9/39

10 Index Definition (1/5) 1. Degree Centrality The degree centrality of node $v_i$, denoted as $C_D(v_i)$, is defined as the number of ties that node $v_i$ has, normalized as $C_D(v_i) = \frac{k_i}{N-1}$ (1) where $k_i$ is the degree of node $v_i$ and $N$ is the total number of nodes. The value $N-1$ is used to normalize the degree centrality value. 10/39
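
As a minimal sketch of the degree centrality definition, assuming the usual normalization by $N-1$ and an undirected graph stored as a dictionary of neighbour sets (the function name and the small star graph are hypothetical, chosen only for illustration):

```python
def degree_centrality(adjacency):
    """Return k_i / (N - 1) for every node of an undirected graph.

    `adjacency` maps each node to the set of its neighbours.
    """
    n = len(adjacency)
    if n < 2:
        raise ValueError("need at least two nodes to normalize by N - 1")
    return {node: len(neigh) / (n - 1) for node, neigh in adjacency.items()}


# Hypothetical 4-node star graph: node 0 is tied to every other node.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
centrality = degree_centrality(star)
```

The hub of the star reaches the maximum value because it is tied to all other nodes, while each leaf has a single tie.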

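11 Index Definition (2/5) 2. Betweenness Centrality The betweenness centrality $C_B(v_i)$ of node $v_i$ is defined as $C_B(v_i) = \frac{\sum_{s<t,\; s \neq i \neq t} \sigma_{st}(v_i)/\sigma_{st}}{(N-1)(N-2)/2}$ (2) where $\sigma_{st}$ is the number of shortest paths between node $v_s$ and node $v_t$, $\sigma_{st}(v_i)$ is the number of those paths that go through node $v_i$, and $N$ is the total number of nodes. The value $(N-1)(N-2)/2$ is used to normalize the betweenness centrality value. 11/39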
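
The betweenness definition can be sketched with Brandes' algorithm, a standard technique that the slides do not name and that is swapped in here for illustration. The sketch assumes an unweighted, undirected graph with at least three nodes and the normalization by $(N-1)(N-2)/2$; the function name and the star graph are hypothetical.

```python
from collections import deque


def betweenness_centrality(adjacency):
    """Brandes' algorithm for unweighted, undirected graphs.

    `adjacency` maps each node to the set of its neighbours.
    Scores are normalized by (N - 1)(N - 2) / 2 (assumed normalization).
    """
    score = {v: 0.0 for v in adjacency}
    for s in adjacency:
        # Breadth-first search from s, counting shortest paths (sigma).
        sigma = {v: 0 for v in adjacency}
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in adjacency}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back towards s.
        delta = {v: 0.0 for v in adjacency}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    n = len(adjacency)
    # Every unordered pair was visited twice (once per endpoint), so
    # dividing by (N - 1)(N - 2) matches the (N - 1)(N - 2) / 2 normalization.
    return {v: c / ((n - 1) * (n - 2)) for v, c in score.items()}


# Hypothetical 4-node star graph: every leaf-to-leaf path crosses node 0.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
scores = betweenness_centrality(star)
```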

12 Index Definition (3/5) 3. Closeness Centrality The closeness centrality $C_C(v_i)$ of node $v_i$ is defined as $C_C(v_i) = \frac{N-1}{\sum_{j \neq i} d_{ij}}$ (3) where $d_{ij}$ is the length of the shortest path between node $v_i$ and node $v_j$, and $N$ is the total number of nodes. The value $N-1$ is used to normalize the closeness centrality value. 12/39
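
A minimal sketch of closeness centrality, assuming an unweighted, connected, undirected graph with at least two nodes and the normalization by $N-1$. Shortest path lengths come from a breadth-first search; the function name and the path graph are hypothetical.

```python
from collections import deque


def closeness_centrality(adjacency, source):
    """Return (N - 1) / (sum of shortest-path lengths from `source`).

    `adjacency` maps each node to the set of its neighbours.
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neigh in adjacency[node]:
            if neigh not in dist:
                dist[neigh] = dist[node] + 1
                queue.append(neigh)
    if len(dist) != len(adjacency):
        raise ValueError("graph is not connected")
    return (len(adjacency) - 1) / sum(dist.values())


# Hypothetical path graph 0 - 1 - 2: the middle node is closest to the rest.
path = {0: {1}, 1: {0, 2}, 2: {1}}
middle = closeness_centrality(path, 1)
end = closeness_centrality(path, 0)
```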

13 Index Definition (4/5) 4. Eigenvector Centrality For node $v_i$, the eigenvector centrality score is proportional to the sum of the scores of all nodes which are connected to it, i.e. $x_i = \frac{1}{\lambda} \sum_{j=1}^{N} a_{ij} x_j$ (4) where $x_i$ denotes the score of node $v_i$, $N$ is the total number of nodes, $A = (a_{ij})$ is the adjacency matrix of the network, and $\lambda$ is a constant. In vector notation, this can be rewritten as $x = \frac{1}{\lambda} A x$, or as the eigenvector equation $A x = \lambda x$. 13/39
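
The eigenvector equation can be approximated by power iteration. This is only a sketch under stated assumptions: a connected, undirected graph, and iteration with $A + I$ instead of $A$ (same eigenvectors, but convergent on bipartite graphs as well). The function name, tolerance and example graph are hypothetical.

```python
import math


def eigenvector_centrality(adjacency, iterations=200, tol=1e-10):
    """Approximate the leading eigenvector of the adjacency matrix.

    Iterates x <- (A + I) x with renormalization to unit length.
    `adjacency` maps each node to the set of its neighbours.
    """
    x = {node: 1.0 for node in adjacency}
    for _ in range(iterations):
        new = {n: x[n] + sum(x[m] for m in adjacency[n]) for n in adjacency}
        norm = math.sqrt(sum(v * v for v in new.values()))
        new = {n: v / norm for n, v in new.items()}
        if max(abs(new[n] - x[n]) for n in adjacency) < tol:
            return new
        x = new
    return x


# Hypothetical graph: triangle 0-1-2 with a pendant node 3 attached to 0.
graph = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
scores = eigenvector_centrality(graph)
```

Node 0 scores highest because it is connected to every other node, and the pendant node 3 scores lowest because its only neighbour is node 0.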

14 Index Definition (5/5) 5. Mutual-information Mutual-information uses information theory to assess the importance of nodes. The mutual-information of node $v_i$ is the sum of the mutual information between node $v_i$ and other nodes which are connected to it, i.e. $I(v_i) = \sum_{j \in \Gamma(i)} I(v_i, v_j)$ (5) where $\Gamma(i)$ is the set of nodes connected to $v_i$ and $k_i$ is the degree of node $v_i$, which represents the amount of information each node contains. 14/39

15 Multi-index evaluation algorithm based on LLE for the node importance Most current algorithms evaluate node importance in complex networks according to a single index. Because a single index is one-sided and unstable, it has difficulty reflecting the whole situation in a complex network. By synthesizing multiple indices of node importance and applying the idea of multi-objective optimization, a new multi-index evaluation algorithm based on LLE for node importance in complex networks is proposed. In this algorithm, high-dimensional data is mapped into a low-dimensional space by preserving neighbours. 15/39

16 Steps of the Algorithm(1/6) The principle of the LLE algorithm is that, given a set of points in a high-dimensional space, LLE finds a new set of coordinates in a low-dimensional space that satisfies the same neighbor relations as the original points. 16/39

17 Steps of the Algorithm(2/6) Step 1 According to the index definitions above, the value of each index vector is calculated for the complex network and the matrix $X$ is constructed, i.e. $X = (x_{ij})_{N \times D}$, where $N$ is the number of nodes and $D$ is the number of evaluation indices in the complex network. 17/39
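
Step 1 can be sketched as stacking the index vectors column by column, so that each row describes one node. The variable names and the numbers below are hypothetical values for a 4-node star graph, not data from the slides.

```python
import numpy as np

# Hypothetical index values for 4 nodes (one list per evaluation index).
degree = [1.0, 1 / 3, 1 / 3, 1 / 3]
closeness = [1.0, 0.6, 0.6, 0.6]
betweenness = [1.0, 0.0, 0.0, 0.0]

# Rows are nodes, columns are evaluation indices.
X = np.column_stack([degree, closeness, betweenness])
```

Keeping nodes as rows means each row of `X` is one high-dimensional data point for the later LLE steps.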

18 Steps of the Algorithm(3/6) Step 2 For each data point $x_i$, find its $K$ nearest neighbors by using Euclidean distance, which is the length of the line segment connecting two points. Step 3 Compute the weights $w_{ij}$ that best linearly reconstruct each data point $x_i$ from its $K$ neighbors by minimizing the following cost function, $\varepsilon(W) = \sum_{i} \left| x_i - \sum_{j} w_{ij} x_j \right|^2$ (6) under the constraint that each vector of weights sums to unity, $\sum_{j} w_{ij} = 1$. 18/39
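
Steps 2 and 3 can be sketched with NumPy as follows. This is an illustration under assumptions, not the authors' implementation: the function name, the parameter `k` and the regularization constant `reg` are hypothetical, and the regularization of the local Gram matrix is a common practical addition that the slides do not mention.

```python
import numpy as np


def reconstruction_weights(X, k, reg=1e-3):
    """Return the N x N matrix W of LLE reconstruction weights.

    X is an N x D array (one row per node, one column per index).
    `reg` is an assumed regularization constant that keeps the local
    Gram matrix invertible when k exceeds the number of columns.
    """
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        # Step 2: k nearest neighbours by Euclidean distance (excluding i).
        dist = np.linalg.norm(X - X[i], axis=1)
        dist[i] = np.inf
        neighbours = np.argsort(dist)[:k]
        # Step 3: minimize |x_i - sum_j w_ij x_j|^2 with sum_j w_ij = 1.
        Z = X[neighbours] - X[i]
        G = Z @ Z.T
        G += reg * (np.trace(G) or 1.0) * np.eye(k)
        w = np.linalg.solve(G, np.ones(k))
        W[i, neighbours] = w / w.sum()
    return W


# Hypothetical data: 6 nodes described by 3 index values each.
rng = np.random.default_rng(0)
X = rng.random((6, 3))
W = reconstruction_weights(X, k=2)
```

Solving the regularized system and rescaling the solution to sum to one is the closed-form way to satisfy the sum-to-unity constraint for each node.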

19 Steps of the Algorithm(4/6) Step 4 Construct the optimal low-dimensional embedding $Y$ for $X$, in which the local linear geometry of the high-dimensional data is best preserved by the reconstruction weights $w_{ij}$ of the data in $X$. This step is accomplished by minimizing the following cost function for the fixed weights, $\Phi(Y) = \sum_{i} \left| y_i - \sum_{j} w_{ij} y_j \right|^2$ (7) subject to the following constraints, 19/39

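20 Steps of the Algorithm(5/6) $\sum_{i} y_i = 0, \quad \frac{1}{N} \sum_{i} y_i y_i^{T} = I$ (8) To optimize the embedding error, we can rewrite it in the following quadratic form, $\Phi(Y) = \sum_{i,j} M_{ij} \, (y_i \cdot y_j)$ (9) based on inner products of the outputs. The square $N \times N$ matrix $M$, which is sparse, symmetric and positive semidefinite, is given by, 20/39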

21 Steps of the Algorithm(6/6) $M_{ij} = \delta_{ij} - w_{ij} - w_{ji} + \sum_{k} w_{ki} w_{kj}$ (10) for which $\delta_{ij}$ is an element of the identity matrix. The constrained minimization problem can be converted to solving an eigen-decomposition of the matrix $M$ as calculated below, $M = (I - W)^{T} (I - W)$ (11) for which the eigenvectors associated with the bottom $d$ nonzero eigenvalues constitute the final embedding outputs $Y$. 21/39
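
The final step can be sketched with a dense eigen-decomposition. The sketch assumes the standard LLE form $M = (I - W)^{T}(I - W)$ and scales the outputs by $\sqrt{N}$ so that their covariance is the identity; the function name and the ring-shaped weight matrix are hypothetical.

```python
import numpy as np


def embed(W, d):
    """Return the N x d embedding for fixed reconstruction weights W.

    Builds M = (I - W)^T (I - W), discards the bottom eigenvector
    (constant, eigenvalue close to zero) and keeps the next d.
    """
    n = W.shape[0]
    identity = np.eye(n)
    M = (identity - W).T @ (identity - W)
    eigenvalues, eigenvectors = np.linalg.eigh(M)  # ascending eigenvalues
    # Scale so that (1/N) * Y^T Y is the identity matrix.
    return np.sqrt(n) * eigenvectors[:, 1:d + 1]


# Hypothetical weights: 5 nodes on a ring, each reconstructed as the
# average of its two ring neighbours (every row sums to one).
n = 5
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = 0.5
    W[i, (i + 1) % n] = 0.5
Y = embed(W, d=1)
```

Because every row of `W` sums to one, the constant vector is an eigenvector of `M` with eigenvalue zero, which is why the bottom eigenvector is discarded and the remaining outputs are centred on the origin.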

22 Computational Complexity The computational complexity of calculating degree centrality and mutual-information is O(…), where N is the number of nodes in the complex network; calculating eigenvector centrality is O(…); calculating betweenness centrality and closeness centrality is O(…). In the LLE algorithm, choosing neighbors needs O(…), where D is the dimension of the high-dimensional samples; calculating the reconstruction weights is O(…), where K is the number of neighbours; getting the d-dimensional embedding is O(…); so the computational complexity of the LLE algorithm is O(…). Finally, based on the analysis above, the overall computational complexity is O(…). 22/39

23 Simulation Example(1/7) Software: Matlab, R, VC++; Pajek, Gephi, NodeXL, SigmaPlot, UCINET; SPSS, SAS, ORIGINS; Windows, Linux. Data Set: Real-world Networks; Computer-Generated Networks; Clinical data, Medical data; TCM (Traditional Chinese Medicine) data 23/39

24 Simulation Example(2/7) a. The Zachary’s Karate Club network b. The Bottlenose Dolphin network 24/39

25 Simulation Example(3/7) c. The American College Football network 25/39

26 Simulation Example(4/7) d. The network of interactions between major characters in the novel Les Misérables by Victor Hugo 26/39

27 Simulation Example(5/7) e. Krebs’ network of books on American politics 27/39

28 Simulation Example(6/7) In this paper, the ARPA network topology is used to analyze and illustrate the multi-index evaluation algorithm based on LLE. ARPA is the trunk topology in North America, composed of 21 nodes and 26 edges. Even if an arbitrary node is removed from ARPA, the network remains connected. Fig. 2. ARPA’s topology 28/39

29 Simulation Example(7/7) Wayne Zachary observed social interactions between the members of a karate club at an American university. After a long study, he built a network with the 34 members of the karate club as nodes and 78 edges representing friendships between the members of the club. Fig. 3. Karate club topology 29/39

30 Analysis (1/7) Fig. 4(a) Nodes’ importance contrast line graph according to mutual-information, closeness, and LLE in ARPA 30/39

31 Analysis (2/7) Fig. 4(b) Nodes’ importance contrast line graph according to degree, betweenness, PageRank, and LLE in ARPA 31/39

32 Analysis (3/7) The proposed algorithm finds that the most important nodes are, and, which is identical to the results of many single-index algorithms, such as degree centrality, eigenvector centrality and mutual-information. Betweenness centrality and closeness centrality can identify correctly that the most important node is. The second most important nodes are,, and, which is identical to the results of eigenvector centrality, degree centrality and mutual-information. The ranking by betweenness centrality indicates that many nodes communicate with others via ; the ranking by closeness centrality indicates that is closer to the center of the network. 32/39

33 Analysis (4/7) Fig. 5 Nodes’ importance contrast line graph according to PCA and LLE in ARPA 33/39

34 Analysis (5/7) Fig. 6(a) Nodes’ importance contrast line graph according to mutual-information, closeness, and LLE in Karate club 34/39

35 Analysis (6/7) Fig. 6(b) Nodes’ importance contrast line graph according to degree, betweenness, PageRank, and LLE in Karate club 35/39

36 Analysis (7/7) The proposed algorithm identifies the five most important nodes as,,, and, of which and are generally considered the two most important nodes and are usually used as the core nodes for detecting communities in complex networks. The result of this new algorithm is identical to the results of degree centrality, betweenness centrality, eigenvector centrality and mutual-information. Closeness centrality cannot identify the two most important nodes accurately, because the node is a wrong identification. 36/39

37 Simulation and Analysis The simulation and analysis above show that the result of the proposed algorithm is largely consistent with, and to some extent more careful and reasonable than, the results of the single-index algorithms and of the multi-index evaluation algorithm based on PCA. 37/39

38 Conclusions In this paper, a new multi-index evaluation algorithm based on Locally Linear Embedding for node importance in complex networks is proposed. It is simple and effective, and it synthesizes the statistical characteristics of nodes in a complex network. The proposed algorithm maps high-dimensional data into a low-dimensional space by preserving neighbors. Example analysis and simulation experiments show that the algorithm can effectively reflect differences in node importance and can accurately and efficiently find the important nodes in a complex network. This method can be extended to directed and weighted networks in future work. 38/39

39 Date: 26/11/2014 39/39

