Noémi Gaskó, Rodica Ioana Lung, Mihai Alexandru Suciu


1 Community Detection in Bipartite Networks Using a Noisy Extremal Optimization Algorithm
Noémi Gaskó, Rodica Ioana Lung, Mihai Alexandru Suciu; Babeș-Bolyai University, Cluj, Romania; ISDA 2016

2 Introduction The community structure detection problem consists in finding groups of nodes within a network that are more densely linked to each other than to other nodes; there is currently no consensus on the formal definition of a community; for bipartite networks, it is generally assumed that methods designed for general networks will not be efficient because of the particular bipartite structure; many real-world applications require the analysis of a bipartite network: IP networks, bank-firm credit networks, software package refactoring, protein interaction networks, acupoint compatibility, etc.

3 Goal We explore the direct use of an optimization method called Noisy Extremal Optimization (NoisyEO), designed for community structure detection in unweighted, undirected networks, to identify communities in bipartite networks by maximizing the modularity.

4 Community structure detection - method
For a general graph, one of the most popular fitness functions is the modularity introduced by Newman and Girvan: $$Q_N = \sum_{C} \left[ \frac{m_C}{m} - \left( \frac{D_C}{2m} \right)^2 \right],$$ where $m$ is the number of edges, $m_C$ is the number of edges connecting vertices in community $C$, and $D_C$ is the sum of the degrees of all vertices in community $C$. A higher modularity value indicates a better solution. The modularity $Q_N$ essentially measures the difference between the fraction of edges inside the communities and the fraction expected for the same sets of nodes in a random network that has no community structure.
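The modularity can be computed in a single pass over the edge list. The following is a minimal sketch, assuming the standard community-wise form $Q_N = \sum_C [m_C/m - (D_C/2m)^2]$; the function name `newman_modularity`, the edge-list input, and the `community` dictionary are illustrative choices, not taken from the presentation.

```python
from collections import defaultdict


def newman_modularity(edges, community):
    """Newman-Girvan modularity of an undirected, unweighted graph.

    edges     -- list of (u, v) pairs, one per undirected edge (assumed input format)
    community -- dict mapping each node to its community label (assumed input format)
    """
    m = len(edges)
    internal = defaultdict(int)    # m_C: edges with both ends in community C
    degree_sum = defaultdict(int)  # D_C: sum of degrees of the nodes in C
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)
```

For two triangles joined by one edge, splitting the nodes into the two triangles gives a positive value, while putting all nodes into a single community gives zero.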

5 Community structure detection - method
A bipartite network is a graph $G=(R,B,E)$, where $R$ and $B$ are two disjoint sets of nodes of different types, $R \cap B = \emptyset$, and every edge connects a node from $R$ to a node from $B$ (there are no edges within $R$ or within $B$). Because the modularity $Q_N$ is computed with respect to a random network, for bipartite networks Barber modified the formula to take into account a random bipartite network: $$Q_B = \sum_{C} \left[ \frac{m_C}{m} - \frac{R_C B_C}{m^2} \right],$$ where $m$ is the number of edges, $m_C$ is the number of edges connecting vertices in community $C$, $R_C$ is the sum of the degrees of vertices from $R$ in community $C$, and $B_C$ is the sum of the degrees of vertices from $B$ in community $C$.
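The bipartite variant differs only in the null-model term. The following is a minimal sketch, assuming the community-wise form $Q_B = \sum_C [m_C/m - R_C B_C/m^2]$; the name `barber_modularity` and the `red_nodes` argument (the set $R$) are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def barber_modularity(edges, red_nodes, community):
    """Barber modularity of a bipartite, unweighted graph.

    edges     -- list of (u, v) pairs; each edge joins one node of R and one of B
    red_nodes -- set of the nodes that belong to R (all other nodes are in B)
    community -- dict mapping each node to its community label
    """
    m = len(edges)
    internal = defaultdict(int)  # m_C: edges with both ends in community C
    red_deg = defaultdict(int)   # R_C: degree sum of the R-nodes in C
    blue_deg = defaultdict(int)  # B_C: degree sum of the B-nodes in C
    for u, v in edges:
        # Orient the edge so that r is the R endpoint and b the B endpoint.
        r, b = (u, v) if u in red_nodes else (v, u)
        red_deg[community[r]] += 1
        blue_deg[community[b]] += 1
        if community[r] == community[b]:
            internal[community[r]] += 1
    labels = set(red_deg) | set(blue_deg)
    return sum(internal[c] / m - red_deg[c] * blue_deg[c] / m ** 2
               for c in labels)
```

A community that contains nodes from only one side contributes nothing, since it has no internal edges and one of $R_C$, $B_C$ is zero.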

6 Community structure detection - method
We explore the use of both modularity measures within an extremal optimization algorithm designed for solving the community structure detection problem in unweighted and undirected networks, called Noisy Extremal Optimization (NoisyEO). NoisyEO has proven to be a robust optimization technique, mainly because of its extremal optimization component, which maximizes the fitness of each node, computed as the contribution of the node to the fitness of its community.

7 Community structure detection - method
The fitness $f(C)$ of community $C$ is: [equation not preserved in the transcript] where $\alpha$ is a parameter controlling the community size. The node's contribution to a community is then: [equation not preserved in the transcript] where $C_i$ represents the community of node $i$, and $C_i \setminus \{i\}$ is the same community without node $i$.
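Since the two formulas are not available here, the following sketch rests on two explicit assumptions: that the community fitness has the commonly used form $f(C) = k_{in}(C) / (k_{in}(C) + k_{out}(C))^{\alpha}$, with $k_{in}$ the sum of internal degrees and $k_{out}$ the number of edges leaving $C$, and that a node's contribution is the difference $f(C_i) - f(C_i \setminus \{i\})$. Neither form is confirmed by the presentation, and the function names are hypothetical.

```python
def community_fitness(edges, members, alpha=1.0):
    """ASSUMED fitness: k_in / (k_in + k_out) ** alpha for the node set `members`."""
    k_in = k_out = 0
    for u, v in edges:
        inside = (u in members) + (v in members)
        if inside == 2:
            k_in += 2   # an internal edge adds 1 to the internal degree of both ends
        elif inside == 1:
            k_out += 1  # an edge crossing the community boundary
    total = k_in + k_out
    return k_in / total ** alpha if total else 0.0


def node_contribution(edges, community, node, alpha=1.0):
    """ASSUMED contribution: f(C_i) - f(C_i without node i)."""
    members = {n for n, c in community.items() if c == community[node]}
    return (community_fitness(edges, members, alpha)
            - community_fitness(edges, members - {node}, alpha))
```

Under these assumptions, a node placed in the community where most of its neighbours sit has a positive contribution, while a misplaced node tends to have a negative one, which is what marks it as a candidate for reassignment.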

8 Community structure detection - method
The extremal optimization algorithm evolves pairs of individuals that search the space independently. An individual is encoded as a vector with n components, each component representing a network node that is assigned a community number. At each iteration the current individual is evaluated and the nodes with the worst payoffs are randomly reinitialized. The second individual preserves the best solution found by the current individual. To decide whether the current individual is better than the best solution found so far, any fitness function can be used. We test both modularity measures (Newman and Barber) within NoisyEO in order to explore its behavior when dealing with bipartite networks.
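The loop described above can be sketched for a single pair of individuals as follows. This is a simplified illustration rather than the authors' implementation: it omits the noise (shift) mechanism and the population of pairs, and the names `extremal_optimization`, `node_payoff`, `quality` and `n_worst` are hypothetical.

```python
import random


def extremal_optimization(n_nodes, node_payoff, quality, n_communities=2,
                          n_worst=1, iterations=1000, seed=0):
    """Plain extremal optimization for one (current, best) pair of individuals.

    node_payoff(individual, i) -- payoff of node i in the given assignment
    quality(individual)        -- fitness used to compare solutions,
                                  for example a modularity measure
    """
    rng = random.Random(seed)
    # An individual is a vector of community labels, one entry per node.
    current = [rng.randrange(n_communities) for _ in range(n_nodes)]
    best, best_quality = list(current), quality(current)
    for _ in range(iterations):
        # Evaluate every node and pick the ones with the worst payoffs.
        payoffs = [node_payoff(current, i) for i in range(n_nodes)]
        worst = sorted(range(n_nodes), key=payoffs.__getitem__)[:n_worst]
        # Randomly reinitialize the community of the worst nodes.
        for i in worst:
            current[i] = rng.randrange(n_communities)
        # The second individual keeps the best solution seen so far.
        q = quality(current)
        if q > best_quality:
            best, best_quality = list(current), q
    return best, best_quality
```

Passing the Newman or the Barber modularity as `quality` is the only change needed to switch between the two measures, which mirrors the comparison made in the experiments.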

9 Numerical experiments
Southern women dataset - a bipartite network with 32 nodes and 89 edges; Club membership network - 40 nodes and 95 edges; Corporate leadership - 44 nodes and 99 edges; American revolution - [number missing] nodes and 160 edges;

10 Numerical experiments
Parameter setting: population size: 30; number of generations: 10000; expected minimum and maximum number of communities: 2 and 8; number of iterations between shifts: 45;

11 Numerical experiments
Results obtained for each network over 30 independent runs. The modularity that is maximized is underlined; for each run both modularities are represented. Thus, for example, in the top-left, the first two boxplots represent the results reported by NoisyEO when maximizing the Barber modularity and the values of the Newman modularity for the reported solutions. The green boxplot in the middle represents the Barber modularity values for the results reported by NoisyEO when no modularity function is used.

12 Numerical experiments
Comparison with other methods: In the case of the Southern women network, most algorithms report the number of communities they identify: 2, 3 or 4. The BRIM algorithm also reports 4 communities for this network, with a modularity equal to [value missing]. Doreian et al. report a modularity value equal to [value missing] and 3 communities. NoisyEO detects 8 communities.

13 Numerical experiments
Southern women dataset, Barber modularity =

14 Numerical experiments
Southern women dataset, Newman modularity =

15 Conclusions In this paper we explore the use of a community structure detection algorithm designed for unweighted and undirected networks for finding the structure of bipartite networks. By simply replacing the Newman modularity with the Barber modularity, we obtain promising results.

16 Thank you for the attention!
Questions:

