# Fast algorithm for detecting community structure in networks M. E. J. Newman Department of Physics and Center for the Study of Complex Systems, University.

Fast algorithm for detecting community structure in networks M. E. J. Newman Department of Physics and Center for the Study of Complex Systems, University of Michigan Advanced Topics in on-line Social Network Analysis - 2014\Spring Fast algorithm for detecting community structure in networks (M. E. J. Newman)

Sub-topics for today:

- A little step back...
- Background and motivation
- The algorithm presented
- The good, the bad, the ugly (advantages and drawbacks discussion)
- Applications
- Summary

A little step back: The edge betweenness of an edge is the number of shortest paths between pairs of nodes that run along it. [Figure: example network with nodes 0 to 5.]
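
To make the definition concrete, here is a minimal Python sketch that counts shortest paths per edge with a breadth-first search from every node. The function name `edge_betweenness` and the six-node example graph (two triangles joined by one bridge) are illustrative assumptions, not taken from the slides. When several shortest paths tie between a pair, the sketch splits that pair's unit of credit equally among them, which is a common convention rather than something the slide states.

```python
from collections import deque, defaultdict

def edge_betweenness(adj):
    """adj: dict mapping node -> set of neighbours (undirected, unweighted)."""
    bet = defaultdict(float)
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        dist = {s: 0}
        sigma = defaultdict(float)
        sigma[s] = 1.0
        preds = defaultdict(list)
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, pushing path counts onto edges.
        delta = defaultdict(float)
        for w in reversed(order):
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                bet[frozenset((v, w))] += c
                delta[v] += c
    # Every unordered pair was seen from both of its endpoints, so halve.
    return {edge: b / 2.0 for edge, b in bet.items()}

# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
         3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
bet = edge_betweenness(graph)
print(bet[frozenset((2, 3))])  # 9.0: all 3 x 3 cross-triangle pairs use the bridge
```

The bridge collects far more shortest paths than any edge inside a triangle, which is why a high betweenness marks an edge as running between groups.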

A little step back: Quality function Q: the fraction of within-community edges minus the expected value of the same quantity for a randomized network (one in which edges fall at random, with no regard to community structure).
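
In Newman's notation this quantity is usually written Q = Σ_i (e_ii − a_i²), where e_ii is the fraction of edges inside community i and a_i is the fraction of edge ends attached to community i. A minimal sketch of that computation follows. The function name `modularity` and the example graph are illustrative assumptions, and the code assumes an undirected graph without self-loops.

```python
from collections import defaultdict

def modularity(edges, community):
    """Q = sum_i (e_ii - a_i**2) for an undirected graph without self-loops.

    edges: list of (u, v) pairs; community: dict mapping node -> label.
    """
    m = len(edges)
    within = defaultdict(int)  # edges with both ends inside community c
    ends = defaultdict(int)    # edge ends (degree sum) attached to community c
    for u, v in edges:
        cu, cv = community[u], community[v]
        ends[cu] += 1
        ends[cv] += 1
        if cu == cv:
            within[cu] += 1
    return sum(within[c] / m - (ends[c] / (2 * m)) ** 2 for c in ends)

# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
q_split = modularity(edges, {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1})
q_all = modularity(edges, {v: 0 for v in range(6)})
print(round(q_split, 3))  # 0.357 (exactly 5/14)
print(q_all)              # 0.0: a single community is no better than random
```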

Background and motivation: Community structure in networks is of increasing interest. Networks tend to divide into tightly knit groups: edges within a group are many, edges between groups are far fewer. Enter the Girvan and Newman algorithm.

The Girvan and Newman algorithm:

1. The betweenness of all existing edges in the network is calculated.
2. The edge with the highest betweenness is removed.
3. The betweenness of all edges affected by the removal is recalculated.
4. Steps 2 and 3 are repeated until no edges remain.
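
The four steps above can be sketched in Python as follows. This is a simplified illustration, not the authors' implementation: it recomputes the betweenness of every edge after each removal instead of only the affected ones, and the names `girvan_newman`, `edge_betweenness`, `components` and the example graph are assumptions made for this sketch.

```python
from collections import deque, defaultdict

def edge_betweenness(adj):
    """Shortest-path betweenness of every edge (BFS from each node)."""
    bet = defaultdict(float)
    for s in adj:
        dist, sigma = {s: 0}, defaultdict(float)
        sigma[s] = 1.0
        preds, order, queue = defaultdict(list), [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = defaultdict(float)
        for w in reversed(order):
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                bet[frozenset((v, w))] += c
                delta[v] += c
    return {edge: b / 2.0 for edge, b in bet.items()}

def components(adj):
    """Connected components as a sorted list of sorted node lists."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        seen.add(s)
        comp, stack = [], [s]
        while stack:
            v = stack.pop()
            comp.append(v)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        comps.append(sorted(comp))
    return sorted(comps)

def girvan_newman(adj):
    """Remove highest-betweenness edges; record each new split."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    splits, n_comp = [], len(components(adj))
    while any(adj.values()):                 # step 4: until no edges remain
        bet = edge_betweenness(adj)          # steps 1 and 3 (full recompute)
        u, v = max(bet, key=bet.get)         # step 2: highest betweenness
        adj[u].discard(v)
        adj[v].discard(u)
        comps = components(adj)
        if len(comps) > n_comp:              # the removal split a group
            n_comp = len(comps)
            splits.append(comps)
    return splits

# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
         3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
splits = girvan_newman(graph)
print(splits[0])  # [[0, 1, 2], [3, 4, 5]]: the bridge is removed first
```

Each entry of `splits` corresponds to one level of the dendrogram, read from the top down.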

The Girvan and Newman algorithm: [Figure: example network with nodes 0 to 9, edges labeled with their betweenness values, and the corresponding dendrogram with leaves ordered 0, 3, 2, 1, 4, 5, 8, 7, 6, 9.] As we move down the tree, we see the partitioning of groups.

Background and motivation: The G&N algorithm runs in worst-case time O(m^2 n), or O(n^3) on a sparse graph, where m is the number of edges and n the number of nodes. This limits us to networks with only thousands of nodes. Skype: 300 million users. WhatsApp: 450 million users. Twitter: 243 million monthly active users. Facebook: 1.23 billion (!!!) users. So obviously, we need to find a better solution.

The algorithm presented: The quality function Q presented earlier indicates whether a division is meaningful. Why not use it? Optimize Q over all possible divisions and find the best one! The problem is that doing this in a straightforward manner takes an exponential amount of time. A possible solution is a greedy implementation.

The algorithm presented: Initially, each of the n nodes is the sole member of its own community. We join communities together in pairs iteratively. At each step, we choose the join that gives the largest increase (or smallest decrease) in Q.

The algorithm presented: The change in Q on joining communities i and j is

$$\Delta Q = e_{ij} + e_{ji} - 2 a_i a_j = 2(e_{ij} - a_i a_j)$$

[Figure: four-node example network with nodes A, B, C, D, shown before and after joining.]

1. Start from singleton communities (a=1, b=2, c=3, d=4).
2. Join (4 choose 2 = 6 options): the best is 1 ∪ 2, giving (a,b=1, c=2, d=3).
3. Join again (3 choose 2 = 3 options): the best is 2 ∪ 3, giving (a,b=1, c,d=2), where Q is maximal.
4. Any further join changes Q by a negative amount.
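
The joining procedure and the ΔQ formula can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the reference implementation: e_ij is taken to be half the fraction of edges between communities i and j, a_i the fraction of edge ends attached to community i, only pairs connected by at least one edge are considered for joining, and node labels must be orderable. The name `greedy_communities` and the example graph are hypothetical.

```python
from collections import defaultdict

def greedy_communities(nodes, edges):
    """Greedy agglomeration: repeatedly apply the join with the largest dQ."""
    m = len(edges)
    w = 1.0 / (2 * m)
    e = {i: defaultdict(float) for i in nodes}  # e[i][j] for i != j
    a = dict.fromkeys(nodes, 0.0)               # a[i]: fraction of edge ends
    for u, v in edges:
        e[u][v] += w
        e[v][u] += w
        a[u] += w
        a[v] += w
    members = {i: {i} for i in nodes}
    q = -sum(x * x for x in a.values())         # all singletons: e_ii = 0
    best_q, best = q, [set(c) for c in members.values()]
    while True:
        # dQ = 2 (e_ij - a_i a_j) for every pair of connected communities.
        joins = [(2 * (e[i][j] - a[i] * a[j]), i, j)
                 for i in e for j in e[i] if i < j]
        if not joins:
            break
        dq, i, j = max(joins)
        # Merge community j into community i: add rows and columns of e.
        for k, x in e.pop(j).items():
            del e[k][j]
            if k != i:
                e[i][k] += x
                e[k][i] += x
        a[i] += a.pop(j)
        members[i] |= members.pop(j)
        q += dq
        if q > best_q:                          # remember the best cut so far
            best_q, best = q, [set(c) for c in members.values()]
    return best_q, best

# Hypothetical example: triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
best_q, best = greedy_communities(range(6), edges)
print(round(best_q, 3), sorted(sorted(c) for c in best))
# 0.357 [[0, 1, 2], [3, 4, 5]]
```

The loop keeps joining even after Q starts to fall, so the full sequence of joins forms a dendrogram; the partition returned is the level at which Q peaked.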

The algorithm presented: [Figure: example network with nodes 0 to 9 and its dendrogram, with leaves ordered 0, 3, 2, 1, 4, 5, 8, 7, 6, 9.] As the algorithm iterates (implementation by Erwan Le Martelot, http://www.elemartelot.org/), we get a partition of the graph. Adjacency matrix of the example network:

|   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 1 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 |
| 3 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 |
| 4 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 |
| 5 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
| 6 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| 7 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 1 |
| 8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 |
| 9 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 |

The algorithm presented: Operates on completely different principles from the G&N algorithm. It is agglomerative. Runs in worst-case time O((m+n)n), or O(n^2) on sparse graphs. Completes in a reasonable time on a network with a million vertices.

Advantages and drawbacks: Gives generally good divisions. In practice it is typically a lot faster than G&N: thousands of times faster. Usually not better than G&N at correctly identifying communities. Why? Because our algorithm makes decisions based on local information, whereas G&N actively analyzes the entire network.

Applications: Random graphs of n = 128 vertices divided into 4 groups of 32, with varying average Zin and Zout values per vertex (edges to vertices inside and outside the vertex's own group), where Zin + Zout = 16. G&N generally performs better, although usually only by about a 1% difference in correct identification. At high Zin, the new algorithm wins.
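
A test graph of this kind could be generated roughly as follows. This is a sketch under assumptions: each pair of vertices is linked independently, with a probability chosen so that the expected within-group and between-group degrees match Zin and Zout. The function name `planted_partition`, its parameters, and the seed are hypothetical and not from the slides.

```python
import random

def planted_partition(z_in, z_out, n_groups=4, group_size=32, seed=0):
    """Random graph with known groups and expected degrees z_in and z_out."""
    rng = random.Random(seed)
    n = n_groups * group_size
    p_in = z_in / (group_size - 1)   # a vertex has 31 possible in-group partners
    p_out = z_out / (n - group_size) # and 96 possible out-of-group partners
    group = [v // group_size for v in range(n)]
    edges = [(u, v)
             for u in range(n) for v in range(u + 1, n)
             if rng.random() < (p_in if group[u] == group[v] else p_out)]
    return edges, group

# Example setting with Zin + Zout = 16.
edges, group = planted_partition(z_in=10, z_out=6)
mean_degree = 2 * len(edges) / len(group)
print(len(group), round(mean_degree, 1))  # 128 vertices, mean degree near 16
```

Because the true groups are known, the fraction of vertices an algorithm places correctly can be measured directly for each value of Zout.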

Applications: Real-world networks.

- Zachary karate club: similar performance to G&N.
- American college football teams: G&N wins on points for accuracy. The new algorithm is faster.
- Collaboration between physicists: the new algorithm wins by knockout on speed, 42 minutes versus an estimated 3-5 years.

Results correlate with human observation.

Summary: The new algorithm is faster and pretty accurate, although not as accurate as G&N. It allows us to study much larger systems than previously possible. For smaller networks, use G&N. For larger networks, use the new algorithm. As you'll see in the next presentation, there is always room for improvement.

THANK YOU!
