Presenter: Guoliang Liu. Date: 4/25/2012.
Background Introduction; Definition; Basic idea of partition; Quality Function; Classification Based On Algorithms; Benchmarks; Applications; Conclusion
Real-world networks: biological networks, social networks, technical networks, … Properties of real-world networks: the small-world effect, transitivity, degree distributions, network resilience, mixing patterns, degree correlations, community structure, network navigation, and others such as self-similarity.
What is Community Structure? Fig.1(Girvan and Newman, 2004)
Definition of Community Structure. Note: there is no universally accepted definition. Basic idea: a random graph does not have community structure. In a non-random graph, a community must have more edges inside it than edges linking its vertices with the rest of the graph. Maximize
How can we partition the graph? Partitions can be hierarchically ordered when the graph has structure at different scales. In this case, clusters in turn display community structure, with smaller communities inside, which may themselves contain smaller communities, and so on. Hierarchical Structure
Best-known metric: modularity. It measures how different the graph is from a random graph.
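The slide names modularity without giving a formula. A commonly used form (Newman and Girvan, 2004) is Q = Σ_c [ L_c/m − (d_c/2m)² ], where m is the number of edges, L_c is the number of edges inside community c, and d_c is the total degree of the nodes in c. The sketch below is one possible way to compute it, assuming an undirected, unweighted graph given as an edge list; the function name `modularity` and the `community_of` mapping are illustrative choices, not part of the presentation.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Newman-Girvan modularity of an undirected, unweighted graph.

    edges: iterable of (u, v) pairs, each undirected edge listed once.
    community_of: dict mapping every node to a community label (assumed input).
    """
    edges = list(edges)
    m = len(edges)
    if m == 0:
        return 0.0
    internal = defaultdict(int)    # edges with both ends inside community c
    degree_sum = defaultdict(int)  # total degree of the nodes in community c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1
    # Observed internal edge fraction minus the fraction expected at random.
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)
```

For two triangles joined by a single edge, splitting the graph into the two triangles gives Q = 5/14, while putting every node in one community gives Q = 0.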
Classification
Hierarchical structure
  Divisive algorithms
    Foundation work: Girvan and Newman, 2002/2004
    Modifications of the GN divisive method
    Representative work: Tyler et al., 2003; Wilkinson and Huberman, 2004; Zhou et al., 2006; Chen and Yuan, 2006; Rattigan et al., 2007; Pinney and Westhead, 2006
  Agglomerative algorithms
    Foundation work: Newman, 2004
    Modifications of Newman's agglomerative method and newly proposed agglomerative methods
    Representative work: Zhenqing Ye et al., 2008; Vincent D. Blondel et al., 2008; Nam P. Nguyen et al., 2011
Non-hierarchical
  Constructive algorithms
    Relatively new approaches
    Representative work: D. Shah and T. Zaman, 2010; R. R. Khorasgani et al., 2010; Rushed Kanawati, 2011
  Optimization approach
    Representative work: S. Li et al., 2010; D. Jin et al., 2010; C. Shi et al., 2010; Thang N. Dinh et al., 2011
Hierarchical Structure: Fig. 2 (Girvan and Newman, 2004). Agglomerative / Divisive
Foundation work (Girvan and Newman, 2004)
1. Calculate betweenness scores for all edges in the network.
2. Find the edge with the highest score and remove it from the network.
3. Recalculate betweenness for all remaining edges.
4. Repeat from step 2.
How to measure betweenness: shortest path, random walk, current flow.
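The divisive steps above can be sketched as follows. This is a minimal illustration, assuming shortest-path betweenness on a simple undirected graph (no self-loops) and stopping as soon as the graph gains one more connected component; the function names are hypothetical, and the betweenness routine uses a Brandes-style accumulation rather than anything specified in the presentation.

```python
from collections import deque

def edge_betweenness(adj):
    """Shortest-path betweenness for every edge of an undirected graph."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        dist, sigma, preds, order = {s: 0}, {s: 1.0}, {s: []}, []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w], sigma[w], preds[w] = dist[v] + 1, 0.0, []
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies, farthest nodes first.
        delta = {v: 0.0 for v in order}
        for w in reversed(order):
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += share
                delta[v] += share
    return score  # every path is counted from both ends; the ranking is unaffected

def components(adj):
    """Connected components as a list of node sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = set(), [s]
        seen.add(s)
        while stack:
            v = stack.pop()
            comp.add(v)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps

def girvan_newman_split(edges):
    """Remove highest-betweenness edges until one more component appears."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    start = len(components(adj))
    while len(components(adj)) == start and any(adj.values()):
        score = edge_betweenness(adj)      # steps 1 and 3: (re)calculate
        u, v = max(score, key=score.get)   # step 2: highest-scoring edge
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)
```

Calling `girvan_newman_split` repeatedly on each resulting component would produce the full hierarchy; recomputing betweenness after every removal is what makes the method expensive on large graphs.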
Foundation work (Newman, 2004): based on modularity Q.
1. At first, treat each node as a single community.
2. Calculate the modularity gain of merging each pair of neighboring communities.
3. Find the largest modularity gain and merge those two communities into one.
4. Repeat steps 2 and 3 until only one community remains.
5. Choose the level of the resulting hierarchy with the largest modularity.
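The agglomerative procedure above can be sketched as below. This is a simplified illustration, not an optimized implementation: it assumes an undirected, unweighted edge list, ignores self-loops, and rescans all neighboring pairs at every step. The gain of merging communities i and j is taken as ΔQ = E_ij/m − d_i·d_j/(2m²), where E_ij is the number of edges between them and d_i, d_j are their total degrees; the function and variable names are hypothetical.

```python
def greedy_modularity(edges):
    """Greedy agglomerative merging; returns (best_Q, best_partition)."""
    edges = [(u, v) for u, v in edges if u != v]  # assumption: drop self-loops
    m = len(edges)
    members = {}  # community label -> set of nodes
    deg = {}      # community label -> total degree of its nodes
    between = {}  # frozenset({i, j}) -> number of edges between i and j
    for u, v in edges:
        for x in (u, v):
            members.setdefault(x, {x})
            deg[x] = deg.get(x, 0) + 1
        key = frozenset((u, v))
        between[key] = between.get(key, 0) + 1

    def gain(key):
        # Modularity change if the two communities in `key` were merged.
        i, j = key
        return between[key] / m - deg[i] * deg[j] / (2 * m * m)

    # Every node starts as its own community, so Q has no internal-edge term.
    q = -sum((d / (2 * m)) ** 2 for d in deg.values())
    best_q, best = q, [set(c) for c in members.values()]
    while between:
        key = max(between, key=gain)  # largest gain among neighboring pairs
        q += gain(key)
        i, j = key
        members[i] |= members.pop(j)  # merge community j into community i
        deg[i] += deg.pop(j)
        del between[key]
        # Redirect the edges that used to touch j so that they touch i.
        for other in [k for k in between if j in k]:
            (n,) = other - {j}
            count = between.pop(other)
            new_key = frozenset((i, n))
            between[new_key] = between.get(new_key, 0) + count
        if q > best_q:                # remember the best level seen so far
            best_q, best = q, [set(c) for c in members.values()]
    return best_q, best
```

Merging continues even when the best gain is negative, so the whole dendrogram is explored and the level with the largest Q is returned at the end.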
Fast unfolding of communities in large networks (Blondel et al., 2008). A modification of Newman's fast algorithm (2004). It makes use of another property of complex networks, self-similarity: unlike Newman (2004), every iteration treats each community as a single node. Advantage: calculating the modularity of merged communities is much faster.
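The idea of treating each community as a single node can be illustrated with the aggregation step alone. The sketch below covers only that step, not the full method of Blondel et al.; it assumes a weighted undirected graph stored as a dict from node pairs to weights and community labels that can be ordered, and the name `aggregate` is hypothetical.

```python
from collections import defaultdict

def aggregate(weighted_edges, community_of):
    """Collapse every community into one super-node.

    weighted_edges: dict mapping (u, v) -> weight, each undirected edge once.
    community_of: dict mapping every node to a community label.
    Returns a dict keyed by ordered label pairs; weight inside a community
    becomes a self-loop (c, c) on its super-node.
    """
    collapsed = defaultdict(float)
    for (u, v), w in weighted_edges.items():
        cu, cv = community_of[u], community_of[v]
        key = (cu, cv) if cu <= cv else (cv, cu)
        collapsed[key] += w
    return dict(collapsed)
```

Because the collapsed graph is much smaller than the original, later passes evaluate modularity changes over far fewer nodes and edges.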
GN benchmark (Girvan and Newman, 2004). Derived from the planted l-partition model. Benchmark graphs consist of 128 nodes with expected degree 16, divided into four groups of 32 nodes each.
LFR benchmark. Compared with the GN benchmark, the LFR benchmark takes power-law degree distributions into account, which is another property of complex networks. Hence, the LFR benchmark is a more realistic test for detection algorithms.
More information about benchmarks: http://www.cs.gsu.edu/~gliu6/courseCSC8530.html
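A graph of this kind can be generated with a short planted-partition sketch. The code below assumes that each node has on average `z_out` neighbors outside its group and `16 - z_out` inside it, and turns those into independent edge probabilities; the parameter names and the function `gn_benchmark` are illustrative and do not come from the presentation.

```python
import random

def gn_benchmark(z_out, n_groups=4, group_size=32, avg_degree=16, seed=None):
    """Planted-partition graph in the style of the GN benchmark.

    z_out: expected number of neighbors a node has outside its own group
    (assumed parameterization). Returns (edges, group) where group maps
    each node to its planted group index.
    """
    rng = random.Random(seed)
    z_in = avg_degree - z_out
    p_in = z_in / (group_size - 1)                   # 31 possible in-group partners
    p_out = z_out / (group_size * (n_groups - 1))    # 96 possible out-group partners
    n = n_groups * group_size
    group = {v: v // group_size for v in range(n)}
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            p = p_in if group[u] == group[v] else p_out
            if rng.random() < p:
                edges.append((u, v))
    return edges, group
```

Raising `z_out` blurs the planted groups, which is how the benchmark is typically used to see where a detection algorithm stops recovering them.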
Clustering Web clients: grouping users who have similar interests and are geographically near each other may improve the performance of services provided on the World Wide Web. Clusters of large graphs: can be used to create data structures that store the graph data efficiently and handle navigational queries, such as path searches. Data dissemination in mobile social networks: how to find the most influential nodes. Processor allocation in parallel computing: it is crucial to know the best way to allocate tasks to processors so as to minimize the communication between them and enable the calculation to run quickly.
Community detection has been studied for a long time, and with the growth of real-world complex networks it remains a popular topic in many fields, such as economics, physics, and computer science.
M. Girvan and M. E. J. Newman, "Community structure in social and biological networks," Proc. Nat. Acad. Sci. USA, vol. 99, no. 12, pp. 7821–7826, Jun. 11, 2002.
M. E. J. Newman and M. Girvan, "Finding and evaluating community structure in networks," Phys. Rev. E, vol. 69, 026113, 2004.
M. E. J. Newman, "Fast algorithm for detecting community structure in networks," Phys. Rev. E, vol. 69, 066133, 2004.
A. Clauset, M. E. J. Newman, and C. Moore, "Finding community structure in very large networks," Phys. Rev. E, vol. 70, 066111, 2004.
V. D. Blondel, J.-L. Guillaume, R. Lambiotte, and E. Lefebvre, "Fast unfolding of communities in large networks," J. Stat. Mech., Theory Exp., 2008, DOI: 10.1088/1742-5468/2008/10/P10008.
M. Saravanan, G. Prasad, K. Surana, and D. Suganthi, "Labeling communities using structural properties," in Proc. Int. Conf. Adv. Social Netw. Anal. Mining, Aug. 2010, pp. 217–224.
R. Kanawati, "LICOD: Leaders Identification for Community Detection in Complex Networks," in 2011 IEEE Third International Conference on Privacy, Security, Risk and Trust (PASSAT) and 2011 IEEE Third International Conference on Social Computing (SocialCom), pp. 577–582, 9-11 Oct. 2011.
N. P. Nguyen, T. N. Dinh, D. T. Nguyen, and M. T. Thai, "Overlapping community structures and their detection on social networks," in 2011 IEEE Third International Conference on Privacy, Security, Risk and Trust (PASSAT) and 2011 IEEE Third International Conference on Social Computing (SocialCom), pp. 35–40, 9-11 Oct. 2011.
D. Shah and T. Zaman, "Community detection in networks: The leader-follower algorithm," in Workshop on Networks Across Disciplines in Theory and Applications, NIPS, November 2010.
R. R. Khorasgani, J. Chen, and O. R. Zaiane, "Top leaders community detection approach in information networks," in 4th SNA-KDD Workshop on Social Network Mining and Analysis, Washington D.C., July 2010.