Overlapping Community Detection in Networks

Overlapping Community Detection in Networks Nan Du

Each individual can belong to several communities simultaneously.
Question: how can we develop an algorithm to find overlapping communities?
Related work:
- Palla's CPM algorithm, 2005
- GN (Girvan-Newman) extensions: CONGA and P&W, 2007
- Fuzzy c-means clustering, 2007

Palla's CPM algorithm, 2005
- Well-defined k-clique communities
- Requires a user input parameter k
- Cannot cover all the vertices of the given network
CONGA, 2007
- Uses a split betweenness measure to decide when to split vertices, which vertex to split, and how to split it
- Inefficient on large graphs: O(m^3), where m is the number of edges
P&W, 2007
- Uses both edge betweenness and vertex betweenness to decide whether to split a vertex or remove an edge
- Requires a user input parameter to assess the similarity between pairs of vertices
Fuzzy clustering, 2007
- Requires a user input parameter giving an upper bound on the number of communities, which is often hard to choose for real networks
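The two CPM limitations listed above (the need for k, and incomplete vertex coverage) can be seen on a toy graph. The following is a minimal brute-force sketch of k-clique percolation, not the authors' or Palla's implementation; the helper names `build_adj` and `k_clique_communities` and the example edge list are assumptions made for illustration.

```python
from itertools import combinations


def build_adj(edges):
    """Build an undirected adjacency map {vertex: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def k_clique_communities(adj, k):
    """Return k-clique communities as sorted vertex lists (toy graphs only)."""
    # Enumerate every k-clique by brute force.
    cliques = [
        frozenset(c)
        for c in combinations(sorted(adj), k)
        if all(v in adj[u] for u, v in combinations(c, 2))
    ]
    # Union-find over cliques: two k-cliques are adjacent
    # when they share exactly k - 1 vertices.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    # A community is the union of all cliques in one connected group.
    groups = {}
    for i, c in enumerate(cliques):
        groups.setdefault(find(i), set()).update(c)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical example: two triangles sharing vertex 2, plus a pendant vertex 5.
example_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4), (4, 5)]
example_adj = build_adj(example_edges)
communities = k_clique_communities(example_adj, 3)
print(communities)
```

With k = 3, vertex 2 belongs to both communities (overlap), while vertex 5 is in no triangle and is therefore left uncovered. A different k gives a different result, which is why the parameter matters.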

A novel algorithm, COCD (Clique-based Overlapping Community Detection), is proposed. It:
- covers all the vertices of the given network
- is free of user input parameters
- is efficient and scalable

COCD consists of 3 basic steps:
1. Maximal clique enumeration: Peamc, on sparse graphs
2. Core formation: a core is the set of all closely related maximal cliques
3. Clustering: Freeman centrality is used to assign the remaining vertices to the cores
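To make the first step concrete, here is a minimal sequential sketch of maximal clique enumeration. It uses the classic Bron-Kerbosch algorithm with pivoting as a stand-in; it is not Peamc, which is a parallel algorithm. The function names and the example graph are assumptions made for illustration.

```python
def build_adj(edges):
    """Build an undirected adjacency map {vertex: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def maximal_cliques(adj):
    """Yield every maximal clique of the graph as a sorted list of vertices."""

    def expand(r, p, x):
        # r: current clique, p: candidates that extend r, x: already explored.
        if not p and not x:
            yield sorted(r)
            return
        # Pivot on the vertex with the most neighbours among the candidates,
        # so that fewer branches need to be explored.
        pivot = max(p | x, key=lambda u: len(adj[u] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())


# Hypothetical example: two triangles sharing vertex 2, plus a pendant vertex 5.
example_edges = [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4), (4, 5)]
cliques = sorted(maximal_cliques(build_adj(example_edges)))
print(cliques)
```

The maximal cliques found this way would be the input to core formation, where closely related cliques are grouped into cores.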

Core formation
- A core is defined as a set of closely related maximal cliques.
- How do we decide whether to merge two cores once they share some common vertices?
- Solution: a closeness function

COCD algorithm: core formation (whether to merge two cores?)
Closeness function [formula and symbols not preserved in the transcript]: [...] and [...] are the sets of maximal cliques containing [...]; [...] and [...] are the induced sub-graphs; [...] is the set of edges between [...] and [...].

COCD algorithm: core formation
[Figure sequence over four slides: core formation illustrated on an example graph with vertices V0 to V8]

Experimental evaluation
On networks with known community structures:
- precision: the fraction of vertex pairs in the same cluster that also belong to the same community
- recall: the fraction of vertex pairs belonging to the same community that are also in the same cluster
On networks with unknown community structures:
- overlap coefficient and vertex average degree (vad)
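The pair-based precision and recall defined above can be computed as in the following sketch. It assumes that, with overlapping groups, a pair counts as "in the same cluster" (or community) if at least one cluster (or community) contains both vertices; the slides do not state this detail. The function names and the example data are hypothetical.

```python
from itertools import combinations


def co_member_pairs(groups):
    """Return the set of unordered vertex pairs that share at least one group."""
    pairs = set()
    for group in groups:
        pairs.update(frozenset(p) for p in combinations(sorted(group), 2))
    return pairs


def pair_precision_recall(clusters, communities):
    """Compare detected clusters with known communities on vertex pairs."""
    found = co_member_pairs(clusters)
    truth = co_member_pairs(communities)
    common = found & truth
    precision = len(common) / len(found) if found else 0.0
    recall = len(common) / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical example: known communities versus an imperfect detection result.
known = [{0, 1, 2}, {2, 3, 4}]
detected = [{0, 1, 2, 3}, {3, 4}]
precision, recall = pair_precision_recall(detected, known)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

In this example the detected clusters contain 7 co-member pairs, the known communities contain 6, and 5 pairs are shared, so precision is 5/7 and recall is 5/6.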

Experimental evaluation
16 real datasets from different domains

Experimental evaluation
Results on networks with unknown community structures
[Figure; values shown: 1.67, 1.43, 1.45, 1.44]

Experimental evaluation
[Figures: communities of the word association network; communities of the cell phone network]

References
S. Gregory. An algorithm to find overlapping community structure in networks. In PKDD, pages 91-102, 2007.
G. Palla, I. Derényi, I. Farkas, and T. Vicsek. Uncovering the overlapping community structure of complex networks in nature and society. Nature, 435(7043):814-818, June 2005.
J. Pinney and D. Westhead. Betweenness-based decomposition methods for social and biological networks. Leeds University Press.
S. Zhang, R. S. Wang, and X. S. Zhang. Identification of overlapping community structure in complex networks using fuzzy c-means clustering. Physica A, 374(1).
N. Du, B. Wu, and B. Wang. A parallel algorithm for enumerating all maximal cliques in complex networks. In ICDM Mining Complex Data Workshop, pages 320-324, December 2006.