Jure Leskovec, CMU Kevin Lang, Anirban Dasgupta and Michael Mahoney Yahoo! Research.

Presentation transcript:

Communities, clusters, groups, modules
• Communities: sets of nodes with lots of connections inside and few to the outside (the rest of the network)
• Assumption: networks are (hierarchically) composed of communities (modules)

• How community-like is a set of nodes?
• We want a measure that corresponds to intuition
• Conductance (normalized cut): Φ(S) = # edges cut / # edges inside
• Small Φ(S) corresponds to more community-like sets of nodes

Score: Φ(S) = # edges cut / # edges inside
• Bad community: Φ = 5/7 ≈ 0.7
• Better community: Φ = 2/5 = 0.4
• Best community: Φ = 2/8 = 0.25
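The score on this slide can be checked by hand on a small graph. The sketch below counts cut edges and inside edges for a candidate set and returns their ratio, as the slide defines it. The toy edge list and the function name `community_score` are illustrative assumptions, not taken from the talk. Note that conductance is more commonly normalized by the total degree (volume) of the set; this sketch follows the slide's simpler ratio.

```python
def community_score(edges, community):
    """Return (cut, inside, score), where score = cut edges / inside edges."""
    members = set(community)
    cut = inside = 0
    for u, v in edges:
        in_u, in_v = u in members, v in members
        if in_u and in_v:
            inside += 1          # both endpoints in the set
        elif in_u or in_v:
            cut += 1             # exactly one endpoint in the set
    if inside == 0:
        return cut, inside, float("inf")  # no internal edges: worst score
    return cut, inside, cut / inside


# Hypothetical toy graph: two triangles joined by a single edge (2, 3).
EDGES = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]

if __name__ == "__main__":
    print(community_score(EDGES, {0, 1, 2}))  # a whole triangle
    print(community_score(EDGES, {0, 1}))     # part of a triangle
```

On this toy graph a whole triangle scores lower (better) than a pair of nodes taken from it, which matches the intuition that a lower Φ means a more community-like set.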

• We define the network community profile (NCP) plot: plot the score of the best community of size k
• Search over all subsets of size k and find the best: Φ(k=5) = 0.25
• Computing the NCP plot exactly is intractable
• Use an approximation algorithm
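To make the definition concrete, here is a brute-force sketch of the NCP for a tiny graph: for each size k it tries every subset of k nodes and keeps the lowest score. This is only feasible for a handful of nodes, which is exactly why the slide calls for an approximation algorithm. The toy graph and the names `score` and `ncp_brute_force` are assumptions made for illustration.

```python
from itertools import combinations


def score(edges, members):
    """Slide score: cut edges / inside edges (infinite if no inside edges)."""
    cut = inside = 0
    for u, v in edges:
        hits = (u in members) + (v in members)
        if hits == 2:
            inside += 1
        elif hits == 1:
            cut += 1
    return cut / inside if inside else float("inf")


def ncp_brute_force(nodes, edges, max_size):
    """Map each size k (1..max_size) to the best score over all k-subsets."""
    nodes = list(nodes)
    profile = {}
    for k in range(1, max_size + 1):
        profile[k] = min(score(edges, set(s)) for s in combinations(nodes, k))
    return profile


# Hypothetical toy graph: two triangles joined by a single edge (2, 3).
EDGES = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]

if __name__ == "__main__":
    for k, best in ncp_brute_force(range(6), EDGES, 3).items():
        print(k, best)
```

The number of subsets grows combinatorially with k, so this exhaustive search is a teaching aid only.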

• Dolphin social network
• Two communities of dolphins
[Figure: the network and its NCP plot]

• Zachary’s university karate club social network
• During the study the club split into two
• The split (squares vs. circles) corresponds to cut B
[Figure: the network and its NCP plot]

• Collaborations between scientists in Networks
[Figure: the network and its NCP plot]

Small social networks, geometric (grid-like) networks, and hierarchical networks all have a downward-sloping NCP plot.
[Figures: hierarchical network; geometric (grid-like) network]

• Previously, researchers examined the community structure of small networks (~100 nodes)
• We examined more than 70 different large social and information networks
Large real-world networks look completely different!

• Typical example: General relativity collaboration network (4,158 nodes, 13,422 edges)

[Figure: NCP plot, community score vs. community size. Communities get better and better up to the best community, which has 100 nodes, and then get worse and worse.]

• Whiskers are responsible for the downward slope of the NCP plot
• A whisker is a set of nodes connected to the rest of the network by a single edge
[Figure: NCP plot with the largest whisker marked]
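The whisker definition above translates fairly directly into code: an edge whose removal disconnects the graph is a bridge, and the smaller side of that split hangs off the network by a single edge. The sketch below is a simple, unoptimized illustration that assumes a connected, undirected graph. It also reports whiskers nested inside larger ones. The function names and the toy graph are hypothetical.

```python
from collections import defaultdict, deque


def build_adjacency(edges):
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def reachable(adj, start, skip_edge):
    """Nodes reachable from start without crossing skip_edge."""
    blocked = {skip_edge, skip_edge[::-1]}
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if (node, nxt) in blocked or nxt in seen:
                continue
            seen.add(nxt)
            queue.append(nxt)
    return seen


def find_whiskers(edges):
    """Return whisker node sets, largest first (assumes a connected graph)."""
    adj = build_adjacency(edges)
    all_nodes = set(adj)
    whiskers = []
    for u, v in edges:
        side_u = reachable(adj, u, (u, v))
        if v in side_u:
            continue  # still connected without (u, v): not a bridge
        side_v = all_nodes - side_u
        whiskers.append(side_u if len(side_u) <= len(side_v) else side_v)
    return sorted(whiskers, key=len, reverse=True)


# Hypothetical toy graph: a dense 4-node core with a small tree hanging off node 3.
EDGES = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2), (3, 4), (4, 5), (4, 6)]

if __name__ == "__main__":
    print(find_whiskers(EDGES))
```

Each edge triggers a full graph search, so this runs in O(E·(V+E)) time. A linear-time bridge-finding algorithm would be the natural replacement for large networks.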

• Each new edge inside the community costs more: Φ = 2/1 = 2, Φ = 8/3 ≈ 2.7, Φ = 64/11 ≈ 5.8
• Each node has twice as many children
[Figure: NCP plot]

Network structure: core-periphery (jellyfish, octopus)
• Whiskers are responsible for good communities
• Denser and denser core of the network
• The core contains 60% of the nodes and 80% of the edges

What if we allow cuts that give disconnected communities? Cut all whiskers and compose communities out of them

We get better community scores when composing disconnected sets of whiskers.
[Figure: community score vs. community size, for connected communities and for the bag of whiskers]

Rewired network: random network with same degree distribution
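One common way to build such a rewired network is repeated double edge swaps: pick two edges (a, b) and (c, d) and replace them with (a, d) and (c, b), which leaves every node's degree unchanged. The sketch below illustrates that technique under the assumption of a simple undirected graph. It is not necessarily the procedure used in the talk, and the function names and toy graph are hypothetical.

```python
import random
from collections import Counter


def degree_sequence(edges):
    """Count how many edges touch each node."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def rewire(edges, swaps, seed=0):
    """Randomize edges with degree-preserving double edge swaps."""
    rng = random.Random(seed)
    edge_list = [tuple(e) for e in edges]
    edge_set = {frozenset(e) for e in edge_list}
    done = attempts = 0
    while done < swaps and attempts < 100 * swaps:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c  # try both orientations of the second edge
        if len({a, b, c, d}) < 4:
            continue  # would create a self-loop
        if frozenset((a, d)) in edge_set or frozenset((c, b)) in edge_set:
            continue  # would create a duplicate edge
        edge_set -= {frozenset((a, b)), frozenset((c, d))}
        edge_set |= {frozenset((a, d)), frozenset((c, b))}
        edge_list[i], edge_list[j] = (a, d), (c, b)
        done += 1
    return edge_list


# Hypothetical toy graph: two triangles joined by a single edge (2, 3).
EDGES = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]

if __name__ == "__main__":
    shuffled = rewire(EDGES, swaps=10)
    print(shuffled)
    print(degree_sequence(shuffled) == degree_sequence(EDGES))
```

Rejecting swaps that would create self-loops or duplicate edges keeps the result a simple graph, at the cost of some wasted attempts on small or dense inputs.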

• What is a good model that explains such network structure?
• None of the existing models work
NCP plot shape by model:
  - Pref. attachment: flat
  - Small World: down and flat
  - Geometric Pref. Attachment: flat and down

• Forest Fire: connections spread like a fire
  - New node joins the network
  - Selects a seed node
  - Connects to some of its neighbors
  - Continue recursively
As a community grows it blends into the core of the network
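The steps on this slide can be sketched as a small generator. This is a deliberately simplified, undirected variant: each neighbor of a burned node catches fire independently with probability `burn_prob`, and the new node links to everything that burned. The published Forest Fire model is directed and uses separate forward and backward burning parameters, so treat the function name, the single parameter, and the spreading rule here as assumptions.

```python
import random


def forest_fire_graph(n, burn_prob, seed=0):
    """Grow an undirected graph of n nodes with a simplified forest-fire rule."""
    rng = random.Random(seed)
    adj = {0: set()}
    for new in range(1, n):
        seed_node = rng.randrange(new)       # select a seed among existing nodes
        burned = {seed_node}
        frontier = [seed_node]
        while frontier:                      # the fire spreads recursively
            node = frontier.pop()
            for nbr in adj[node]:
                if nbr not in burned and rng.random() < burn_prob:
                    burned.add(nbr)
                    frontier.append(nbr)
        adj[new] = set()
        for target in burned:                # link the new node to every burned node
            adj[new].add(target)
            adj[target].add(new)
    return adj


if __name__ == "__main__":
    graph = forest_fire_graph(200, burn_prob=0.3)
    edge_count = sum(len(nbrs) for nbrs in graph.values()) // 2
    print(len(graph), "nodes,", edge_count, "edges")
```

With `burn_prob=0` every new node links only to its seed and the result is a tree. Larger values let the fire reach further, so new nodes attach more densely.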

[Figure: NCP plot with curves for the rewired network and the bag of whiskers]

• Whiskers:
  - Largest whisker has ~100 nodes
  - Independent of network size
  - Dunbar number: a person can maintain social relationships with about 150 people
  - Bond vs. identity communities
• Core:
  - Newman et al. analyzed a 400k-node product network
    ▪ Largest community has 50% of the nodes
    ▪ Community was labeled “miscellaneous”

• The NCP plot is a way to analyze network community structure
• Our results agree with previous work on small networks
• But large networks are fundamentally different
• Large networks have a core-periphery structure
• Small, well-isolated communities blend into the core of the network as they grow