
Published by Amelia Wordell. Modified about 1 year ago.

1
Jure Leskovec (CMU); Kevin Lang, Anirban Dasgupta, and Michael Mahoney (Yahoo! Research)

2
Communities (also called clusters, groups, or modules)
Communities: sets of nodes with many connections inside and few to the outside (the rest of the network).
Assumption: networks are (hierarchically) composed of communities (modules).

3
How community-like is a set of nodes?
We want a measure that corresponds to intuition.
Conductance (normalized cut): Φ(S) = # edges cut / # edges inside
A small Φ(S) corresponds to a more community-like set of nodes.
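The score on this slide can be computed directly from an edge list. The following is a minimal sketch, assuming an undirected graph stored as a list of node pairs; the function name `community_score` and the toy graph are illustrative and not from the talk. It follows the slide's simplified score (cut edges divided by inside edges); the standard definition of conductance divides by the total degree of S instead.

```python
def community_score(edges, nodes):
    """Slide score: (# edges cut) / (# edges inside) for the node set `nodes`."""
    members = set(nodes)
    cut = inside = 0
    for u, v in edges:
        u_in, v_in = u in members, v in members
        if u_in and v_in:
            inside += 1          # both endpoints in S
        elif u_in or v_in:
            cut += 1             # exactly one endpoint in S
    if inside == 0:
        return float("inf")      # no internal edges: worst possible score
    return cut / inside


# Hypothetical toy graph: two triangles joined by a single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
triangle_score = community_score(edges, {0, 1, 2})  # 1 cut edge, 3 inside edges
pair_score = community_score(edges, {0, 1})         # 2 cut edges, 1 inside edge
```

In this toy graph the triangle scores lower than the pair, which matches the intuition that the triangle is the more community-like set.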

4
Score: Φ(S) = # edges cut / # edges inside

5
Bad community: Φ = 5/7 ≈ 0.7

6
Score: Φ(S) = # edges cut / # edges inside
Bad community: Φ = 5/7 ≈ 0.7
Better community: Φ = 2/5 = 0.4

7
Score: Φ(S) = # edges cut / # edges inside
Bad community: Φ = 5/7 ≈ 0.7
Better community: Φ = 2/5 = 0.4
Best community: Φ = 2/8 = 0.25

8
We define the network community profile (NCP) plot:
Plot the score of the best community of size k.
Search over all subsets of size k and find the best; for example, Φ(k=5) = 0.25.
The NCP plot is intractable to compute exactly, so we use an approximation algorithm.
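To make the definition concrete, here is a brute-force sketch that tries every subset of size k. It is only feasible for tiny graphs, which illustrates why an approximation algorithm is needed. The names `score` and `ncp_brute_force`, the toy graph, and the choice to stop at half the nodes are assumptions made for illustration, not part of the talk.

```python
from itertools import combinations


def score(edges, members):
    """Slide score: (# edges cut) / (# edges inside)."""
    cut = inside = 0
    for u, v in edges:
        u_in, v_in = u in members, v in members
        if u_in and v_in:
            inside += 1
        elif u_in or v_in:
            cut += 1
    return cut / inside if inside else float("inf")


def ncp_brute_force(edges):
    """Return {k: best (lowest) score over all node subsets of size k}."""
    nodes = sorted({n for edge in edges for n in edge})
    profile = {}
    # Assumption: only sizes up to half the graph are of interest.
    for k in range(1, len(nodes) // 2 + 1):
        profile[k] = min(score(edges, set(s)) for s in combinations(nodes, k))
    return profile


# Hypothetical toy graph: two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
profile = ncp_brute_force(edges)
```

Plotting `profile` with k on the x-axis and the score on the y-axis gives the NCP plot for this toy graph. The number of subsets grows combinatorially with the number of nodes, so this approach does not scale.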

9
Dolphin social network: two communities of dolphins.
[Figures: NCP plot; network]

10
Zachary’s university karate club social network.
During the study the club split into two.
The split (squares vs. circles) corresponds to cut B.
[Figures: NCP plot; network]

11
Collaborations between scientists in Networks.
[Figures: NCP plot; network]

12
Small social networks, geometric (grid-like) networks, and hierarchical networks all have a downward-sloping NCP plot.
[Figures: hierarchical network; geometric (grid-like) network]

13
Previously, researchers examined the community structure of small networks (~100 nodes).
We examined more than 70 different large social and information networks.
Large real-world networks look completely different!

14
Typical example: General relativity collaboration network (4,158 nodes, 13,422 edges)

15
[Figure: NCP plot; x-axis: community size, y-axis: community score]
The best community has 100 nodes.
Up to that size the communities get better and better; beyond it they get worse and worse.

16
Whiskers are responsible for the downward slope of the NCP plot.
A whisker is a set of nodes connected to the network by a single edge.
[Figure: NCP plot with the largest whisker marked]
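The whisker definition can be sketched as a search for bridge edges: an edge whose removal disconnects the graph separates a whisker from the rest. The code below is a minimal, unoptimized sketch, not the method used in the talk. It assumes a connected, simple, undirected graph, treats the smaller side of each bridge as the whisker, and keeps only whiskers not contained in a larger one. The name `find_whiskers` and the toy graph are hypothetical.

```python
from collections import defaultdict


def find_whiskers(edges):
    """Return maximal node sets attached to the rest of the graph by one edge."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    def reachable(start, banned):
        # Depth-first search that is not allowed to traverse the banned edge.
        seen, stack = {start}, [start]
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if {x, y} == banned or y in seen:
                    continue
                seen.add(y)
                stack.append(y)
        return seen

    candidates = []
    for u, v in edges:
        side = reachable(u, {u, v})
        if v not in side:                      # (u, v) is a bridge
            other = set(adj) - side
            candidates.append(frozenset(min(side, other, key=len)))

    # Drop whiskers that are strictly contained in a larger whisker.
    maximal = {c for c in candidates if not any(c < d for d in candidates)}
    return sorted(maximal, key=len, reverse=True)


# Hypothetical toy graph: a 4-clique core with two whiskers hanging off it.
core = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
whisker_a = [(0, 4), (4, 5), (5, 6), (4, 6)]   # triangle attached via (0, 4)
whisker_b = [(1, 7), (7, 8)]                   # path attached via (1, 7)
whiskers = find_whiskers(core + whisker_a + whisker_b)
```

Removing and re-testing every edge costs one graph traversal per edge, which is fine for a small example; a linear-time bridge-finding algorithm would be the natural choice for large networks.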

17
Each new edge inside the community costs more.
Each node has twice as many children as its parent: Φ = 2/1 = 2, Φ = 8/3 ≈ 2.7, Φ = 64/11 ≈ 5.8.
[Figure: NCP plot]
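The three scores on this slide can be reproduced with a short calculation. The sketch below assumes one particular reading of the figure: a tree whose root has one child, with the number of children per node doubling at each level, and a community that grows by absorbing one level at a time. That reading is an assumption, though it does yield exactly 2/1, 8/3, and 64/11. The function name is illustrative.

```python
from fractions import Fraction


def doubling_tree_scores(num_levels):
    """Scores (cut / inside) as a community absorbs successive tree levels."""
    level_size = 1    # nodes in the deepest level currently inside (the root)
    branching = 1     # children per node at that level
    inside = 0        # edges with both endpoints inside the community
    scores = []
    for _ in range(num_levels):
        level_size *= branching       # absorb the next level of nodes
        inside += level_size          # one new inside edge per absorbed node
        branching *= 2                # each node has twice as many children
        cut = level_size * branching  # edges to the level still outside
        scores.append(Fraction(cut, inside))
    return scores


scores = doubling_tree_scores(3)
as_floats = [round(float(s), 2) for s in scores]
```

Under this reading the cut grows much faster than the number of inside edges, so the score worsens as the community grows.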

18
Network structure: core-periphery (jellyfish, octopus).
Whiskers are responsible for good communities.
Denser and denser core of the network.
The core contains 60% of the nodes and 80% of the edges.

19
What if we allow cuts that give disconnected communities?
Cut all whiskers and compose communities out of them.
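Composing communities out of whiskers can be sketched as a small knapsack-style search. Each whisker contributes exactly one cut edge, so a bag of m whiskers with E inside edges in total has score m / E under the talk's cut/inside score. The code below is a sketch under stated assumptions: whiskers are given as hypothetical (number of nodes, number of inside edges) pairs, and the function name is illustrative. It finds the best score for every achievable bag size.

```python
from fractions import Fraction


def bag_of_whiskers_profile(whiskers):
    """whiskers: list of (num_nodes, num_inside_edges), each cut off by 1 edge.

    Returns {k: lowest score over bags of whiskers totalling exactly k nodes}.
    """
    # best[(k, m)] = most inside edges using exactly m whiskers and k nodes.
    best = {(0, 0): 0}
    for size, inside in whiskers:
        # Iterate over a snapshot so each whisker is used at most once.
        for (k, m), edges_so_far in list(best.items()):
            key = (k + size, m + 1)
            if best.get(key, -1) < edges_so_far + inside:
                best[key] = edges_so_far + inside

    profile = {}
    for (k, m), inside in best.items():
        if m == 0 or inside == 0:
            continue
        bag_score = Fraction(m, inside)   # m cut edges / total inside edges
        if k not in profile or bag_score < profile[k]:
            profile[k] = bag_score
    return profile


# Hypothetical whiskers: (nodes, inside edges).
example = [(3, 3), (2, 1), (4, 5)]
profile = bag_of_whiskers_profile(example)
```

In this example a 7-node bag made of the first and third whiskers scores 2/8 = 0.25, even though the bag itself is disconnected.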

20
[Figure: NCP plot; x-axis: community size, y-axis: community score; curves: connected communities, bag of whiskers]
We get better community scores when composing disconnected sets of whiskers.

21
Rewired network: a random network with the same degree distribution.
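One common way to build such a rewired network is repeated degree-preserving edge swaps: pick two edges (a, b) and (c, d) and replace them with (a, d) and (c, b). The slides do not say which procedure was used, so the following is a sketch of that general technique, with illustrative function names and a toy graph.

```python
import random


def degree_sequence(edges):
    """Sorted list of node degrees for an undirected edge list."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return sorted(degree.values())


def rewire(edges, num_swaps, seed=0):
    """Randomize an undirected simple graph while keeping every node's degree."""
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done = attempts = 0
    while done < num_swaps and attempts < 100 * num_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        a, b = edges[i]
        c, d = edges[j]
        if rng.random() < 0.5:
            c, d = d, c                      # randomize the swap orientation
        if len({a, b, c, d}) < 4:
            continue                         # would create a self-loop
        if frozenset((a, d)) in present or frozenset((c, b)) in present:
            continue                         # would create a duplicate edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {frozenset((a, d)), frozenset((c, b))}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges


# Hypothetical toy graph: two triangles joined by a single edge.
original = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
rewired = rewire(original, num_swaps=10, seed=1)
```

Each swap leaves every node with the same number of edges, so the degree distribution is unchanged while the community structure is scrambled.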

22
What is a good model that explains such network structure?
None of the existing models work. NCP plot shape by model:
Pref. attachment: flat
Small world: down and flat
Geometric pref. attachment: flat and down

23
Forest Fire: connections spread like a fire.
A new node joins the network.
It selects a seed node.
It connects to some of the seed's neighbors.
This continues recursively.
As a community grows, it blends into the core of the network.
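The steps on this slide can be sketched as a small generator. This is a simplified, undirected illustration and not the authors' exact procedure: each neighbor of a burned node is assumed to catch fire independently with probability `burn_prob`, and the function name, parameter names, and default values are hypothetical.

```python
import random


def forest_fire_graph(num_nodes, burn_prob=0.35, seed=0):
    """Grow an undirected graph; returns {node: set of neighbors}."""
    rng = random.Random(seed)
    adj = {0: set()}
    for new in range(1, num_nodes):
        # The new node selects a seed node uniformly at random.
        seed_node = rng.randrange(new)
        burned = {seed_node}
        frontier = [seed_node]
        # The fire spreads recursively to some neighbors of burned nodes.
        while frontier:
            node = frontier.pop()
            for nbr in adj[node]:
                if nbr not in burned and rng.random() < burn_prob:
                    burned.add(nbr)
                    frontier.append(nbr)
        # The new node connects to every node the fire reached.
        adj[new] = set()
        for node in burned:
            adj[new].add(node)
            adj[node].add(new)
    return adj


graph = forest_fire_graph(50, burn_prob=0.35, seed=42)
num_edges = sum(len(nbrs) for nbrs in graph.values()) // 2
```

A higher `burn_prob` lets the fire spread further, so new nodes attach to more of the existing network.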

24
[Figure: NCP plot; curves: rewired network, bag of whiskers]

25
Whiskers:
Largest whisker has ~100 nodes, independent of network size.
Dunbar number: a person can maintain social relationships with 150 people.
Bond vs. identity communities.
Core:
Newman et al. analyzed a 400k-node product network.
▪ The largest community has 50% of the nodes.
▪ The community was labeled “miscellaneous”.

26
The NCP plot is a way to analyze network community structure.
Our results agree with previous work on small networks.
But large networks are fundamentally different.
Large networks have a core-periphery structure.
Small, well-isolated communities blend into the core of the network as they grow.
