Clustering Social Networks

Clustering Social Networks, by Nina Mishra et al. Presented by Nam Nguyen

(α,β)-Cluster Definition
Given a graph G = (V,E) where every vertex has a self-loop, C ⊂ V is an (α,β)-cluster if:
1. Internally dense: ∀v ∈ C, |E(v,C)| ≥ β|C|
2. Externally sparse: ∀u ∈ V\C, |E(u,C)| ≤ α|C|
(Figure: a vertex v inside C with at least β|C| edges into C, and a vertex u outside C with at most α|C| edges into C.)
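The two conditions of the definition translate almost literally into code. The following is a minimal sketch, not code from the paper: the helper names `with_self_loops` and `is_alpha_beta_cluster` are hypothetical, and the small test graph (two 4-cliques sharing the vertex d) is an assumed stand-in for the figure on the example slide.

```python
from fractions import Fraction


def with_self_loops(vertices, edges):
    """Build an undirected adjacency map in which every vertex has a self-loop."""
    adj = {v: {v} for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def is_alpha_beta_cluster(adj, cluster, alpha, beta):
    """Check both conditions of the (alpha, beta)-cluster definition."""
    cluster = set(cluster)
    size = len(cluster)
    if size == 0:
        return False
    # Internally dense: every member sees at least beta*|C| vertices of C.
    dense = all(len(adj[v] & cluster) >= beta * size for v in cluster)
    # Externally sparse: every outsider sees at most alpha*|C| vertices of C.
    sparse = all(len(adj[u] & cluster) <= alpha * size
                 for u in adj if u not in cluster)
    return dense and sparse


# Assumed example graph: cliques {a,b,c,d} and {d,e,f,g} overlapping in d.
clique_1, clique_2 = "abcd", "defg"
edges = [(u, v) for q in (clique_1, clique_2) for u in q for v in q if u < v]
adj = with_self_loops("abcdefg", edges)

print(is_alpha_beta_cluster(adj, clique_1, Fraction(1, 4), 1))
print(is_alpha_beta_cluster(adj, clique_2, Fraction(1, 4), 1))
```

Exact fractions are used for α so that comparisons such as 1 ≤ (1/4)·4 are not affected by floating-point rounding.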

Example
{a,b,c,d} and {d,e,f,g} are (1/4, 1)-clusters. The vertices h and i do not fall into any (α,β)-cluster for 0 ≤ α < ½ < β ≤ 1; thus, they would not be clustered. (α,β)-clusters are able to detect overlapping clusters.

Problem definition
Objective: identify clusters that are internally dense, i.e., each vertex in the cluster is adjacent to at least a β-fraction of the cluster, and externally sparse, i.e., any vertex outside of the cluster is adjacent to at most an α-fraction of the vertices in the cluster.
Given 0 ≤ α < β ≤ 1, find all (α,β)-clusters in the network.

Contributions of the paper
Give a bound on the overlap of two (α,β)-clusters A and B: they overlap in at most |C|·min{1−(β−α), α/(2β−1)} vertices.
If the ratio of |A| and |B| is at most (1−α)/(1−β), then one cluster cannot be contained in the other.
Give a loose upper bound on the number of (α,1)-clusters of size s: $O\left((n/s)^{\alpha s+1}\right)$.
Introduce the ρ-champion of a cluster: if β > ½(1+ρ+α), there is a simple deterministic algorithm that finds all clusters with a ρ-champion in time $O(m^{0.7}n^{1.2} + n^{2+o(1)})$.
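The containment claim can be checked directly from the (α,β)-cluster definition. The short derivation below is a sketch that does not appear in the slides; it assumes A ⊂ B are both (α,β)-clusters and picks any vertex v ∈ B \ A.

```latex
\begin{align*}
\beta |B| &\le |E(v,B)|
  && \text{($v \in B$, internal density of $B$)} \\
&\le |E(v,A)| + |B \setminus A|
  && \text{(each neighbor of $v$ in $B$ lies in $A$ or in $B \setminus A$)} \\
&\le \alpha |A| + |B| - |A|
  && \text{($v \notin A$, external sparsity of $A$)}
\end{align*}
% Rearranging: (1 - \alpha)|A| \le (1 - \beta)|B|,
% i.e. |B|/|A| \ge (1 - \alpha)/(1 - \beta) whenever \beta < 1.
```

So containment forces the size ratio |B|/|A| up to at least (1−α)/(1−β); for ratios below that threshold, neither cluster can contain the other.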

Some minor remarks
β → 1: the cluster C tends to a clique.
α → 0: C tends to a disconnected component.
β < ½: C might contain two disconnected components, so we want α < β and β > ½.
(0, β)-clusters: find the connected components and output the β-connected ones.
(1−1/n, 1)-clusters: find the maximal cliques in the graph.
((1−ε)β, β)-clusters: find quasi-cliques.

Result 1
Question: what about the intersection of three (or more) (α,β)-clusters of the same size? Of different sizes? What about the intersection of an (α,β)-cluster and an (α′,β′)-cluster of the same size? Of different sizes?

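Result 2: Bounding the number of (α,1)-clusters
Proof: Let 𝒞 be the family of (α,1)-clusters of size s. Two clusters of the same size s can share at most αs vertices, so every subset of size αs+1 appears in at most one set in 𝒞. There are $\binom{n}{s}$ subsets of s elements from n elements, and each of these contains $\binom{s}{\alpha s+1}$ subsets of size αs+1. Since the n vertices have only $\binom{n}{\alpha s+1}$ subsets of size αs+1 in total, we can have at most $|\mathcal{C}| \le \binom{n}{\alpha s+1} \big/ \binom{s}{\alpha s+1}$ clusters in 𝒞.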

This bound is tight when α = 0 and when α → 1 (α = (n−1)/n).
When α = 0: there is no overlapping, so the number of clusters of size s is n/s.
When α → 1 (α = (n−1)/n): consider the complement of the following graph [figure not reproduced]. Let s = n = N/2; then the bound is $2^n$. In fact, we do have $2^n$ (α,1)-clusters of size n, obtained by choosing sets of the form B = {b1, b2, …, bn | bi is either xi or yi}.
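The α → 1 case can be checked by brute force for small n. The sketch below rests on an assumption, since the figure is missing from the transcript: the graph is taken to be a perfect matching between x1…xn and y1…yn, so that its complement joins every vertex to all others except its partner. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def complement_of_perfect_matching(n):
    """Adjacency (with self-loops) of the complement of the matching x_i - y_i."""
    verts = [("x", i) for i in range(n)] + [("y", i) for i in range(n)]
    adj = {}
    for side, i in verts:
        partner = ("y" if side == "x" else "x", i)
        # Adjacent to everything except the partner; includes the self-loop.
        adj[(side, i)] = {u for u in verts if u != partner}
    return adj


def count_alpha_one_clusters(adj, s, alpha):
    """Count size-s vertex sets that are (alpha, 1)-clusters, by enumeration."""
    count = 0
    for cand in combinations(adj, s):
        c = set(cand)
        dense = all(len(adj[v] & c) == s for v in c)          # beta = 1
        sparse = all(len(adj[u] & c) <= alpha * s
                     for u in adj if u not in c)
        if dense and sparse:
            count += 1
    return count


for n in (2, 3, 4):
    adj = complement_of_perfect_matching(n)
    found = count_alpha_one_clusters(adj, n, Fraction(n - 1, n))
    print(n, found, 2 ** n)
```

Under this assumed construction, a valid cluster must pick exactly one of xi, yi for every i, which is why the count matches the number of choices for B.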

An algorithm for finding clusters with champions
Why? In the last example, each vertex has as many neighbors outside the cluster as within it. There is no vertex that "champions" the cluster (i.e., has more friends inside than outside). Why not find a vertex that champions the cluster and start with it?

Algorithm (cont'd)
Assumption: a big gap between β and α/2: β > ½ + (α+ρ)/2.
Why? Recall the last example: we have $2^n$ possible clusters of size n, which is too many. Any algorithm that outputs more clusters than nodes is undesirable. Thus, we need some restriction to reduce the number of returned clusters.

Algorithm (cont'd)
How many clusters with ρ-champions should we have, given a big gap between β and α/2 (β > ½ + (α+ρ)/2)? How do we find them?

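Algorithm (cont'd)
If v and c have sufficiently many neighbors in common, then v is part of the cluster C that c champions; that is what line #5 of the algorithm is for.
Running time of the algorithm: $O(m^{0.7}n^{1.2} + n^{2+o(1)})$, the bound quoted in the contributions.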
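The pseudocode that "line #5" refers to did not survive in this transcript, so the following is only a rough sketch of the idea described above, under stated assumptions: candidates for a champion c are the vertices within two hops of c, a vertex v is added when it shares at least (2β − 1)·s neighbors with c, and the resulting set is kept only if it has size s and passes the (α,β)-cluster check. The threshold, the size check, and the function names are assumptions, not details confirmed by the slides.

```python
def is_cluster(adj, cluster, alpha, beta):
    """(alpha, beta)-cluster test; adj must already contain self-loops."""
    size = len(cluster)
    dense = all(len(adj[v] & cluster) >= beta * size for v in cluster)
    sparse = all(len(adj[u] & cluster) <= alpha * size
                 for u in adj if u not in cluster)
    return size > 0 and dense and sparse


def find_champion_clusters(adj, s, alpha, beta):
    """Try each vertex c as a champion and grow a candidate cluster around it."""
    found = set()
    threshold = (2 * beta - 1) * s          # assumed common-neighbor threshold
    for c in adj:
        # Vertices within two hops of c (self-loops keep c and its neighbors).
        two_hop = set().union(*(adj[w] for w in adj[c]))
        # Assumed role of "line #5": add v when it shares enough neighbors with c.
        candidate = frozenset(v for v in two_hop
                              if len(adj[v] & adj[c]) >= threshold)
        if len(candidate) == s and is_cluster(adj, candidate, alpha, beta):
            found.add(candidate)
    return found


# Toy input: two disjoint triangles, each vertex with a self-loop.
triangles = [(0, 1, 2), (3, 4, 5)]
adj = {v: set(t) for t in triangles for v in t}
print(find_champion_clusters(adj, s=3, alpha=0, beta=1))
```

On the toy input each triangle is recovered once, even though all three of its vertices act as champions, because results are stored as a set of frozensets.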

Experimental Results (real networks)
Questions: Do (α,β)-clusters with ρ-champions exist? (Checked using Tsukiyama's algorithm.) If they do exist, do most (α,β)-clusters have ρ-champions?
Results: Able to find ~90% of the maximal cliques in graphs where α ≤ ½. No strong ρ-champions in the missed clusters. Running time: far faster than Tsukiyama's algorithm.
Datasets: High Energy Physics Theory co-author graph (HEP), Theory co-author graph (TA), and a subset of the Live Journal graph (LP).

Results

References
[1] Nina Mishra, Robert Schreiber, Isabelle Stanton, and Robert E. Tarjan. Clustering Social Networks (2007).