Clusters Recognition from Large Small World Graph Igor Kanovsky, Lilach Prego Emek Yezreel College, Israel University of Haifa, Israel.

Presentation transcript:

Clusters Recognition from Large Small World Graph. Igor Kanovsky, Lilach Prego. Emek Yezreel College, Israel; University of Haifa, Israel. © Igor Kanovsky, Lilach Prego. Graph2004, Haifa, May 2004.

Small World Definition (1). Def. 1. The characteristic path length L(G) of a graph G = (V, E) is the average length of the shortest path between two vertices in G. Def. 2. The clustering coefficient C(G) of a graph G = (V, E) is the average of the clustering coefficients C(v) of its vertices: $C(G) = \frac{1}{|V|} \sum_{v \in V} C(v)$.
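Def. 1 can be sketched with a breadth-first search from every vertex. This is a minimal illustration, assuming an unweighted graph stored as an adjacency dictionary; the function names are hypothetical, and skipping unreachable pairs is a choice made here, not something the slides specify.

```python
from collections import deque


def bfs_distances(adj, source):
    """Return hop distances from source to every vertex reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def characteristic_path_length(adj):
    """Average shortest-path length over ordered pairs of distinct vertices.

    Assumption: pairs with no connecting path are skipped.
    """
    total = 0
    pairs = 0
    for v in adj:
        for u, d in bfs_distances(adj, v).items():
            if u != v:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0


# Path graph 0 - 1 - 2 - 3 and a triangle, as small checks.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
L_path = characteristic_path_length(path)
L_triangle = characteristic_path_length(triangle)
```

Running one BFS per vertex costs on the order of |V|·(|V| + |E|) steps, so for very large graphs L(G) would typically be estimated from a sample of source vertices instead.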

Small World Definition (2). Def. 3. The clustering coefficient of a vertex v is $C(v) = \frac{|E(G[N(v)])|}{\binom{k(v)}{2}} = \frac{2\,|E(G[N(v)])|}{k(v)\,(k(v)-1)}$, where k(v) is the number of edges incident to v (the degree of v) and N(v) is the neighborhood of v. That is, the clustering coefficient C(v) of a vertex v is the density of the subgraph G[N(v)] induced by N(v).
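The vertex-level definition (Def. 3) and its average can be sketched as follows. This is an illustration only: the adjacency-set representation and function names are assumptions, and returning 0 for vertices of degree below 2 is a convention chosen here because the density is undefined in that case.

```python
from itertools import combinations


def local_clustering(adj, v):
    """Density of the subgraph induced by the neighbours of v."""
    neighbours = adj[v]
    k = len(neighbours)
    if k < 2:
        return 0.0  # assumption: undefined case treated as 0
    # Count edges between pairs of neighbours of v.
    links = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    return links / (k * (k - 1) / 2)


def average_clustering(adj):
    """C(G): mean of C(v) over all vertices."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)


# Triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
c2 = local_clustering(adj, 2)
c_graph = average_clustering(adj)
```

In this example vertex 2 has three neighbours with one edge among them, so C(2) = 1/3, and C(G) = (1 + 1 + 1/3 + 0)/4 = 7/12.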

Small World Definition (3). Def. 4. A small-world graph is a graph G(V, E) with $L \sim L_R$ and $C \gg C_R$, where $L_R$ and $C_R$ are the characteristic path length and clustering coefficient of a random graph $G_R(V_R, E_R)$ with $|V_R| = |V|$ and $|E_R| = |E|$. Many real-world graphs are small-world graphs: 1. Social relationships. 2. Business (organization) collaborations. 3. The Web and the Internet. 4. Biological data (DNA structure, cell metabolism, etc.).

The Watts and Strogatz Small World model. In 1998, Watts and Strogatz brought the small-world phenomenon to the attention of researchers in various fields by proposing a simple small-world (SW) model.

The WS Small World model (2). The WS model does not succeed in capturing the properties of real-world graphs.

The Web as a graph. The known significant properties of the Web as a graph are: 1. Small-world topology. 2. Power-law distributions. 3. Bipartite cliques. 4. "Bow-tie" shape. A huge digraph whose statistical characteristics are similar to those of the Web graph is called a Web-like graph.

The Web as a Small World. Lada A. Adamic. The Small World Web.

Power-law distributions (PLD). PLD of in- and out-degrees of vertices: the number of web pages having k in-links to the page or k out-links from the page is proportional to $k^{-\gamma}$ for some constants $\gamma_{in}, \gamma_{out} > 2$. Andrei Broder, Ravi Kumar and others. Graph structure in the web. 2001.

Bipartite Small Cores. There are many small bipartite cores $C_{i,j}$ (with i, j ≥ 3) in the Web graph (a random graph does not have small cliques). A bipartite core $C_{i,j}$ is a graph on i + j nodes that contains at least one bipartite clique $K_{i,j}$ as a subgraph (the slide illustrates $K_{3,3}$). These small cliques are the cores of web communities: sets of connected sites with a common content topic.
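A brute-force search for such cores can be sketched as below. Two assumptions are made here that the slide does not state: links are directed, with i "fan" pages each linking to all j "center" pages, and every j-subset of pages is enumerated, which is only feasible for tiny graphs. The function names are hypothetical.

```python
from itertools import combinations


def fans_of(out_links, centers):
    """Pages (outside `centers`) that link to every page in `centers`."""
    centers = set(centers)
    return {p for p, targets in out_links.items()
            if p not in centers and centers <= set(targets)}


def find_cores(out_links, i, j):
    """Return (fans, centers) for every j-set of centers with >= i common fans."""
    pages = set(out_links)
    for targets in out_links.values():
        pages.update(targets)
    result = []
    for centers in combinations(sorted(pages), j):
        fans = fans_of(out_links, centers)
        if len(fans) >= i:
            result.append((sorted(fans), list(centers)))
    return result


# Pages a, b, c all link to x, y, z; page d links only to x.
out_links = {
    "a": ["x", "y", "z"],
    "b": ["x", "y", "z"],
    "c": ["x", "y", "z", "w"],
    "d": ["x"],
}
cores = find_cores(out_links, 3, 3)
```

Here the only core with i = j = 3 has fans a, b, c and centers x, y, z. On a real Web crawl the enumeration would have to be replaced by pruning techniques, which are outside the scope of this sketch.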

Bipartite Small Cores (2). Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan and Andrew Tomkins. Extracting large-scale knowledge bases from the web. Figure: number of $C_{i,j}$ as a function of i and j.

"Bow-tie" shape. The Web graph has a "bow-tie" shape. The major part of web pages can be divided into four sets: a core made up of the strongly connected component (SCC), i.e. pages that are mutually reachable from each other; two sets (upstream and downstream) made up of the pages that can only reach, or only be reached from, the pages in the core; and a set (tendrils) containing pages that can neither reach nor be reached from the core.
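The four sets can be computed with one forward and one backward reachability search. This is a minimal sketch that assumes a vertex known to lie in the core is supplied by the caller; the function names and the dictionary layout are hypothetical.

```python
from collections import deque


def reachable(adj, start):
    """Set of vertices reachable from start by following edges in adj."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj.get(u, ()):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen


def bow_tie(out_links, core_vertex):
    """Split vertices into core, upstream, downstream and tendrils."""
    vertices = set(out_links)
    for targets in out_links.values():
        vertices.update(targets)
    # Build the reversed graph to find pages that can reach the core.
    reverse = {v: [] for v in vertices}
    for u, targets in out_links.items():
        for w in targets:
            reverse[w].append(u)
    forward = reachable(out_links, core_vertex)
    backward = reachable(reverse, core_vertex)
    core = forward & backward
    return {
        "core": core,
        "upstream": backward - core,
        "downstream": forward - core,
        "tendrils": vertices - forward - backward,
    }


out_links = {
    "in1": ["c1"], "c1": ["c2"], "c2": ["c1", "out1"],
    "out1": [], "t1": ["t2"], "t2": [],
}
parts = bow_tie(out_links, "c1")
```

In the example, c1 and c2 form the core, in1 is upstream, out1 is downstream, and t1 and t2 are disconnected from the core.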

Small World Graph Clustering. The aim is to find subgraphs with a high density of links, or to find real communities in real graphs. There are many approaches: the hub-and-authority method of Kleinberg, the edge-betweenness method of Newman and Girvan, the local density method of Virtanen, minimum spanning trees, spectral methods, and other traditional clustering methods. Problems: how to define a cluster, and how to recognize a cluster in a huge graph.

Small World Graph Clustering. Definition: Two vertices $v_1$, $v_2$ belong to the same cluster if [condition not preserved in the transcript], where β is the edge-weighting parameter (proximity) and α is the level of cluster separation for a small-world graph. Definition: A set of vertices that belong to the same cluster is called a cluster.

Iterated Clustering Algorithm (ICA).
Input: undirected graph G = (V, E), level of cluster separation α.
Output: clustering {C_1, C_2, ..., C_k}.
Method:
  i = 0;
  while (V is not empty) {
    find an arbitrary cluster C in G[V];
    i++; C_i = C; V = V - C;
  }
  k = i;
Advantage: it is not necessary to analyze the whole graph to find some local clusters.

Algorithm for finding an arbitrary cluster.
Input: undirected graph G = (V, E), level of cluster separation α.
Output: a cluster C.
Method:
  C = ∅;
  put an arbitrary vertex v into queue Q;
  while (Q is not empty) {
    get vertex u from Q; add u to C;
    for each w [range not preserved in the transcript] {
      if [condition not preserved in the transcript] then put w into Q;
    }
  }
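The two procedures above can be sketched together as follows. The membership condition is not preserved in the transcript, so this sketch takes the proximity measure as a parameter and uses a Jaccard overlap of closed neighbourhoods purely as a hypothetical stand-in; it is not the authors' criterion. Iterating over the neighbours of u and the `queued` set, which keeps a vertex from entering the queue twice, are likewise assumptions added here.

```python
from collections import deque


def jaccard_proximity(adj, u, w):
    """Hypothetical stand-in for the proximity measure on the slides."""
    a = adj[u] | {u}
    b = adj[w] | {w}
    return len(a & b) / len(a | b)


def find_cluster(adj, remaining, start, alpha, proximity):
    """Grow one cluster from `start` inside the subgraph induced by `remaining`."""
    cluster = set()
    queued = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        cluster.add(u)
        for w in adj[u]:
            if (w in remaining and w not in queued
                    and proximity(adj, u, w) >= alpha):
                queued.add(w)
                queue.append(w)
    return cluster


def iterated_clustering(adj, alpha, proximity=jaccard_proximity):
    """ICA loop: extract clusters until no vertices remain."""
    remaining = set(adj)
    clusters = []
    while remaining:
        start = min(remaining)  # deterministic choice of the "arbitrary" vertex
        cluster = find_cluster(adj, remaining, start, alpha, proximity)
        clusters.append(cluster)
        remaining -= cluster
    return clusters


# Two triangles joined by the single edge 2-3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
       3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
clusters = iterated_clustering(adj, alpha=0.5)
```

With α = 0.5 the bridge edge 2-3 has proximity 1/3 and is not followed, so the two triangles come out as separate clusters. Because each cluster is grown locally from its start vertex, a single call to `find_cluster` touches only the vertices near that cluster, which matches the advantage claimed for ICA.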

ICA properties. ICA has polynomial complexity of order $z^2 \cdot |V|$, where z = |E|/|V| is the average edge density, so it is applicable to real-world SW graphs and is better than other clustering algorithms that rely on graph connectivity. The intuition behind ICA is based on the large clustering coefficient of SW graphs. By Def. 3, the number of edges in G[N(v)] is $C(v)\,\binom{k(v)}{2} = C(v)\,k(v)\,(k(v)-1)/2$.

ICA evaluation. ICA was tested on the simplest clustered SW graphs, generated by the Watts and Strogatz model. The model is a set of SW chains with a number of random inter-cluster edges. As the next step, the algorithm will be applied to various real SW graphs and to more realistic models.

Web-like Graph Modeling. The aim is to find stochastic processes that yield web-like graphs. Our integrated approach is based on well-known Web graph models, extended to satisfy all the statistical properties mentioned above. We try to keep the web-like graph model as simple as possible, so it must have a minimal set of parameters.

Extended scale-free model (1). At each time step, a new vertex is added and is connected to existing vertices through a random number m (≤ z) of new edges, where the average number of edges per node (z) is constant for a growing graph. The probability that an existing vertex gains an edge is proportional to its in-degree.

Extended scale-free model (2). Simultaneously, z − m directed edges are distributed among all the vertices in the graph by the following rules: (i) the source is chosen with a probability proportional to its out-degree; (ii) the target is chosen with a probability proportional to its in-degree. The model has 3 parameters: the average degree z, and the initial attractiveness of a vertex for gaining in- and out-edges, $A_{in}$ and $A_{out}$.
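The growth process described on the two slides can be sketched as a small simulation. Several details are assumptions made here, not taken from the slides: m is drawn uniformly from 1..z, the attractiveness is added to the degree when computing selection weights (so $A_{in}$ and $A_{out}$ must be positive), new edges point from the new vertex to existing ones, and self-loops and repeated edges are allowed. The function names are hypothetical.

```python
import random


def weighted_pick(rng, vertices, degree, attractiveness):
    """Pick a vertex with probability proportional to degree + attractiveness."""
    weights = [degree[v] + attractiveness for v in vertices]
    return rng.choices(vertices, weights=weights, k=1)[0]


def extended_scale_free(n, z, a_in, a_out, seed=0):
    """Grow a directed graph on n vertices, adding z edges per time step."""
    rng = random.Random(seed)
    in_deg = {0: 0}
    out_deg = {0: 0}
    edges = []
    for new in range(1, n):
        existing = list(in_deg)
        m = rng.randint(1, z)  # assumption: m uniform on 1..z
        # m edges from the new vertex to existing vertices (in-degree preference).
        for _ in range(m):
            target = weighted_pick(rng, existing, in_deg, a_in)
            edges.append((new, target))
            in_deg[target] += 1
        in_deg[new] = 0
        out_deg[new] = m
        # z - m edges distributed among all vertices, including the new one.
        everyone = existing + [new]
        for _ in range(z - m):
            source = weighted_pick(rng, everyone, out_deg, a_out)
            target = weighted_pick(rng, everyone, in_deg, a_in)
            edges.append((source, target))
            out_deg[source] += 1
            in_deg[target] += 1
    return edges, in_deg, out_deg


edges, in_deg, out_deg = extended_scale_free(200, 4, a_in=2.0, a_out=6.0, seed=1)
```

Every time step adds exactly z edges, so the average number of edges per vertex stays close to z as the graph grows, which is the constraint stated on the first slide of the model.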

Simulation results: in-degree distribution. Figure: our model (N = 30K, z = 8, $A_{in}$ = 2, $A_{out}$ = 6) compared with the Web (N = 500M).

Characteristics of several web-like models (NA: not applicable).

Thank you. For contacts: igor kanovsky,