CS728 Lecture 5 Generative Graph Models and the Web.


Importance of Generative Models
Generative models give insight into the graph formation process:
– Anomaly detection: abnormal behavior, evolution
– Predictions: predicting the future from the past
– Simulations and evaluation of new algorithms
– Graph sampling: many real-world graphs, like the web, are too large and complex to deal with
– Goal: generating graphs with the small-world property, clustering, power laws, and other naturally occurring structures

Graph Models: Waxman Models
Used for models of clustering in Internet-like topologies and networks with long and short edges
The vertices are distributed at random in a plane.
An edge is added between each pair of vertices u, v with probability $p(u,v) = \alpha \exp(-d / (\beta L))$, where $0 < \alpha, \beta \le 1$.
L is the maximum distance between any two nodes.
An increase in alpha increases the number of edges in the graph.
An increase in beta increases the number of long edges relative to short edges.
Waxman-1: d is the Euclidean distance from u to v.
Waxman-2: d is a random number in [0, L].
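The Waxman-1 rule can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes the edge probability alpha * exp(-d / (beta * L)), places points uniformly in the unit square, and uses the hypothetical function names `waxman_probability` and `waxman_graph`.

```python
import math
import random


def waxman_probability(d, L, alpha, beta):
    """Edge probability alpha * exp(-d / (beta * L)) for a pair at distance d."""
    return alpha * math.exp(-d / (beta * L))


def waxman_graph(n, alpha, beta, seed=None):
    """Waxman-1 sketch for n >= 2 nodes; the unit square is an assumption."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n)]
    pairs = [(u, v) for u in range(n) for v in range(u + 1, n)]
    # Euclidean distance for every unordered pair of nodes.
    dist = {
        (u, v): math.hypot(points[u][0] - points[v][0],
                           points[u][1] - points[v][1])
        for u, v in pairs
    }
    L = max(dist.values())  # maximum distance between any two nodes
    edges = [
        pair for pair in pairs
        if rng.random() < waxman_probability(dist[pair], L, alpha, beta)
    ]
    return points, edges
```

Raising `alpha` scales every pair's probability up, while raising `beta` flattens the exponential decay so that distant pairs are penalized less.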

Graph Models: Configuration Model
Random graph from a given degree sequence
Problem: given a degree sequence d1, d2, d3, …, dn, generate a random graph with that degree sequence
Solution:
– Place di stubs on vertex i
– Choose pairs of stubs at random and join each pair with an edge
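A minimal sketch of the stub-matching idea, assuming vertices are numbered from 0 and that the degree sum is even. The function name `configuration_model` is chosen here for illustration. Shuffling the stub list and pairing consecutive entries is one simple way to choose pairs of stubs at random.

```python
import random


def configuration_model(degrees, seed=None):
    """Pair up stubs at random; may produce self-loops and multi-edges."""
    if sum(degrees) % 2 != 0:
        raise ValueError("the degree sum must be even")
    rng = random.Random(seed)
    # Vertex i contributes degrees[i] stubs.
    stubs = [v for v, d in enumerate(degrees) for _ in range(d)]
    rng.shuffle(stubs)
    # Consecutive stubs in the shuffled list form one edge each.
    return list(zip(stubs[::2], stubs[1::2]))
```

Every vertex ends up with exactly its prescribed degree (a self-loop counts twice), but nothing prevents a stub from being paired with another stub of the same vertex or with a vertex it is already joined to.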

Problem: we may construct graphs with loops and multi-edges
To prevent this there must be enough “absorbing” residual degree capacity.
Algorithm:
Maintain a list of nodes sorted by residual degree d(v)
Repeat until all nodes have been chosen:
– pick an arbitrary vertex v
– add edges from v to the d(v) vertices of highest residual degree
– update residual degrees
To randomize further, we can start with a realization and repeatedly 2-swap pairs of edges (u,v), (s,t) to (u,t), (s,v)
Works OK, but is there a more ‘natural’ generative model?
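The greedy construction and the 2-swap randomization can be sketched as follows. This is an illustration under stated assumptions: "arbitrary vertex" is taken to mean index order, the candidate list is re-sorted at each step instead of being maintained incrementally, and swaps that would create a loop or a multi-edge are skipped. The function names are hypothetical.

```python
import random


def simple_graph_from_degrees(degrees):
    """Greedy realization of a degree sequence without loops or multi-edges."""
    residual = list(degrees)
    unchosen = set(range(len(degrees)))
    edges = []
    for v in range(len(degrees)):  # "arbitrary" vertex: index order
        unchosen.discard(v)
        k = residual[v]
        # The k unchosen vertices with the highest residual degree.
        targets = sorted(unchosen, key=lambda u: (-residual[u], u))[:k]
        if len(targets) < k or any(residual[u] == 0 for u in targets):
            raise ValueError("greedy construction failed for this sequence")
        for u in targets:
            edges.append((v, u))
            residual[u] -= 1
        residual[v] = 0
    return edges


def randomize_by_swaps(edges, n_swaps, seed=None):
    """Attempt n_swaps 2-swaps (u,v),(s,t) -> (u,t),(s,v), keeping degrees."""
    rng = random.Random(seed)
    edge_list = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edge_list}
    if len(edge_list) < 2:
        return edge_list
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        (u, v), (s, t) = edge_list[i], edge_list[j]
        if u == t or s == v:
            continue  # the swap would create a self-loop
        a, b = frozenset((u, t)), frozenset((s, v))
        if a in present or b in present:
            continue  # the swap would create a multi-edge
        present -= {frozenset((u, v)), frozenset((s, t))}
        present |= {a, b}
        edge_list[i], edge_list[j] = (u, t), (s, v)
    return edge_list
```

Each accepted swap removes one edge at every endpoint and adds one back, so the degree sequence is unchanged while the wiring is shuffled.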

Generative Graph models: Preferential attachment
Price’s Model [65]: physics citations, “cumulative advantage”
Herb Simon [50’s]: Nobel and Turing Awards, political scientist; “rich get richer” (Pareto)
Matthew effect / Matilda effect: sociology
Barabasi and Albert 99: preferential attachment:
– Add a new node, create d out-links
– The probability of linking to a node is proportional to its current degree
Simple explanation of power-law degree distributions

Issues with preferential attachment and Power-laws
Barabasi model: fixed constant m for the out-degree
Price’s model: directed, with mean out-degree m
The probability that a new edge attaches to a node is proportional to the node's (in-)degree k
– Problem: at the start every degree is 0
– Price’s model: proportional to degree + 1
– Analysis: the probability $p_k$ that a node has degree k is
$p_k \sim k^{-3}$ (Barabasi model)
$p_k \sim k^{-(2+1/m)}$, a power law with exponent between 2 and 3 (Price)
Exercise: give pseudocode that generates such a graph in linear time
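One possible approach to the exercise is sketched below. It keeps a pool in which every node appears once plus once per in-link it has received, so a uniform draw from the pool selects a node with probability proportional to in-degree + 1 in constant time. As a simplification, each new node gets a fixed out-degree m (fewer for the first few nodes), whereas Price's model only fixes the mean. The name `price_model` is hypothetical.

```python
import random


def price_model(n, m, seed=None):
    """Directed preferential attachment with weight (in-degree + 1)."""
    rng = random.Random(seed)
    pool = [0]   # node 0 appears once: in-degree 0, plus 1
    edges = []   # (new_node, cited_node) pairs
    for v in range(1, n):
        k = min(m, v)  # at most v distinct earlier nodes exist
        chosen = set()
        while len(chosen) < k:
            # Uniform draw from the pool = draw proportional to in-degree + 1.
            chosen.add(rng.choice(pool))
        for u in sorted(chosen):
            edges.append((v, u))
            pool.append(u)  # u gained one in-link
        pool.append(v)      # the "+ 1" entry for the new node
    return edges
```

The pool grows by at most m + 1 entries per node, so memory is linear in the number of edges. Redrawing duplicates adds little overhead once the graph is large, although this sketch does not prove a strict worst-case bound.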

Variations on the PA Theme
– Clustering, Small-World and Ageing
– Copying Model
– Alpha and Beta Models
– Temporal Evolution
– Densification

Graph models: Copying model
Copying model [Kleinberg, Kumar, Raghavan, Rajagopalan and Tomkins, 99]:
– Add a node and choose the number of edges to add
– Choose a random vertex and “copy” its links (neighbors)
Also generates power-law degree distributions
Generates communities (clustering)
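A rough sketch of the copying step follows. Several details are assumptions that the slide does not specify: every node gets the same number d of out-links, the process starts from d + 1 nodes that all link to each other, and each link is copied from the prototype with probability `copy_prob` and otherwise pointed at a uniformly random earlier node. Duplicate links are allowed in this sketch.

```python
import random


def copying_model(n, d, copy_prob, seed=None):
    """Grow a directed graph by copying links of a random prototype node."""
    rng = random.Random(seed)
    # Assumed seed graph: d + 1 nodes, each linking to all the others.
    out = {v: [u for u in range(d + 1) if u != v] for v in range(d + 1)}
    for v in range(d + 1, n):
        prototype = rng.randrange(v)  # random existing vertex
        links = []
        for i in range(d):
            if rng.random() < copy_prob:
                links.append(out[prototype][i])  # copy the i-th link
            else:
                links.append(rng.randrange(v))   # uniform earlier node
        out[v] = links
    return out
```

Nodes that already have many in-links are more likely to be the target of a copied link, which is the intuition behind the power-law degrees. Nodes that copy the same prototype also share neighbors, which produces the community structure.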

Graph Models: The Alpha Model
Watts (1999) α model: add edges to nodes, as in random graphs, but make links more likely when two nodes have a common friend.
For a range of α values:
– The world is small (average path length is short), and
– Groups tend to form (high clustering coefficient).
[Plot: probability of linkage as a function of the number of mutual friends; α is 0 in the upper-left curve, 1 on the diagonal, and ∞ in the bottom-right curve]

Graph Models: The Beta Model
Watts and Strogatz (1998): “Link Rewiring”
– β = 0: people know their neighbors. Clustered, but not a “small world”
– Intermediate β: people know their neighbors, and a few distant people. Clustered and “small world”
– β = 1: people know others at random. Not clustered, but “small world”

Graph Models: The Beta Model
The first five random links reduce the average path length of the network by half, regardless of N!
Both the α and β models reproduce the short-path results of random graphs, but also allow for clustering.
Small-world phenomena occur at the threshold between order and chaos.
[Plot, Watts and Strogatz (1998): clustering coefficient (C) and normalized average path length (L) plotted against β]
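The link-rewiring procedure can be sketched as below. The sketch assumes a ring lattice in which every node starts connected to its k nearest neighbors (k even, n larger than k), and rewires the far end of each lattice edge with probability beta to a uniformly chosen node that is not already a neighbor. The name `watts_strogatz` and the adjacency-set representation are illustrative choices.

```python
import random


def watts_strogatz(n, k, beta, seed=None):
    """Ring lattice with k neighbors per node, rewired with probability beta."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    # Regular ring lattice: link each node to k // 2 neighbors on each side.
    for v in range(n):
        for j in range(1, k // 2 + 1):
            u = (v + j) % n
            adj[v].add(u)
            adj[u].add(v)
    # Visit each lattice edge (v, v + j) once and maybe rewire its far end.
    for j in range(1, k // 2 + 1):
        for v in range(n):
            u = (v + j) % n
            if u in adj[v] and rng.random() < beta:
                candidates = [w for w in range(n)
                              if w != v and w not in adj[v]]
                if candidates:
                    w = rng.choice(candidates)
                    adj[v].discard(u)
                    adj[u].discard(v)
                    adj[v].add(w)
                    adj[w].add(v)
    return adj
```

With beta = 0 the lattice is returned unchanged, and with beta = 1 every lattice edge is rewired. Rewiring never changes the number of edges.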

Other Related Work
Hybrid models: Beta + Waxman on grid
Huberman and Adamic, 1999: Growth dynamics of the World Wide Web
– Argue against the Barabasi model for its age dependence
Kumar, Raghavan, Rajagopalan, Sivakumar and Tomkins, 1999: Stochastic models for the web graph
Watts, Dodds, Newman, 2002: Identity and search in social networks
Medina, Lakhina, Matta, and Byers, 2001: BRITE: An Approach to Universal Topology Generation
…

Statistics
Statistics of common networks: N - nodes, K - degree, D - distance, C - clique fraction
[Table: N, K, D and C for the Actors, Power grid and C. elegans networks]
Large k = large c?
Small c = large d?

Modeling Ageing and Temporal Evolution
N(t): nodes at time t
E(t): edges at time t
Suppose that N(t+1) = 2 * N(t)
Q: What is your guess for E(t+1)? Is it 2 * E(t)?
A: More than doubled?

Temporal Evolution of Graphs
Densification Power Law
– Networks become denser over time
– The number of edges grows faster than the number of nodes, so the average degree is increasing
$E(t) \propto N(t)^a$, where a is the densification exponent, or equivalently $\log E(t)$ is linear in $\log N(t)$ with slope a

Graph Densification
Densification Power Law
Densification exponent: 1 ≤ a ≤ 2:
– a=1: linear growth, constant out-degree (assumed in the literature so far)
– a=2: quadratic growth, clique
Let’s see the real graphs!
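One way to estimate the exponent from a series of snapshots is a least-squares line fit of log E(t) against log N(t). The sketch below assumes the power-law form E(t) ∝ N(t)^a implied by the exponent range above. The function names and the synthetic snapshot counts are illustrative only and are not taken from any dataset in the lecture.

```python
import math


def densification_exponent(nodes, edges):
    """Slope of the least-squares line through (log N(t), log E(t))."""
    xs = [math.log(n) for n in nodes]
    ys = [math.log(e) for e in edges]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def edge_growth_when_nodes_double(a):
    """Factor by which E grows when N doubles, if E is proportional to N**a."""
    return 2.0 ** a


# Synthetic snapshots generated with exponent 1.5 (not real data).
synthetic_nodes = [100, 200, 400, 800]
synthetic_edges = [n ** 1.5 for n in synthetic_nodes]
estimate = densification_exponent(synthetic_nodes, synthetic_edges)
```

For a = 1 the edge count merely doubles when the node count doubles. Any a > 1 gives a factor 2^a > 2, and a = 2 quadruples the number of edges.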

Densification – ArXiv citation graph in Physics
Citations among physics papers
1992:
– 1,293 papers, 2,717 citations
2003:
– 29,555 papers, 352,807 citations
For each month M, create a graph of all citations up to month M
[Plot: E(t) versus N(t), exponent 1.69]

Densification – Patent Citations
Citations among patents granted
1975:
– 334,000 nodes
– 676,000 edges
1999:
– 2.9 million nodes
– 16.5 million edges
Each year is a datapoint
[Plot: E(t) versus N(t), exponent 1.66]

Densification – Internet Autonomous Systems
Graph of the Internet
1997:
– 3,000 nodes
– 10,000 edges
2000:
– 6,000 nodes
– 26,000 edges
One graph per day
[Plot: E(t) versus N(t), exponent 1.18]

Evolution of the Diameter
Prior work on power-law graphs hints at a slowly growing diameter:
– diameter ~ O(log N)
– diameter ~ O(log log N)
What is happening in real data?
The diameter shrinks over time
– As the network grows, the distances between nodes slowly decrease

Diameter – ArXiv citation graph
Citations among physics papers, 1992 to 2003
One graph per year
[Plot: diameter versus time in years]

Diameter – “Patents”
Patent citation network
25 years of data
[Plot: diameter versus time in years]

Diameter – Autonomous Systems
Graph of the Internet
One graph per day, 1997 to 2000
[Plot: diameter versus number of nodes]

Next Time: Densification – Possible Explanations
Generative models to capture the Densification Power Law and shrinking diameters
2 proposed models:
– Community Guided Attachment: obeys densification
– Forest Fire model: obeys densification and shrinking diameter (and a power-law degree distribution)