1 Realistic Graph Generation and Evolution Using Kronecker Multiplication Jurij Leskovec, CMU Deepay Chakrabarti, CMU/Yahoo Jon Kleinberg, Cornell Christos.

Slides:



Advertisements
Similar presentations
Numbers Treasure Hunt Following each question, click on the answer. If correct, the next page will load with a graphic first – these can be used to check.
Advertisements

1 A B C
Simplifications of Context-Free Grammars
Variations of the Turing Machine
Introduction to Algorithms
Angstrom Care 培苗社 Quadratic Equation II
AP STUDY SESSION 2.
1
David Burdett May 11, 2004 Package Binding for WS CDL.
Local Customization Chapter 2. Local Customization 2-2 Objectives Customization Considerations Types of Data Elements Location for Locally Defined Data.
Create an Application Title 1Y - Youth Chapter 5.
CALENDAR.
1 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt BlendsDigraphsShort.
Chapter 7 Sampling and Sampling Distributions
1 Click here to End Presentation Software: Installation and Updates Internet Download CD release NACIS Updates.
Photo Slideshow Instructions (delete before presenting or this page will show when slideshow loops) 1.Set PowerPoint to work in Outline. View/Normal click.
Biostatistics Unit 5 Samples Needs to be completed. 12/24/13.
Break Time Remaining 10:00.
The basics for simulations
EE, NCKU Tien-Hao Chang (Darby Chang)
Turing Machines.
Table 12.1: Cash Flows to a Cash and Carry Trading Strategy.
PP Test Review Sections 6-1 to 6-6
1 Generating Network Topologies That Obey Power LawsPalmer/Steffan Carnegie Mellon Generating Network Topologies That Obey Power Laws Christopher R. Palmer.
RTM: Laws and a Recursive Generator for Weighted Time-Evolving Graphs Leman Akoglu, Mary McGlohon, Christos Faloutsos Carnegie Mellon University School.
1 Atomic Routing Games on Maximum Congestion Costas Busch Department of Computer Science Louisiana State University Collaborators: Rajgopal Kannan, LSU.
Outline Minimum Spanning Tree Maximal Flow Algorithm LP formulation 1.
Bellwork Do the following problem on a ½ sheet of paper and turn in.
Exarte Bezoek aan de Mediacampus Bachelor in de grafische en digitale media April 2014.
1 Dynamics of Real-world Networks Jure Leskovec Machine Learning Department Carnegie Mellon University
Jurij Leskovec, CMU Jon Kleinberg, Cornell Christos Faloutsos, CMU
Copyright © 2012, Elsevier Inc. All rights Reserved. 1 Chapter 7 Modeling Structure with Blocks.
Chapter 1: Expressions, Equations, & Inequalities
Adding Up In Chunks.
MaK_Full ahead loaded 1 Alarm Page Directory (F11)
1 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt Synthetic.
Artificial Intelligence
1 Using Bayesian Network for combining classifiers Leonardo Nogueira Matos Departamento de Computação Universidade Federal de Sergipe.
Subtraction: Adding UP
: 3 00.
5 minutes.
1 Non Deterministic Automata. 2 Alphabet = Nondeterministic Finite Accepter (NFA)
1 hi at no doifpi me be go we of at be do go hi if me no of pi we Inorder Traversal Inorder traversal. n Visit the left subtree. n Visit the node. n Visit.
1 Let’s Recapitulate. 2 Regular Languages DFAs NFAs Regular Expressions Regular Grammars.
Essential Cell Biology
Converting a Fraction to %
Numerical Analysis 1 EE, NCKU Tien-Hao Chang (Darby Chang)
Clock will move after 1 minute
Numerical Analysis 1 EE, NCKU Tien-Hao Chang (Darby Chang)
Physics for Scientists & Engineers, 3rd Edition
Select a time to count down from the clock above
16. Mean Square Estimation
Murach’s OS/390 and z/OS JCLChapter 16, Slide 1 © 2002, Mike Murach & Associates, Inc.
Copyright Tim Morris/St Stephen's School
1.step PMIT start + initial project data input Concept Concept.
9. Two Functions of Two Random Variables
1 Dr. Scott Schaefer Least Squares Curves, Rational Representations, Splines and Continuity.
1 Decidability continued…. 2 Theorem: For a recursively enumerable language it is undecidable to determine whether is finite Proof: We will reduce the.
Lecture 21 Network evolution Slides are modified from Jurij Leskovec, Jon Kleinberg and Christos Faloutsos.
Kronecker Graphs: An Approach to Modeling Networks Jure Leskovec, Deepayan Chakrabarti, Jon Kleinberg, Christos Faloutsos, Zoubin Ghahramani Presented.
CS728 Lecture 5 Generative Graph Models and the Web.
Modeling Real Graphs using Kronecker Multiplication
CS Lecture 6 Generative Graph Models Part II.
On-line Social Networks - Anthony Bonato 1 Dynamic Models of On-Line Social Networks Anthony Bonato Ryerson University WAW’2009 February 13, 2009 nt.
RTM: Laws and a Recursive Generator for Weighted Time-Evolving Graphs Leman Akoglu, Mary McGlohon, Christos Faloutsos Carnegie Mellon University School.
Modeling networks using Kronecker multiplication
Lecture 13 Network evolution
Graph and Tensor Mining for fun and profit
Lecture 21 Network evolution
Presentation transcript:

1 Realistic Graph Generation and Evolution Using Kronecker Multiplication Jurij Leskovec, CMU Deepay Chakrabarti, CMU/Yahoo Jon Kleinberg, Cornell Christos Faloutsos, CMU

School of Computer Science Carnegie Mellon 2 Graphs are everywhere What can we do with graphs? What patterns or “laws” hold for most real-world graphs? Can we build models of graph generation and evolution? “Needle exchange” networks of drug users Introduction

School of Computer Science Carnegie Mellon 3 Outline Introduction Static graph patterns Temporal graph patterns Proposed graph generation model Kronecker Graphs Properties of Kronecker Graphs Stochastic Kronecker Graphs Experiments Observations and Conclusion

School of Computer Science Carnegie Mellon 4 Outline Introduction Static graph patterns Temporal graph patterns Proposed graph generation model Kronecker Graphs Properties of Kronecker Graphs Stochastic Kronecker Graphs Experiments Observations and Conclusion

School of Computer Science Carnegie Mellon 5 Static Graph Patterns (1) Power Law degree distributions log(Degree) Many low- degree nodes Few high- degree nodes Internet in December 1998 Y=a*X b log(Count)

School of Computer Science Carnegie Mellon 6 Static Graph Patterns (2) Small-world [Watts, Strogatz]++ 6 degrees of separation Small diameter Effective diameter: Distance at which 90% of pairs of nodes are reachable Hops # Reachable pairs Effective Diameter Epinions who-trusts- whom social network

School of Computer Science Carnegie Mellon 7 Static Graph Patterns (3) Scree plot [Chakrabarti et al] Eigenvalues of graph adjacency matrix follow a power law Network values (components of principal eigenvector) also follow a power-law Rank Eigenvalue Scree Plot

School of Computer Science Carnegie Mellon 8 Outline Introduction Static graph patterns Temporal graph patterns Proposed graph generation model Kronecker Graphs Properties of Kronecker Graphs Stochastic Kronecker Graphs Observations and Conclusion

School of Computer Science Carnegie Mellon 9 Temporal Graph Patterns Conventional Wisdom: Constant average degree: the number of edges grows linearly with the number of nodes Slowly growing diameter: as the network grows the distances between nodes grow Recently found [Leskovec, Kleinberg and Faloutsos, 2005]: Densification Power Law: networks are becoming denser over time Shrinking Diameter: diameter is decreasing as the network grows

School of Computer Science Carnegie Mellon 10 Temporal Patterns – Densification Densification Power Law N(t) … nodes at time t E(t) … edges at time t Suppose that N(t+1) = 2 * N(t) Q: what is your guess for E(t+1) =? 2 * E(t) A: over-doubled! But obeying the Densification Power Law N(t) E(t) 1.69 Densification Power Law

School of Computer Science Carnegie Mellon 11 Temporal Patterns – Densification Densification Power Law networks are becoming denser over time the number of edges grows faster than the number of nodes – average degree is increasing Densification exponent a: 1 ≤ a ≤ 2: a=1: linear growth – constant out-degree (assumed in the literature so far) a=2: quadratic growth – clique

School of Computer Science Carnegie Mellon 12 Temporal Patterns – Diameter Prior work on Power Law graphs hints at Slowly growing diameter: diameter ~ O(log N) diameter ~ O(log log N) Diameter Shrinks/Stabilizes over time As the network grows the distances between nodes slowly decrease time [years] diameter Diameter over time

School of Computer Science Carnegie Mellon 13 Patterns hold in many graphs All these patterns can be observed in many real life graphs: World wide web [Barabasi] On-line communities [Holme, Edling, Liljeros] Who call whom telephone networks [Cortes] Autonomous systems [Faloutsos, Faloutsos, Faloutsos] Internet backbone – routers [Faloutsos, Faloutsos, Faloutsos] Movie – actors [Barabasi] Science citations [Leskovec, Kleinberg, Faloutsos] Co-authorship [Leskovec, Kleinberg, Faloutsos] Sexual relationships [Liljeros] Click-streams [Chakrabarti]

School of Computer Science Carnegie Mellon 14 Problem Definition Given a growing graph with nodes N 1, N 2, … Generate a realistic sequence of graphs that will obey all the patterns Static Patterns Power Law Degree Distribution Small Diameter Power Law eigenvalue and eigenvector distribution Dynamic Patterns Growth Power Law Shrinking/Constant Diameters And ideally we would like to prove them

School of Computer Science Carnegie Mellon 15 Graph Generators Lots of work Random graph [Erdos and Renyi, 60s] Preferential Attachment [Albert and Barabasi, 1999] Copying model [Kleinberg, Kumar, Raghavan, Rajagopalan and Tomkins, 1999] Community Guided Attachment and Forest Fire Model [Leskovec, Kleinberg and Faloutsos, 2005] Also work on Web graph and virus propagation [Ganesh et al, Satorras and Vespignani]++ But all of these Do not obey all the patterns Or we are not able prove them

School of Computer Science Carnegie Mellon 16 Why is all this important? Simulations of new algorithms where real graphs are impossible to collect Predictions – predicting future from the past Graph sampling – many real world graphs are too large to deal with What if scenarios

School of Computer Science Carnegie Mellon 17 Outline Introduction Static graph patterns Temporal graph patterns Proposed graph generation model Kronecker Graphs Properties of Kronecker Graphs Stochastic Kronecker Graphs Observations and Conclusion

School of Computer Science Carnegie Mellon 18 Problem Definition Given a growing graph with count of nodes N 1, N 2, … Generate a realistic sequence of graphs that will obey all the patterns Idea: Self-similarity Leads to power laws Communities within communities …

School of Computer Science Carnegie Mellon 19 There are many obvious (but wrong) ways Does not obey Densification Power Law Has increasing diameter Kronecker Product is exactly what we need Recursive Graph Generation There are many obvious (but wrong) ways Initial graph Recursive expansion

School of Computer Science Carnegie Mellon 20 Adjacency matrix Kronecker Product – a Graph Intermediate stage Adjacency matrix

School of Computer Science Carnegie Mellon 21 Kronecker Product – a Graph Continuing multypling with G 1 we obtain G 4 and so on … G 4 adjacency matrix

School of Computer Science Carnegie Mellon 22 Kronecker Graphs – Formally: We create the self-similar graphs recursively: Start with a initiator graph G 1 on N 1 nodes and E 1 edges The recursion will then product larger graphs G 2, G 3, …G k on N 1 k nodes Since we want to obey Densification Power Law graph G k has to have E 1 k edges

School of Computer Science Carnegie Mellon 23 Kronecker Product – Definition The Kronecker product of matrices A and B is given by We define a Kronecker product of two graphs as a Kronecker product of their adjacency matrices N x MK x L N*K x M*L

School of Computer Science Carnegie Mellon 24 Kronecker Graphs We propose a growing sequence of graphs by iterating the Kronecker product Each Kronecker multiplication exponentially increases the size of the graph

School of Computer Science Carnegie Mellon 25 Kronecker Graphs – Intuition Intuition: Recursive growth of graph communities Nodes get expanded to micro communities Nodes in sub-community link among themselves and to nodes from different communities

School of Computer Science Carnegie Mellon 26 Outline Introduction Static graph patterns Temporal graph patterns Proposed graph generation model Kronecker Graphs Properties of Kronecker Graphs Stochastic Kronecker Graphs Experiments Conclusion

School of Computer Science Carnegie Mellon 27 Problem Definition Given a growing graph with nodes N 1, N 2, … Generate a realistic sequence of graphs that will obey all the patterns Static Patterns Power Law Degree Distribution Power Law eigenvalue and eigenvector distribution Small Diameter Dynamic Patterns Growth Power Law Shrinking/stabilizing Diameters

School of Computer Science Carnegie Mellon 28 Problem Definition Given a growing graph with nodes N 1, N 2, … Generate a realistic sequence of graphs that will obey all the patterns Static Patterns Power Law Degree Distribution Power Law eigenvalue and eigenvector distribution Small Diameter Dynamic Patterns Growth Power Law Shrinking/stabilizing Diameters

School of Computer Science Carnegie Mellon 29 Properties of Kronecker Graphs Theorem: Kronecker Graphs have Multinomial in- and out-degree distribution (which can be made to behave like a Power Law) Proof: Let G 1 have degrees d 1, d 2, …, d N Kronecker multiplication with a node of degree d gives degrees d∙d 1, d∙d 2, …, d∙d N After Kronecker powering G k has multinomial degree distribution

School of Computer Science Carnegie Mellon 30 Eigen-value/-vector Distribution Theorem: The Kronecker Graph has multinomial distribution of its eigenvalues Theorem: The components of each eigenvector in Kronecker Graph follow a multinomial distribution Proof: Trivial by properties of Kronecker multiplication

School of Computer Science Carnegie Mellon 31 Problem Definition Given a growing graph with nodes N 1, N 2, … Generate a realistic sequence of graphs that will obey all the patterns Static Patterns Power Law Degree Distribution Power Law eigenvalue and eigenvector distribution Small Diameter Dynamic Patterns Growth Power Law Shrinking/Stabilizing Diameters

School of Computer Science Carnegie Mellon 32 Problem Definition Given a growing graph with nodes N 1, N 2, … Generate a realistic sequence of graphs that will obey all the patterns Static Patterns Power Law Degree Distribution Power Law eigenvalue and eigenvector distribution Small Diameter Dynamic Patterns Growth Power Law Shrinking/Stabilizing Diameters

School of Computer Science Carnegie Mellon 33 Temporal Patterns: Densification Theorem: Kronecker graphs follow a Densification Power Law with densification exponent Proof: If G 1 has N 1 nodes and E 1 edges then G k has N k = N 1 k nodes and E k = E 1 k edges And then E k = N k a Which is a Densification Power Law

School of Computer Science Carnegie Mellon 34 Constant Diameter Theorem: If G 1 has diameter d then graph G k also has diameter d Theorem: If G 1 has diameter d then q-effective diameter if G k converges to d q-effective diameter is distance at which q% of the pairs of nodes are reachable

School of Computer Science Carnegie Mellon 35 Constant Diameter – Proof Sketch Observation: Edges in Kronecker graphs: where X are appropriate nodes Example:

School of Computer Science Carnegie Mellon 36 Problem Definition Given a growing graph with nodes N 1, N 2, … Generate a realistic sequence of graphs that will obey all the patterns Static Patterns Power Law Degree Distribution Power Law eigenvalue and eigenvector distribution Small Diameter Dynamic Patterns Growth Power Law Shrinking/Stabilizing Diameters First and the only generator for which we can prove all the properties

School of Computer Science Carnegie Mellon 37 Outline Introduction Static graph patterns Temporal graph patterns Proposed graph generation model Kronecker Graphs Properties of Kronecker Graphs Stochastic Kronecker Graphs Experiments Observations and Conclusion

School of Computer Science Carnegie Mellon 38 Kronecker Graphs Kronecker Graphs have all desired properties But they produce “staircase effects” We introduce a probabilistic version Stochastic Kronecker Graphs Degree Rank Count Eigenvalue

School of Computer Science Carnegie Mellon 39 How to randomize a graph? We want a randomized version of Kronecker Graphs Obvious solution Randomly add/remove some edges Wrong! – is not biased adding random edges destroys degree distribution, diameter, … Want add/delete edges in a biased way How to randomize properly and maintain all the properties?

School of Computer Science Carnegie Mellon 40 Stochastic Kronecker Graphs Create N 1  N 1 probability matrix P 1 Compute the k th Kronecker power P k For each entry p uv of P k include an edge (u,v) with probability p uv P1P1 Instance Matrix G PkPk flip biased coins Kronecker multiplication

School of Computer Science Carnegie Mellon 41 Outline Introduction Static graph patterns Temporal graph patterns Proposed graph generation model Kronecker Graphs Properties of Kronecker Graphs Stochastic Kronecker Graphs Experiments Conclusion

School of Computer Science Carnegie Mellon 42 Experiments How well can we match real graphs? Arxiv: physics citations: 30,000 papers, 350,000 citations 10 years of data U.S. Patent citation network 4 million patents, 16 million citations 37 years of data Autonomous systems – graph of internet Single snapshot from January ,400 nodes, 26,000 edges We show both static and temporal patterns

School of Computer Science Carnegie Mellon 43 Arxiv – Degree Distribution Count Degree Real graph Deterministic Kronecker Stochastic Kronecker

School of Computer Science Carnegie Mellon 44 Arxiv – Scree Plot Rank Eigenvalue Real graph Deterministic Kronecker Stochastic Kronecker

School of Computer Science Carnegie Mellon 45 Arxiv – Densification Nodes(t) Edges Real graph Deterministic Kronecker Stochastic Kronecker

School of Computer Science Carnegie Mellon 46 Arxiv – Effective Diameter Nodes(t) Diameter Real graph Deterministic Kronecker Stochastic Kronecker

School of Computer Science Carnegie Mellon 47 Arxiv citation network

School of Computer Science Carnegie Mellon 48 U.S. Patent citations Static patternsTemporal patterns

School of Computer Science Carnegie Mellon 49 Autonomous Systems Static patterns

School of Computer Science Carnegie Mellon 50 How to choose initiator G 1 ? Open problem Kronecker division/root Work in progress We used heuristics We restricted the space of all parameters Details are in the paper

School of Computer Science Carnegie Mellon 51 Outline Introduction Static graph patterns Temporal graph patterns Proposed graph generation model Kronecker Graphs Properties of Kronecker Graphs Stochastic Kronecker Graphs Experiments Observations and Conclusion

School of Computer Science Carnegie Mellon 52 Observations Generality Stochastic Kronecker Graphs include Erdos-Renyi model and RMAT graph generator as a special case Phase transitions Similarly to Erdos-Renyi model Kronecker graphs exhibit phase transitions in the size of giant component and the diameter We think additional properties will be easy to prove (clustering coefficient, number of triangles, …)

School of Computer Science Carnegie Mellon 53 Conclusion (1) We propose a family of Kronecker Graph generators We use the Kronecker Product We introduce a randomized version Stochastic Kronecker Graphs

School of Computer Science Carnegie Mellon 54 Conclusion (2) The resulting graphs have All the static properties Heavy tailed degree distributions Small diameter Multinomial eigenvalues and eigenvectors All the temporal properties Densification Power Law Shrinking/Stabilizing Diameters We can formally prove these results

School of Computer Science Carnegie Mellon 55 Thank you! Questions?

School of Computer Science Carnegie Mellon 56 Stochastic Kronecker Graphs We define Stochastic Kronecker Graphs Start with N 1  N 1 probability matrix P 1 where p ij denotes probability that edge (i,j) is present Compute the k th Kronecker power P k For each entry p uv of P k we include an edge (u,v) with probability p uv