Jurij Leskovec, CMU Jon Kleinberg, Cornell Christos Faloutsos, CMU

Slides:



Advertisements
Similar presentations
Numbers Treasure Hunt Following each question, click on the answer. If correct, the next page will load with a graphic first – these can be used to check.
Advertisements

1 A B C
Simplifications of Context-Free Grammars
Angstrom Care 培苗社 Quadratic Equation II
AP STUDY SESSION 2.
1
Copyright © 2003 Pearson Education, Inc. Slide 1 Computer Systems Organization & Architecture Chapters 8-12 John D. Carpinelli.
Properties Use, share, or modify this drill on mathematic properties. There is too much material for a single class, so you’ll have to select for your.
David Burdett May 11, 2004 Package Binding for WS CDL.
Introduction to Algorithms 6.046J/18.401J
1 RA I Sub-Regional Training Seminar on CLIMAT&CLIMAT TEMP Reporting Casablanca, Morocco, 20 – 22 December 2005 Status of observing programmes in RA I.
Local Customization Chapter 2. Local Customization 2-2 Objectives Customization Considerations Types of Data Elements Location for Locally Defined Data.
Process a Customer Chapter 2. Process a Customer 2-2 Objectives Understand what defines a Customer Learn how to check for an existing Customer Learn how.
Custom Statutory Programs Chapter 3. Customary Statutory Programs and Titles 3-2 Objectives Add Local Statutory Programs Create Customer Application For.
CALENDAR.
1 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt BlendsDigraphsShort.
Chapter 7 Sampling and Sampling Distributions
1 Click here to End Presentation Software: Installation and Updates Internet Download CD release NACIS Updates.
A Fractional Order (Proportional and Derivative) Motion Controller Design for A Class of Second-order Systems Center for Self-Organizing Intelligent.
Break Time Remaining 10:00.
The basics for simulations
Turing Machines.
Table 12.1: Cash Flows to a Cash and Carry Trading Strategy.
PP Test Review Sections 6-1 to 6-6
Multicore Programming Skip list Tutorial 10 CS Spring 2010.
1 Generating Network Topologies That Obey Power LawsPalmer/Steffan Carnegie Mellon Generating Network Topologies That Obey Power Laws Christopher R. Palmer.
RTM: Laws and a Recursive Generator for Weighted Time-Evolving Graphs Leman Akoglu, Mary McGlohon, Christos Faloutsos Carnegie Mellon University School.
Briana B. Morrison Adapted from William Collins
EIS Bridge Tool and Staging Tables September 1, 2009 Instructor: Way Poteat Slide: 1.
Outline Minimum Spanning Tree Maximal Flow Algorithm LP formulation 1.
Bellwork Do the following problem on a ½ sheet of paper and turn in.
Exarte Bezoek aan de Mediacampus Bachelor in de grafische en digitale media April 2014.
Scale Free Networks.
1 Dynamics of Real-world Networks Jure Leskovec Machine Learning Department Carnegie Mellon University
1 Realistic Graph Generation and Evolution Using Kronecker Multiplication Jurij Leskovec, CMU Deepay Chakrabarti, CMU/Yahoo Jon Kleinberg, Cornell Christos.
Copyright © 2012, Elsevier Inc. All rights Reserved. 1 Chapter 7 Modeling Structure with Blocks.
1 RA III - Regional Training Seminar on CLIMAT&CLIMAT TEMP Reporting Buenos Aires, Argentina, 25 – 27 October 2006 Status of observing programmes in RA.
Adding Up In Chunks.
MaK_Full ahead loaded 1 Alarm Page Directory (F11)
1 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt 10 pt 15 pt 20 pt 25 pt 5 pt Synthetic.
Artificial Intelligence
1 Using Bayesian Network for combining classifiers Leonardo Nogueira Matos Departamento de Computação Universidade Federal de Sergipe.
Before Between After.
Subtraction: Adding UP
: 3 00.
1 hi at no doifpi me be go we of at be do go hi if me no of pi we Inorder Traversal Inorder traversal. n Visit the left subtree. n Visit the node. n Visit.
1 Let’s Recapitulate. 2 Regular Languages DFAs NFAs Regular Expressions Regular Grammars.
Speak Up for Safety Dr. Susan Strauss Harassment & Bullying Consultant November 9, 2012.
Essential Cell Biology
Converting a Fraction to %
Clock will move after 1 minute
PSSA Preparation.
Physics for Scientists & Engineers, 3rd Edition
Energy Generation in Mitochondria and Chlorplasts
Select a time to count down from the clock above
Copyright Tim Morris/St Stephen's School
1.step PMIT start + initial project data input Concept Concept.
1 Decidability continued…. 2 Theorem: For a recursively enumerable language it is undecidable to determine whether is finite Proof: We will reduce the.
Lecture 21 Network evolution Slides are modified from Jurij Leskovec, Jon Kleinberg and Christos Faloutsos.
What did we see in the last lecture?. What are we going to talk about today? Generative models for graphs with power-law degree distribution Generative.
SILVIO LATTANZI, D. SIVAKUMAR Affiliation Networks Presented By: Aditi Bhatnagar Under the guidance of: Augustin Chaintreau.
CS728 Lecture 5 Generative Graph Models and the Web.
CS Lecture 6 Generative Graph Models Part II.
Graphs over time: densification laws, shrinking diameters and possible explanations 1.
RTM: Laws and a Recursive Generator for Weighted Time-Evolving Graphs Leman Akoglu, Mary McGlohon, Christos Faloutsos Carnegie Mellon University School.
Graph Models Class Algorithmic Methods of Data Mining
Lecture 13 Network evolution
Lecture 21 Network evolution
What did we see in the last lecture?
Presentation transcript:

Jurij Leskovec, CMU Jon Kleinberg, Cornell Christos Faloutsos, CMU Graphs over Time Densification Laws, Shrinking Diameters and Possible Explanations Jurij Leskovec, CMU Jon Kleinberg, Cornell Christos Faloutsos, CMU

“Needle exchange” networks of drug users Introduction What can we do with graphs? What patterns or “laws” hold for most real-world graphs? How do the graphs evolve over time? Can we generate synthetic but “realistic” graphs? “Needle exchange” networks of drug users

Evolution of the Graphs How do graphs evolve over time? Conventional Wisdom: Constant average degree: the number of edges grows linearly with the number of nodes Slowly growing diameter: as the network grows the distances between nodes grow Our findings: Densification Power Law: networks are becoming denser over time Shrinking Diameter: diameter is decreasing as the network grows

Outline Introduction General patterns and generators Graph evolution – Observations Densification Power Law Shrinking Diameters Proposed explanation Community Guided Attachment Proposed graph generation model Forest Fire Model Conclusion

Outline Introduction General patterns and generators Graph evolution – Observations Densification Power Law Shrinking Diameters Proposed explanation Community Guided Attachment Proposed graph generation model Forest Fire Model Conclusion

Graph Patterns Power Law Y=a*Xb log(Count) vs. log(Degree) Many low-degree nodes Few high-degree nodes log(Count) vs. log(Degree) Internet in December 1998 Y=a*Xb

Graph Patterns Small-world [Watts and Strogatz],++: 6 degrees of separation Small diameter (Community structure, …) # reachable pairs Effective Diameter hops

Graph models: Random Graphs How can we generate a realistic graph? given the number of nodes N and edges E Random graph [Erdos & Renyi, 60s]: Pick 2 nodes at random and link them Does not obey Power laws No community structure

Graph models: Preferential attachment Preferential attachment [Albert & Barabasi, 99]: Add a new node, create M out-links Probability of linking a node is proportional to its degree Examples: Citations: new citations of a paper are proportional to the number it already has Rich get richer phenomena Explains power-law degree distributions But, all nodes have equal (constant) out-degree

Graph models: Copying model Copying model [Kleinberg, Kumar, Raghavan, Rajagopalan and Tomkins, 99]: Add a node and choose the number of edges to add Choose a random vertex and “copy” its links (neighbors) Generates power-law degree distributions Generates communities

Other Related Work Huberman and Adamic, 1999: Growth dynamics of the world wide web Kumar, Raghavan, Rajagopalan, Sivakumar and Tomkins, 1999: Stochastic models for the web graph Watts, Dodds, Newman, 2002: Identity and search in social networks Medina, Lakhina, Matta, and Byers, 2001: BRITE: An Approach to Universal Topology Generation …

Why is all this important? Gives insight into the graph formation process: Anomaly detection – abnormal behavior, evolution Predictions – predicting future from the past Simulations of new algorithms Graph sampling – many real world graphs are too large to deal with

Outline Introduction General patterns and generators Graph evolution – Observations Densification Power Law Shrinking Diameters Proposed explanation Community Guided Attachment Proposed graph generation model Forest Fire Model Conclusion

Temporal Evolution of the Graphs N(t) … nodes at time t E(t) … edges at time t Suppose that N(t+1) = 2 * N(t) Q: what is your guess for E(t+1) =? 2 * E(t) A: over-doubled! But obeying the Densification Power Law

Temporal Evolution of the Graphs Densification Power Law networks are becoming denser over time the number of edges grows faster than the number of nodes – average degree is increasing a … densification exponent or equivalently

Graph Densification – A closer look Densification Power Law Densification exponent: 1 ≤ a ≤ 2: a=1: linear growth – constant out-degree (assumed in the literature so far) a=2: quadratic growth – clique Let’s see the real graphs!

Densification – Physics Citations Citations among physics papers 1992: 1,293 papers, 2,717 citations 2003: 29,555 papers, 352,807 citations For each month M, create a graph of all citations up to month M E(t) 1.69 N(t)

Densification – Patent Citations Citations among patents granted 1975 334,000 nodes 676,000 edges 1999 2.9 million nodes 16.5 million edges Each year is a datapoint E(t) 1.66 N(t)

Densification – Autonomous Systems Graph of Internet 1997 3,000 nodes 10,000 edges 2000 6,000 nodes 26,000 edges One graph per day E(t) 1.18 N(t)

Densification – Affiliation Network Authors linked to their publications 1992 318 nodes 272 edges 2002 60,000 nodes 20,000 authors 38,000 papers 133,000 edges E(t) 1.15 N(t)

Graph Densification – Summary The traditional constant out-degree assumption does not hold Instead: the number of edges grows faster than the number of nodes – average degree is increasing

Outline Introduction General patterns and generators Graph evolution – Observations Densification Power Law Shrinking Diameters Proposed explanation Community Guided Attachment Proposed graph generation model Forest Fire Model Conclusion

Evolution of the Diameter Prior work on Power Law graphs hints at Slowly growing diameter: diameter ~ O(log N) diameter ~ O(log log N) What is happening in real data? Diameter shrinks over time As the network grows the distances between nodes slowly decrease Diameter first, DPL second Check diameter formulas As the network grows the distances between nodes slowly grow

Diameter – ArXiv citation graph Citations among physics papers 1992 –2003 One graph per year time [years]

Diameter – “Autonomous Systems” Graph of Internet One graph per day 1997 – 2000 number of nodes

Diameter – “Affiliation Network” Graph of collaborations in physics – authors linked to papers 10 years of data time [years]

Diameter – “Patents” diameter Patent citation network 25 years of data time [years]

Validating Diameter Conclusions There are several factors that could influence the Shrinking diameter Effective Diameter: Distance at which 90% of pairs of nodes is reachable Problem of “Missing past” How do we handle the citations outside the dataset? Disconnected components None of them matters

Outline Introduction General patterns and generators Graph evolution – Observations Densification Power Law Shrinking Diameters Proposed explanation Community Guided Attachment Proposed graph generation model Forest Fire Mode Conclusion

Densification – Possible Explanation Existing graph generation models do not capture the Densification Power Law and Shrinking diameters Can we find a simple model of local behavior, which naturally leads to observed phenomena? Yes! We present 2 models: Community Guided Attachment – obeys Densification Forest Fire model – obeys Densification, Shrinking diameter (and Power Law degree distribution)

Self-similar university Community structure Let’s assume the community structure One expects many within-group friendships and fewer cross-group ones How hard is it to cross communities? University Science Arts CS Math Drama Music Animation: first faces, then comminities, then edges Self-similar university community structure

Fundamental Assumption If the cross-community linking probability of nodes at tree-distance h is scale-free We propose cross-community linking probability: where: c ≥ 1 … the Difficulty constant h … tree-distance Animation with communities: f(1), f(2), … f(h)

Densification Power Law (1) Theorem: The Community Guided Attachment leads to Densification Power Law with exponent a … densification exponent b … community structure branching factor c … difficulty constant Animation for a,b,c

Difficulty Constant Theorem: Gives any non-integer Densification exponent If c = 1: easy to cross communities Then: a=2, quadratic growth of edges – near clique If c = b: hard to cross communities Then: a=1, linear growth of edges – constant out-degree

Room for Improvement Community Guided Attachment explains Densification Power Law Issues: Requires explicit Community structure Does not obey Shrinking Diameters

Outline Introduction General patterns and generators Graph evolution – Observations Densification Power Law Shrinking Diameters Proposed explanation Community Guided Attachment Proposed graph generation model “Forest Fire” Model Conclusion

“Forest Fire” model – Wish List Want no explicit Community structure Shrinking diameters and: “Rich get richer” attachment process, to get heavy-tailed in-degrees “Copying” model, to lead to communities Community Guided Attachment, to produce Densification Power Law Animation with people connected New person Add hierarchy Connect Remove hierarchy

“Forest Fire” model – Intuition (1) How do authors identify references? Find first paper and cite it Follow a few citations, make citations Continue recursively From time to time use bibliographic tools (e.g. CiteSeer) and chase back-links

“Forest Fire” model – Intuition (2) How do people make friends in a new environment? Find first a person and make friends Follow a of his friends Continue recursively From time to time get introduced to his friends Forest Fire model imitates exactly this process

“Forest Fire” – the Model A node arrives Randomly chooses an “ambassador” Starts burning nodes (with probability p) and adds links to burned nodes “Fire” spreads recursively End the red line

Forest Fire in Action (1) Forest Fire generates graphs that Densify and have Shrinking Diameter E(t) densification diameter 1.21 diameter N(t) N(t)

Forest Fire in Action (2) Forest Fire also generates graphs with heavy-tailed degree distribution in-degree out-degree Label the axis count vs. in-degree count vs. out-degree

Forest Fire model – Justification Densification Power Law: Similar to Community Guided Attachment The probability of linking decays exponentially with the distance – Densification Power Law Power law out-degrees: From time to time we get large fires Power law in-degrees: The fire is more likely to burn hubs

Forest Fire model – Justification Communities: Newcomer copies neighbors’ links Shrinking diameter

Conclusion (1) We study evolution of graphs over time We discover: Densification Power Law Shrinking Diameters Propose explanation: Community Guided Attachment leads to Densification Power Law

Conclusion (2) Proposed Forest Fire Model uses only 2 parameters to generate realistic graphs: Heavy-tailed in- and out-degrees Densification Power Law Shrinking diameter   

Thank you! Questions? jure@cs.cmu.edu

Dynamic Community Guided Attachment The community tree grows At each iteration a new level of nodes gets added New nodes create links among themselves as well as to the existing nodes in the hierarchy Based on the value of parameter c we get: Densification with heavy-tailed in-degrees Constant average degree and heavy-tailed in-degrees Constant in- and out-degrees But: Community Guided Attachment still does not obey the shrinking diameter property

Densification Power Law (1) Theorem: Community Guided Attachment random graph model, the expected out-degree of a node is proportional to

Forest Fire – the Model 2 parameters: Nodes arrive one at a time p … forward burning probability r … backward burning ratio Nodes arrive one at a time New node v attaches to a random node – the ambassador Then v begins burning ambassador’s neighbors: Burn X links, where X is binomially distributed Choose in-links with probability r times less than out-links Fire spreads recursively Node v attaches to all nodes that got burned

Forest Fire – Phase plots Exploring the Forest Fire parameter space Dense graph Increasing diameter Sparse graph Shrinking diameter

Forest Fire – Extensions Orphans: isolated nodes that eventually get connected into the network Example: citation networks Orphans can be created in two ways: start the Forest Fire model with a group of nodes new node can create no links Diameter decreases even faster Multiple ambassadors: Example: following paper citations from different fields Faster decrease of diameter

Densification and Shrinking Diameter Are the Densification and Shrinking Diameter two different observations of the same phenomena? No! Forest Fire can generate: (1) Sparse graphs with increasing diameter Sparse graphs with decreasing diameter (2) Dense graphs with decreasing diameter 1 2