Models of networks (synthetic networks or generative models) Prof. Ralucca Gera, Applied Mathematics Dept. Naval Postgraduate School Monterey, California rgera@nps.edu Excellence Through Knowledge
Learning Outcomes: identify network models and explain their structures; contrast networks and synthetic models; understand how to design new network models (based on the existing ones and on the collected data); distinguish methodologies used in analyzing networks.
The world around us as a network. What do social networks look like? Watch this video: https://www.youtube.com/watch?v=QUWds9gt6aE Synthetic models are used as reference/null models to compare and understand the structure of complex networks: E-R random networks (binomial degree distribution, bell-shaped around the mean), scale-free networks (power-law degree distribution), and small-world networks.
The three papers, one for each of the models: “On Random Graphs I” by Paul Erdős and Alfréd Rényi in Publicationes Mathematicae (1959), times cited: ∼3,517 (as of January 1, 2015); “Collective dynamics of ‘small-world’ networks” by Duncan Watts and Steven Strogatz in Nature (1998), times cited: ∼24,535 (as of January 1, 2015); “Emergence of scaling in random networks” by Albert-László Barabási and Réka Albert in Science (1999), times cited: ∼21,418 (as of January 1, 2015).
Why care? Epidemiology: a virus propagates much faster in scale-free networks. Vaccinating random nodes in a scale-free network is largely ineffective, but targeted vaccination is very effective. Create synthetic networks to be used as null models: what effect does the degree distribution alone have on the behavior of the system? (Answered by comparing to the configuration model.) Create networks of different sizes: networks of particular sizes and structures can be generated quickly and cheaply, whereas collecting and cleaning data takes time.
Reference network: Regular Lattice. The 1-dimensional lattice is the Harary graph $H(n,r)$, or the circulant graph $C_n(1, 2, \dots, r/2)$ for even $r$: start with an n-cycle and make each vertex adjacent to the r/2 nearest vertices to its left and the r/2 nearest vertices to its right, so that every vertex has degree r. Source: http://mathworld.wolfram.com/CirculantGraph.html
Reference network: Regular Lattice. Figure: a particular circulant graph $C_n(1, 2, \dots, r/2)$. Source: http://mathworld.wolfram.com/CirculantGraph.html
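The ring construction described above (an n-cycle in which each vertex is joined to its r/2 nearest vertices on each side) can be sketched in a few lines. This is only an illustration under the assumption that r is even and 0 < r < n; `ring_lattice` is a hypothetical helper name, not a library function.

```python
from collections import Counter


def ring_lattice(n, r):
    """Edge set of a ring where each vertex is joined to r/2 neighbors per side.

    Assumption of this sketch: r is even and 0 < r < n.
    """
    if r % 2 != 0 or not 0 < r < n:
        raise ValueError("this sketch assumes an even r with 0 < r < n")
    edges = set()
    for v in range(n):
        # Jumps 1, 2, ..., r/2 to the right; the left neighbors are covered
        # when the loop reaches the vertices on the other side.
        for jump in range(1, r // 2 + 1):
            u = (v + jump) % n
            edges.add((min(u, v), max(u, v)))
    return edges


edges = ring_lattice(10, 4)
degree = Counter(v for edge in edges for v in edge)
```

Every vertex ends up with degree r, so the lattice has n·r/2 edges.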
Reference network: Regular Lattice. The higher-dimensional lattices are generalizations of these. An example of a 2-dimensional lattice is the hexagonal lattice: graphene is a single layer of carbon atoms with a honeycomb lattice structure. Source: http://phys.org/news/2013-05-intriguing-state-previously-graphene-like-materials.html
Erdős-Rényi Random Graphs (1959)
Random graphs (Erdős-Rényi, 1959). ER model: created at random with fixed parameters. $G(n,m)$: fix $n$ (node count) and $m$ (edge count). $G(n,p)$: fix $n$ and the probability $p$ that an edge exists between any two vertices ($m$ is not fixed). The mean number of edges: $m = \binom{n}{2} p = \frac{n(n-1)p}{2}$. The average degree: $\langle k \rangle = (n-1)p$. The probability that a node has degree $k$ is binomial: $P(k) = \binom{n-1}{k} p^k (1-p)^{n-1-k}$. Constructing these graphs in Gephi requires a Gephi plug-in; NetworkX has more synthetic models and classes.
Creating G(n,m). To make a random network $G(n,m)$: take n nodes and place m unlabeled edges at random between the n vertices. Put the graph in a box, make another one and put it in the box, and another one... Pull one network at random out of the box and it will have a bell-shaped degree distribution concentrated around the mean (the classic degree distribution): almost everyone has about the same number of friends.
Creating G(n,m) – method 2. The second method is equivalent to the first. To make a random network $G(n,m)$: take n nodes, choose m distinct pairs of nodes at random, and place an edge between the two nodes of each chosen pair. The average degree: $\langle k \rangle = \frac{1}{n}\sum_i k_i = \frac{2m}{n}$, where $k_i$ is often used to denote the degree of vertex $i$ in complex networks (enumerate the vertices 1, 2, …).
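A minimal sketch of method 2, assuming the m node pairs are drawn uniformly without repetition; `gnm_edges` is a hypothetical helper name introduced for illustration. It lists all possible pairs first, which is fine for small n but not how one would do it for large graphs.

```python
import itertools
import random


def gnm_edges(n, m, seed=None):
    """Sample m distinct node pairs uniformly at random, i.e. one G(n, m) graph."""
    rng = random.Random(seed)
    all_pairs = list(itertools.combinations(range(n), 2))
    if m > len(all_pairs):
        raise ValueError("m cannot exceed n(n-1)/2")
    return rng.sample(all_pairs, m)


n, m = 20, 30
edges = gnm_edges(n, m, seed=1)

# Each edge contributes to the degree of both of its end nodes,
# so the degrees always sum to 2m and the average degree is 2m/n.
degree = [0] * n
for u, v in edges:
    degree[u] += 1
    degree[v] += 1
average_degree = sum(degree) / n
```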
Creating G(n,p). To create a random network $G(n,p)$: take n nodes and a fixed probability $p$ for the whole graph, then connect each pair of nodes independently with probability $p$. Figure: degree distribution for both $G(n,m)$ and $G(n,p)$.
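The G(n,p) recipe can be sketched directly, together with the binomial degree distribution of the E-R model. The helper names `gnp_edges` and `degree_pmf` are hypothetical, and the parameter values are arbitrary choices for illustration.

```python
import itertools
import math
import random


def gnp_edges(n, p, seed=None):
    """Keep each of the n(n-1)/2 possible edges independently with probability p."""
    rng = random.Random(seed)
    return [pair for pair in itertools.combinations(range(n), 2)
            if rng.random() < p]


def degree_pmf(n, p, k):
    """Binomial probability that a given node has degree k in G(n, p)."""
    return math.comb(n - 1, k) * p**k * (1 - p) ** (n - 1 - k)


n, p = 200, 0.1
edges = gnp_edges(n, p, seed=7)
expected_edges = n * (n - 1) * p / 2  # mean number of edges
mean_degree_from_pmf = sum(k * degree_pmf(n, p, k) for k in range(n))
```

The edge count of a single sample fluctuates around `expected_edges`, while the mean of the binomial distribution equals (n-1)p.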
Results about E-R graphs. Degree distribution: binomial. The average path length is small compared to n: $\frac{\ln n}{\ln \langle k \rangle}$, where $\langle k \rangle$ is the average degree; this is comparable to the $\ln n$ scaling of the observed networks. The clustering coefficient is small: $C = p \approx \frac{\langle k \rangle}{n}$ (the probability that two neighbors of a node are connected is equal to the probability that any two random nodes are connected). However, observed networks have high clustering.
Generating Erdős-Rényi ER(n,p). ER graphs are models of a network in which a specific set of parameters takes fixed values, but the construction of the network is otherwise random. (Figure: generating ER(n,p) in Gephi.)
Generating Erdős-Rényi ER(n,m)
Generating Erdős-Rényi random networks Reference for python: http://networkx.lanl.gov/reference/generated/networkx.generators.random_graphs.erdos_renyi_graph.html#networkx.generators.random_graphs.erdos_renyi_graph
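A short NetworkX sketch of both E-R variants. The values n = 500, p = 0.02 and the seed are arbitrary choices for illustration; `gnp_random_graph` and `gnm_random_graph` are the NetworkX generators for G(n,p) and G(n,m).

```python
import networkx as nx

n, p = 500, 0.02

# G(n, p): every pair of nodes is joined independently with probability p.
G_np = nx.gnp_random_graph(n, p, seed=42)
m = G_np.number_of_edges()

# G(n, m): the same node count with exactly m edges placed at random.
G_nm = nx.gnm_random_graph(n, m, seed=42)

mean_degree = 2 * m / n                    # should be close to (n - 1) * p
clustering = nx.average_clustering(G_np)   # should be small, on the order of p
```

Comparing `mean_degree` with (n-1)p and `clustering` with p gives a quick check of the results listed for E-R graphs.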
The Random Geometric model
Random Geometric Model. Again the connections are created at random, but based on proximity (as in ad hoc networks). Proximity is relevant: for each node $x$, the edge $x y_i$ is created with some probability if $d(x, y_i) \le r$, for a given fixed distance $r$. There is no perfect model for the world around us, not even for specific types of networks.
An example of a random geometric graph: https://www.youtube.com/watch?v=NUisb1-INIE
Creating it in Python https://networkx.github.io/documentation/networkx-1.10/reference/generated/networkx.generators.geometric.random_geometric_graph.html#networkx.generators.geometric.random_geometric_graph
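A minimal sketch using NetworkX's `random_geometric_graph`, which places nodes uniformly at random in the unit square and joins every pair at distance at most `radius` (that is, the connection probability within the radius is 1). The values 200 and 0.15 and the seed are arbitrary; `math.dist` needs Python 3.8 or later.

```python
import math

import networkx as nx

radius = 0.15
G = nx.random_geometric_graph(200, radius, seed=3)

# Each node stores its random position in the "pos" node attribute.
pos = nx.get_node_attributes(G, "pos")

# No edge should be longer than the chosen radius.
longest_edge = max(
    (math.dist(pos[u], pos[v]) for u, v in G.edges()), default=0.0
)
```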
The Molloy-Reed Configuration model (1995)
The configuration model. A random graph model created from a degree sequence of choice (which can be scale-free). Controlling more than the degree sequence may be needed in order to create realistic models.
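The MR configuration model. A random graph model created from a degree sequence of choice, for example 4, 3, 2, 2, 2, 1, 1, 1. Step 1: give each node as many stubs (half-edges) as its degree. Step 2: join pairs of stubs at random to form edges; a different random pairing gives a different outcome of step 2. (Figures: step 1 and two alternative outcomes of step 2.)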
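One common way to realize a degree sequence at random is stub matching: node i receives k_i stubs (half-edges), and the stubs are then paired uniformly at random. The sketch below assumes this procedure; `stub_matching` is a hypothetical helper name, and the result may contain loops and parallel edges.

```python
import random
from collections import Counter


def stub_matching(degrees, seed=None):
    """Pair up stubs at random; returns an edge list (a multigraph in general)."""
    if sum(degrees) % 2 != 0:
        raise ValueError("the degrees must sum to an even number")
    rng = random.Random(seed)
    # Node i appears degrees[i] times in the list of stubs.
    stubs = [node for node, k in enumerate(degrees) for _ in range(k)]
    rng.shuffle(stubs)
    # Consecutive stubs in the shuffled list form the edges.
    return list(zip(stubs[0::2], stubs[1::2]))


degrees = [4, 3, 2, 2, 2, 1, 1, 1]
edges = stub_matching(degrees, seed=7)
realized = Counter(v for edge in edges for v in edge)  # a loop counts twice
```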
Mathematical properties. Let $i$ and $j$ be two nodes. Expectation of $ij$ being an edge: each of the $k_i$ stubs of node $i$ (where $k_i$ is its degree) is joined to one of the $k_j$ stubs of node $j$ with probability approximately $\frac{k_j}{2m}$, and so $p_{ij} = \frac{k_i k_j}{2m}$ ($2m$ is used since each of the $m$ edges in G is counted from each of its two ends). Expectation of a multi-edge $ij$: given that $ij \in E(G)$, the probability that $ij$ is an edge again is $\frac{(k_i-1)(k_j-1)}{2m}$, and so the probability of both happening is $p_{ij\text{-parallel}} = \frac{k_i k_j}{2m} \cdot \frac{(k_i-1)(k_j-1)}{2m} = \frac{k_i k_j (k_i-1)(k_j-1)}{4m^2}$, which simplifies to:
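Mathematical properties (parallel edges). Average degree: $\langle k \rangle = \frac{1}{n}\sum_i k_i = \frac{2m}{n}$, and the average of the squared degrees: $\langle k^2 \rangle = \frac{k_1^2 + k_2^2 + \dots + k_n^2}{n} = \frac{1}{n}\sum_i k_i^2$. Then, summing $\frac{k_i k_j (k_i-1)(k_j-1)}{4m^2}$ over all unordered pairs of nodes, the expected number of parallel edges is $\frac{1}{2}\left[\frac{\langle k^2 \rangle - \langle k \rangle}{\langle k \rangle}\right]^2$. Source: http://tuvalu.santafe.edu/~aaronc/courses/5352/csci5352_2017_L4.pdf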
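Mathematical properties (loops). 1. Recall that the probability of an edge between nodes $i$ and $j$ is $p_{ij} = \frac{k_i k_j}{2m}$. For a loop $ii$ both ends are stubs of node $i$: once one stub of node $i$ has been used, $k_i - 1$ remain, and each pair of stubs is counted only once, so $p_{ii} = \frac{k_i(k_i-1)}{4m}$. 2. Summing over all nodes, the expected number of loops is $\sum_i p_{ii} = \frac{\langle k^2 \rangle - \langle k \rangle}{2\langle k \rangle}$. Conclusion: since the quantities in the expression in 2. above ($\langle k \rangle$ and $\langle k^2 \rangle$) are constant with respect to the size of the network, the expected number of loops and parallel edges stays constant while the number of edges grows with n, so only a small fraction of edges are loops or parallel edges. Source: http://tuvalu.santafe.edu/~aaronc/courses/5352/csci5352_2017_L4.pdf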
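As a worked example, the expected numbers of loops and parallel edges can be evaluated for the degree sequence 4, 3, 2, 2, 2, 1, 1, 1. The sketch assumes the standard large-graph approximations, namely $\frac{\langle k^2 \rangle - \langle k \rangle}{2\langle k \rangle}$ expected loops and $\frac{1}{2}\left[\frac{\langle k^2 \rangle - \langle k \rangle}{\langle k \rangle}\right]^2$ expected parallel edges; for a graph with only 8 nodes the values are rough estimates. The function name is hypothetical.

```python
def expected_loops_and_parallel_edges(degrees):
    """Large-graph estimates for the configuration model (see assumptions above)."""
    n = len(degrees)
    mean_k = sum(degrees) / n                   # <k>
    mean_k2 = sum(k * k for k in degrees) / n   # <k^2>
    ratio = (mean_k2 - mean_k) / mean_k
    return ratio / 2, ratio**2 / 2


loops, parallel = expected_loops_and_parallel_edges([4, 3, 2, 2, 2, 1, 1, 1])
```

Here $\langle k \rangle = 2$ and $\langle k^2 \rangle = 5$, so the estimates are 0.75 loops and 1.125 parallel edges.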
Generating it in Python https://networkx.github.io/documentation/networkx-1.10/reference/generated/networkx.generators.degree_seq.configuration_model.html
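A minimal NetworkX sketch for the example degree sequence. `configuration_model` returns a multigraph that may contain loops and parallel edges; converting it to a simple graph, as done below, is one common cleanup, but it no longer preserves the degree sequence exactly. The seed is an arbitrary choice.

```python
import networkx as nx

degree_sequence = [4, 3, 2, 2, 2, 1, 1, 1]

# The multigraph realizes the degree sequence exactly (a loop counts twice).
multigraph = nx.configuration_model(degree_sequence, seed=5)
realized = sorted((d for _, d in multigraph.degree()), reverse=True)

# Optional cleanup: collapse parallel edges and drop loops.
simple = nx.Graph(multigraph)
simple.remove_edges_from(list(nx.selfloop_edges(simple)))
```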
Part 2
Coding it in CoCalc. Go to www.CoCalc.com and create an account using your NPS email. Create a new folder to hold the code. Open the “MA4404-2019” folder and copy its contents to your new folder.
Copy contents to NEW folder
Make a copy. Choose “CreateSyntheticNetworks.ipynb”. Notice projects, folders & files.
Create ER networks
Watts-Strogatz Small World Graphs (1998)
Small world models. Duncan Watts and Steven Strogatz's small-world model: a few random links in an otherwise structured graph make the network a small world, that is, the average shortest path is short. Regular lattice (one type of structure): my friend’s friend is always my friend. Small world: mostly structured, with a few random connections. Random graph: all connections happen at random. Source: Watts, D.J., Strogatz, S.H. (1998) Collective dynamics of 'small-world' networks. Nature 393:440-442.
Small worlds, between order and chaos. Regular lattice (order): high clustering, 0.75, and high average path length, $\frac{n}{2}$. Random graph (chaos): low clustering, $p$ (the edge probability), and low average path length, $\frac{\ln n}{\ln \langle k \rangle}$. Small worlds sit in between: the graph on the left has order (rewiring probability p = 0), the graph in the middle is a "small world" graph (0 < p < 1), and the graph on the right is completely random (p = 1). Source: http://www.bordalierinstitute.com/target1.html
Average path length $\langle l \rangle$ and average clustering $\langle C \rangle$. Variation of the average path length and of the average clustering as a function of the rewiring probability p. https://pdfs.semanticscholar.org/8c4c/455de44fa99e73e79d6fddf008ca6ae0f9aa.pdf
Generating Watts-Strogatz WS(n, k, alpha). Alpha is the rewiring probability.
Generating Watts-Strogatz networks. In the example, 0.15 is the rewiring probability. http://networkx.lanl.gov/reference/generated/networkx.generators.random_graphs.watts_strogatz_graph.html#networkx.generators.random_graphs.watts_strogatz_graph
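A small sketch of how clustering and average path length change with the rewiring probability, using NetworkX. The values n = 200 and k = 6 and the seed are arbitrary choices; `connected_watts_strogatz_graph` is used instead of `watts_strogatz_graph` only so that the average shortest path is always defined.

```python
import networkx as nx

n, k = 200, 6
results = {}
for p in (0.0, 0.15, 1.0):
    # p = 0 is the regular ring lattice, p = 1 rewires every edge at random.
    G = nx.connected_watts_strogatz_graph(n, k, p, tries=100, seed=42)
    results[p] = (
        nx.average_clustering(G),
        nx.average_shortest_path_length(G),
    )

for p, (clustering, path_length) in results.items():
    print(f"p={p}: clustering={clustering:.3f}, average path={path_length:.2f}")
```

With a rewiring probability of 0.15 the average path length should already be much shorter than in the lattice, while the clustering stays well above that of the fully rewired graph.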
Barabási-Albert Scale free model (1999)
Network growth & resulting structure. Random attachment: a new node picks any existing node to attach to. Preferential/fitness attachment: a new node picks from the existing nodes according to their degrees/fitness (high preference for high degree/fitness). http://projects.si.umich.edu/netlearn/NetLogo4/RAndPrefAttachment.html
Scale-free networks are a type of small world. Whether static or evolutionary, they have a power-law degree distribution: $p_k = C k^{-\alpha}$, where $2 \le \alpha \le 3$. Common ways to grow the network: preferential attachment based on degree (for the Barabási-Albert type, the probability of attaching to node $u$ is $p_u = \frac{k_u}{\sum_i k_i}$, where $k_i$ is the degree of node $i$); preferential attachment based on fitness (preassigned values).
Power law networks. Many real-world networks contain hubs: highly connected nodes. Usually the distribution of edges is extremely skewed: many nodes with small degree, and a fat tail of a few nodes with a very large degree. There is no “typical” node degree. (Figure: number of nodes of a given degree versus degree, i.e. number of edges.)
But is it really a power-law? A power law will appear as a straight line on a log-log plot. Let $p_k$ be the count of vertices of degree $k$. If $p_k = C k^{-\alpha}$, then $\ln p_k = -\alpha \ln k + c$, where $c = \ln C$. A deviation from a straight line could indicate a different distribution, such as exponential or lognormal. (Figure axes: log of the number of nodes of that degree versus log of the degree.)
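The straight-line relation $\ln p_k = -\alpha \ln k + c$ suggests a quick check: fit a line to the points $(\ln k, \ln p_k)$ and read off the slope. The sketch below uses ordinary least squares, which is a crude diagnostic rather than a rigorous test for a power law; `loglog_fit` is a hypothetical helper, and the data are synthetic values generated from an exact power law with $\alpha = 2.5$.

```python
import math


def loglog_fit(ks, counts):
    """Least-squares line through (ln k, ln p_k); returns (alpha, c)."""
    xs = [math.log(k) for k in ks]
    ys = [math.log(p) for p in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, intercept  # ln p_k = -alpha * ln k + c


# Synthetic counts that follow p_k = C * k^(-alpha) exactly.
ks = list(range(1, 50))
counts = [100.0 * k ** -2.5 for k in ks]
alpha, c = loglog_fit(ks, counts)
```

On real degree data, zero counts have to be dropped before taking logarithms, and the noisy tail can bias the fitted slope.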
Fitting distributions Node (frame) and edge (inset) counts of European Airline Transportation Network's layers with distribution fitting. http://faculty.nps.edu/rgera/ANGEL.html
Fitting distributions European Airline Transportation Network's multilayer network: Degree histogram of the multiplexes with the log scale in the inset. Upper right: average shortest path, lower right: centrality coefficient, per node http://faculty.nps.edu/rgera/ANGEL.html
Scale Free networks. One example was introduced by Albert-László Barabási and Réka Albert (the BA model) as degree-based preferential attachment: start with a small set of nodes ($m_0$) and random edges; attach new nodes one at a time, each with the same fixed number $l$ of new edges, attaching to the existing nodes in the network with preference for high degrees (once the high degrees appear). https://www.youtube.com/watch?v=5YdkhWB_uYQ Network growth is measured by node count. This is not the only way to get scale-free networks!
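A from-scratch sketch of degree-based preferential attachment. Two choices here are assumptions of the sketch, not part of the model description: the seed network is a complete graph on $m_0$ nodes, and each new node picks $l$ distinct targets. The function name is hypothetical.

```python
import random
from collections import Counter


def preferential_attachment(n, l, m0=None, seed=None):
    """Grow a network to n nodes; each new node brings l edges."""
    rng = random.Random(seed)
    m0 = max(l, 2) if m0 is None else m0
    if not (2 <= m0 <= n and 1 <= l <= m0):
        raise ValueError("this sketch assumes 2 <= m0 <= n and 1 <= l <= m0")
    # Assumption: the seed network is a complete graph on m0 nodes.
    edges = [(u, v) for u in range(m0) for v in range(u + 1, m0)]
    # Every node appears in this list once per incident edge, so a uniform
    # draw from it picks node u with probability k_u / sum_i k_i.
    endpoints = [v for edge in edges for v in edge]
    for new in range(m0, n):
        targets = set()
        while len(targets) < l:
            targets.add(rng.choice(endpoints))
        for t in targets:
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges


edges = preferential_attachment(200, 2, m0=3, seed=3)
degree = Counter(v for edge in edges for v in edge)
```

Sorting `degree` by value shows the few high-degree hubs, which are usually among the earliest nodes.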
Generating Barabasi-Albert
Generating Barabasi-Albert networks http://networkx.lanl.gov/reference/generated/networkx.generators.random_graphs.barabasi_albert_graph.html#networkx.generators.random_graphs.barabasi_albert_graph
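A minimal NetworkX sketch. In `barabasi_albert_graph(n, m)` the second argument is the number of edges each new node brings (the fixed number of new edges per node in the description of the model); the values 1000 and 3 and the seed are arbitrary choices.

```python
import networkx as nx

G = nx.barabasi_albert_graph(1000, 3, seed=1)

degrees = [d for _, d in G.degree()]
mean_degree = sum(degrees) / len(degrees)
max_degree = max(degrees)  # the largest hub

# histogram[k] is the number of nodes of degree k; plotting it on
# log-log axes is the usual first look at the tail.
histogram = nx.degree_histogram(G)
```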
Modified BA. Many modifications of this model exist, based on: nodes “retiring” and losing their status or becoming outdated; nodes disappearing (such as a website going down); links appearing or disappearing between the existing nodes (called internal links); fitness of nodes (modeling newcomers like Google). Most researchers still use the standard BA model when studying new phenomena and metrics. It is a simple model (which allows consistent research) that has growth and preferential attachment. One can add more conditions to this basic model in order to mimic reality.
A zoo of complex networks
Random, Small-World, Scale-Free. Scale Free networks: high degree heterogeneity, various levels of modularity, various levels of randomness. Man made, “large world”: http://noduslabs.com/radar/types-networks-random-small-world-scale-free/
Main References: Newman, “The structure and function of complex networks” (2003); Estrada, “The structure of complex Networks” (2012); Barabási, “Network Science” (online: http://barabasi.com/networksciencebook/). References to the classes that exist in Python: http://networkx.lanl.gov/reference/generators.html
Back to coding in CoCalc