# Biological networks and statistical physics


Biological networks and statistical physics
Diego Garlaschelli (Dipartimento di Fisica, Università di Siena, ITALY; Saïd Business School, University of Oxford, UK). BioPhys09, Arcidosso, ITALY

Biological networks: from cells to ecosystems

Metabolic networks
- Vertices = cellular substrates (products or educts)
- Links = biochemical reactions (enzyme-mediated)

Figure labels: educt, enzyme, complex, product (part of E. coli's metabolic network).

Protein-protein interaction networks
- Vertices = proteins
- Links = interactions within the cell

Neural networks
- Vertices = neurons
- Links = synapses

Figure labels: single neuron; web of synaptic connections.

Vascular networks
- Vertices = tissues
- Links = blood vessels

Figure: a small vascular tree with vertices labelled 1 to 8.

Ecological networks (food webs)
- Vertices = coexisting species
- Links = predator-prey interactions

Real networks versus regular graphs
Figure panels: protein-protein interaction network (Saccharomyces cerevisiae); regular graphs.
Two problems:
1) characterization of network structure (and complexity)
2) network modelling

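Graph Theory
"Graph" ≡ $G(V,E)$, with $V$: $N$ vertices and $E$: $L$ links. A directed graph and an undirected graph each correspond to an adjacency matrix (symmetric in the undirected case).
- Adjacency matrix: $a_{ij} = 1$ if a link joins $i$ and $j$ (from $i$ to $j$ in a directed graph), $a_{ij} = 0$ otherwise
- Degree (number of links) of vertex $i$: $k_i = \sum_j a_{ij}$ (undirected case)
- Average vertex-vertex distance: $D$
- Clustering coefficient: $C$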
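The three quantities just named can be computed directly from an adjacency matrix. The sketch below is only illustrative and makes its own assumptions, since the slide's formulas are not preserved: the clustering coefficient is taken as the average over vertices of the fraction of linked neighbour pairs (vertices with fewer than two neighbours count as zero), and the average distance is taken over reachable pairs only. The function names are hypothetical.

```python
from collections import deque

def degrees(adj):
    """k_i = sum_j a_ij for an undirected 0/1 adjacency matrix."""
    return [sum(row) for row in adj]

def clustering(adj):
    """Average over vertices of (links among neighbours) / (possible links among neighbours)."""
    n = len(adj)
    total = 0.0
    for i in range(n):
        nbrs = [j for j in range(n) if adj[i][j]]
        k = len(nbrs)
        if k < 2:
            continue  # assumed convention: c_i = 0 with fewer than two neighbours
        links = sum(adj[u][v] for a, u in enumerate(nbrs) for v in nbrs[a + 1:])
        total += 2.0 * links / (k * (k - 1))
    return total / n

def mean_distance(adj):
    """Average shortest-path length over reachable pairs (breadth-first search from each vertex)."""
    n = len(adj)
    total, pairs = 0, 0
    for s in range(n):
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if adj[u][v] and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else float("inf")

# Toy graph: triangle 0-1-2 with a pendant vertex 3 attached to vertex 2
A = [[0, 1, 1, 0],
     [1, 0, 1, 0],
     [1, 1, 0, 1],
     [0, 0, 1, 0]]
print(degrees(A), clustering(A), mean_distance(A))
```

For this toy graph the degrees are 2, 2, 3, 1; the clustering coefficient is 7/12 and the average distance is 4/3.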

Small-world character of (most) real networks:
- Short mean distance $D$: "it's a small world, after all!" Efficient information transport (and fast disease spreading too!)
- Large clustering coefficient $C$: "my friends are friends of each other". High robustness under vertex removal

Degree distribution in (most) real networks:
Power-law distribution $P(k) \propto k^{-\gamma}$, $2 < \gamma < 3$
- No characteristic scale (scale-free)!
- Many poorly connected vertices
- Few highly connected vertices

Figure panels: (a) Archaeoglobus fulgidus (archaeon); (b) E. coli (bacterium); (c) Caenorhabditis elegans (eukaryote); (d) 43 different organisms together.

Finite-scale versus scale-free networks
- Finite-scale networks: $P(k)$ decays exponentially. No vertex has a degree much larger than the average value
- Scale-free networks: $P(k)$ decays as a power law. Few vertices have a degree much larger than the average value

Finite-scale versus scale-free networks
- Finite-scale networks: $P(k)$ decays exponentially
- Scale-free networks: $P(k)$ decays as a power law

In both cases $N=130$ and $L=215$ (same average degree).
Figure legend: the 5 vertices with largest degree (red); vertices connected to the red ones (random 27%, scale-free 60%); other vertices.

RANDOM GRAPH model (Erdős, Rényi 1959)
● Start with a set of N isolated vertices;
● For each pair of vertices draw a link with uniform probability p.

Figure panels: p=0, p=0.1, p=0.5, p=1.
Degree distribution P(k): Poisson
Average vertex-vertex distance:
Clustering coefficient:
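The two construction steps above can be sketched in a few lines. This is a minimal illustration with hypothetical function names; it checks the mean degree against $p(N-1)$ and prints the variance, which for a Poisson-like degree distribution should be close to the mean.

```python
import random

def random_graph(n, p, rng):
    """Edge list of a random graph: every pair (i, j) is linked independently with probability p."""
    return [(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p]

def degree_sequence(n, edges):
    """Degree of every vertex, counted from the edge list."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return deg

rng = random.Random(0)          # fixed seed so the run is reproducible
n, p = 500, 0.02
edges = random_graph(n, p, rng)
deg = degree_sequence(n, edges)
mean_k = sum(deg) / n
var_k = sum((k - mean_k) ** 2 for k in deg) / n
print(f"<k> = {mean_k:.2f} (expected {p * (n - 1):.2f}), variance = {var_k:.2f}")
```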

Connected components in random graphs
The interesting feature of the random graph model is the presence of a critical probability $p_c$ marking the appearance of a giant cluster.
Percolation threshold: $p_c \approx 1/N$
- When $p < p_c$ the network is made of many small clusters and the cluster size distribution $P(s)$ decays exponentially;
- when $p > p_c$ there are few very small clusters and one giant one;
- at $p = p_c$ the cluster size distribution has a power-law form: $P(s) \propto s^{-\alpha}$
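The appearance of the giant cluster around $p \approx 1/N$ can be checked numerically. The sketch below is an illustration under its own assumptions: it generates the links on the fly and merges clusters with a union-find structure, and the function name and the sample values of $pN$ are arbitrary choices.

```python
import random

def largest_cluster_fraction(n, p, rng):
    """Fraction of vertices in the largest connected cluster of a random graph."""
    parent = list(range(n))
    size = [1] * n

    def find(x):
        # Follow parent pointers to the cluster root, shortening the path on the way
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                ri, rj = find(i), find(j)
                if ri != rj:
                    if size[ri] < size[rj]:
                        ri, rj = rj, ri
                    parent[rj] = ri          # attach the smaller cluster to the larger one
                    size[ri] += size[rj]
    return max(size[find(i)] for i in range(n)) / n

n = 400
for c in (0.5, 1.0, 3.0):                    # c = p * n: below, at and above the threshold
    frac = largest_cluster_fraction(n, c / n, random.Random(0))
    print(f"p*N = {c}: largest cluster holds {frac:.2%} of the vertices")
```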

SMALL-WORLD model (Watts, Strogatz Nature 1998)
● Start with a regular d-dimensional lattice, connected up to q nearest neighbours;
● With probability p, an end of each link is rewired to a new randomly chosen vertex.

Figure panels: p = 0 (regular), 0 < p < 1 (small-world), p = 1 (random).
Plots: degree distribution P(k) versus k; average distance D(p)/D(0) and clustering coefficient C(p)/C(0) versus p, with the small-world regime marked.
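A minimal sketch of the rewiring procedure, assuming the simplest case d = 1 (a ring) with q even. Avoiding self-loops and duplicate links during rewiring is an implementation choice made here, not something stated on the slide, and the function name is hypothetical.

```python
import random

def small_world(n, q, p, rng):
    """Ring of n vertices, each linked to its q nearest neighbours, then rewired with probability p."""
    edges = set()
    for i in range(n):
        for step in range(1, q // 2 + 1):
            edges.add(frozenset((i, (i + step) % n)))
    # Visit every lattice link once; with probability p move its far end elsewhere
    for i in range(n):
        for step in range(1, q // 2 + 1):
            old = frozenset((i, (i + step) % n))
            if old in edges and rng.random() < p:
                candidates = [v for v in range(n)
                              if v != i and frozenset((i, v)) not in edges]
                if candidates:
                    edges.remove(old)
                    edges.add(frozenset((i, rng.choice(candidates))))
    return edges

for p in (0.0, 0.1, 1.0):
    g = small_world(100, 4, p, random.Random(0))
    print(f"p = {p}: {len(g)} links")
```

The number of links is conserved for every p, so only the wiring pattern changes between the regular and the random limit.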

SCALE-FREE model (Barabási, Albert Science 1999)
● Start with m0 vertices and no link;
● at each timestep add a new vertex with m links, connected to preexisting vertices chosen randomly with probability proportional to their degree k (preferential attachment).

After a certain number of iterations, the degree distribution approaches a power-law distribution: $P(k) \propto k^{-\gamma}$, $\gamma = 3$.
Growth and preferential attachment are both necessary!
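Growth with preferential attachment can be sketched as follows. Two assumptions are made that the slide does not state: m0 = m, and the first new vertex links to all m0 initial vertices (degrees are all zero at the start, so "proportional to degree" is not yet defined). The function name is hypothetical.

```python
import random

def scale_free(n, m, rng):
    """Grow a graph to n vertices; each new vertex brings m links to distinct older vertices."""
    edges = []
    # 'stubs' holds every vertex once per link end, so a uniform draw from it
    # selects a vertex with probability proportional to its degree.
    stubs = []
    targets = list(range(m))        # assumption: first newcomer links to all m0 = m seeds
    for new in range(m, n):
        for t in targets:
            edges.append((new, t))
            stubs.extend((new, t))
        chosen = set()
        while len(chosen) < m:      # m distinct targets for the next newcomer
            chosen.add(rng.choice(stubs))
        targets = list(chosen)
    return edges

edges = scale_free(2000, 2, random.Random(0))
deg = {}
for u, v in edges:
    deg[u] = deg.get(u, 0) + 1
    deg[v] = deg.get(v, 0) + 1
print("mean degree:", 2 * len(edges) / 2000, "largest degree:", max(deg.values()))
```

The largest degree comes out far above the mean, which is the signature of the broad degree distribution described above.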

FITNESS model (Caldarelli et al. Phys. Rev. Lett. 2002)
● Each vertex i is assigned a fitness value $x_i$ drawn from a given distribution $\rho(x)$;
● A link is drawn between each pair of vertices i and j with probability $f(x_i,x_j)$ depending on $x_i$ and $x_j$.

Power-law degree distributions are obtained by choosing
$\rho(x) \propto x^{-\alpha}$, $f(x_i,x_j) \propto x_i x_j$
or
$\rho(x) = e^{-x}$, $f(x_i,x_j) = \theta(x_i + x_j - z)$ (a threshold rule).
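The fitness model takes a fitness list and a connection probability as its only inputs. The sketch below assumes the second choice above, exponential fitnesses with a threshold rule (a link exists when $x_i + x_j \ge z$); the helper names and the value of z are illustrative.

```python
import random

def fitness_graph(x, f, rng):
    """Link each pair (i, j) with probability f(x[i], x[j])."""
    n = len(x)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if rng.random() < f(x[i], x[j])]

def threshold(z):
    """Deterministic rule: probability 1 if the two fitnesses sum to at least z, else 0."""
    return lambda a, b: 1.0 if a + b >= z else 0.0

def degree_list(n, edges):
    """Degree of every vertex, counted from the edge list."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return deg

rng = random.Random(0)
n, z = 300, 6.0
x = [rng.expovariate(1.0) for _ in range(n)]      # fitness density e^{-x}
edges = fitness_graph(x, threshold(z), rng)
deg = degree_list(n, edges)
fittest = max(range(n), key=x.__getitem__)
print("links:", len(edges), "degree of the fittest vertex:", deg[fittest])
```

With the threshold rule the degree is a non-decreasing function of the fitness, so the fittest vertex is also the best connected one.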

Exponential random graphs

Reciprocity of directed networks

Link reciprocity: the problem
Do reciprocated links (pairs of mutual links between two vertices) occur more or less often than expected by chance in a directed network?
Figure: a directed graph on 6 vertices and its N×N adjacency matrix.
Important aspect of many networks:
- Mutuality of relationships (friendship, acquaintance, etc.) in social networks
- Reversibility of biochemical reactions in cellular networks
- Symbiosis in food webs
- Synonymy in word association networks
- Economic/financial interdependence in trade/shareholding networks

Standard definition of reciprocity
Reciprocity = fraction of reciprocated links in the network: $r = L^{\leftrightarrow}/L$
- Total number of directed links: $L$
- Number of reciprocated links: $L^{\leftrightarrow}$

Labels on the slide: (… and WWW), (WTW).

A new definition of reciprocity
Conceptual problems with the standard definition:
- the standard reciprocity $r$ is not an absolute quantity: it has to be compared to the value expected in a random network with the same density
- as a consequence, networks with different density cannot be compared
- self-loops should be excluded when computing these quantities

New definition of reciprocity: the correlation coefficient $\rho$ between reciprocal links ($a_{ij}$ and $a_{ji}$), avoiding the aforementioned problems.
- $\rho > 0$: reciprocal
- $\rho = 0$: areciprocal
- $\rho < 0$: antireciprocal

D. Garlaschelli, M.I. Loffredo, Phys. Rev. Lett. 93, 268701 (2004)
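Both definitions can be compared on small examples. The formulas on the slide are not preserved, so the sketch below assumes that $r$ is the number of reciprocated links divided by the number of directed links, and that $\rho$ is the Pearson correlation between $a_{ij}$ and $a_{ji}$ over all pairs $i \neq j$; with $\bar a = L/N(N-1)$ this reduces algebraically to $\rho = (r - \bar a)/(1 - \bar a)$. The function name is hypothetical.

```python
def reciprocity(adj):
    """Return (r, a_bar, rho) for a directed 0/1 adjacency matrix, ignoring self-loops.

    Assumes the graph is neither empty nor complete (otherwise a division by zero occurs).
    """
    n = len(adj)
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    n_links = sum(adj[i][j] for i, j in pairs)
    n_recip = sum(adj[i][j] * adj[j][i] for i, j in pairs)
    r = n_recip / n_links
    a_bar = n_links / (n * (n - 1))                 # density = expected r in a random graph
    num = sum((adj[i][j] - a_bar) * (adj[j][i] - a_bar) for i, j in pairs)
    den = sum((adj[i][j] - a_bar) ** 2 for i, j in pairs)
    return r, a_bar, num / den

# Directed 3-cycle 0 -> 1 -> 2 -> 0: no link is reciprocated
cycle = [[0, 1, 0],
         [0, 0, 1],
         [1, 0, 0]]
# Two mutual pairs 0 <-> 1 and 2 <-> 3: every link is reciprocated
mutual = [[0, 1, 0, 0],
          [1, 0, 0, 0],
          [0, 0, 0, 1],
          [0, 0, 1, 0]]
print(reciprocity(cycle))
print(reciprocity(mutual))
```

The cycle gives $r = 0$ and $\rho = -1$ (antireciprocal), the mutual pairs give $r = 1$ and $\rho = 1$ (reciprocal).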

Results: reciprocity classifies real networks
Networks shown: WTW, WWW, neural, words, metabolic, financial, food webs.
D. Garlaschelli, M.I. Loffredo, Phys. Rev. Lett. 93, 268701 (2004)

Size dependence of the reciprocity
Figure panels: metabolic networks, food webs, World Trade Web.

A general model of reciprocity
We introduce a multi-species formalism where reciprocated and non-reciprocated links are regarded as two different 'chemical species', each governed by the corresponding chemical potential: 'particles' of each type are distributed among 'states'.
Decomposition of the adjacency matrix:
Graph Hamiltonian:
Garlaschelli and Loffredo, Physical Review E 73, (R) 2006

A general model of reciprocity
Grand Partition Function: Grand Potential: Occupation probabilities: Conditional connection probability:

Models of weighted networks

Structural correlations in complex networks
In order to detect patterns in networks, one needs (one or more) null model(s) as a reference. A null model is obtained by fixing some topological constraint(s), and generating a maximally random network consistent with them.
Examples of null models for unweighted networks:
- the random graph (Erdős-Rényi) model (number of links fixed),
- the configuration model (degree sequence fixed),
- etc.

Problem of structural correlations: when a low-level constraint is fixed, patterns may be generated at a higher level, even if they do not signal 'true' high-level correlations.
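One common way to build a degree-preserving null model numerically is to shuffle an observed network by repeated link swaps. The sketch below shows that technique only as an illustration; it is not claimed to be the procedure used in the talk, which treats null models analytically. The function name and the cap on attempts are arbitrary.

```python
import random

def degree_preserving_shuffle(edges, n_swaps, rng):
    """Randomize an undirected simple graph while keeping every vertex degree fixed."""
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done, attempts = 0, 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        a, b = rng.sample(range(len(edges)), 2)
        (u, v), (x, y) = edges[a], edges[b]
        if rng.random() < 0.5:
            x, y = y, x                      # pick one of the two possible re-pairings
        # Proposed swap: u-v, x-y  ->  u-y, x-v
        if len({u, v, x, y}) < 4:
            continue                         # would create a self-loop
        if frozenset((u, y)) in present or frozenset((x, v)) in present:
            continue                         # would create a duplicate link
        present -= {frozenset((u, v)), frozenset((x, y))}
        present |= {frozenset((u, y)), frozenset((x, v))}
        edges[a], edges[b] = (u, y), (x, v)
        done += 1
    return edges

ring = [(i, (i + 1) % 30) for i in range(30)]
shuffled = degree_preserving_shuffle(ring, 200, random.Random(0))
print(shuffled[:5])
```

Any pattern that survives this shuffling is explained by the degree sequence alone, which is the sense in which a low-level constraint can generate higher-level patterns.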

The (solved) problem for unweighted networks
Problem: specifying the degree sequence alone generates anticorrelations between the average nearest-neighbour degree $k^{nn}_i$ and the degree $k_i$ (disassortativity) and between the clustering coefficient $c_i$ and $k_i$ (hierarchy). (Maslov et al.)
Solution: in unweighted networks, structural correlations can be fully characterized analytically in terms of exponential random graphs. (Park & Newman)
Correct prediction:

Some null models for weighted networks
- Model 1: Global weight reshuffling (fixed topology)
- Model 2: Global weight & tie reshuffling (fixed degrees)
- Model 3: Local weighted rewiring (fixed strengths)
- Model 4: Local weighted rewiring (fixed strengths and degrees)

Is it possible to characterize these models analytically?

Exponential formulation of the four null models
- Model 1: Global weight reshuffling (fixed topology)
- Model 2: Global weight & tie reshuffling (fixed degrees)
- Model 3: Local weighted rewiring (fixed strengths)
- Model 4: Local weighted rewiring (fixed strengths and degrees)

Note: $H_1$, $H_2$, $H_3$ and $H_4$ are particular cases of:

Analytic solution of the general null model:
Solution: the probability of a link of weight w between i and j is

Models 1 and 2 (global weight reshuffling): Fermionic correlations
The expectations are confirmed, however … implies …
This means that weighted measures (except the disparity) display a satisfactory behaviour under these null models (but they inherit purely topological correlations!)

Model 3 (fixed strength): Bosonic correlations
Now all weighted measures are uninformative!

Model 4 (fixed strength+degree): mixed Bose-Fermi statistics
We still have, as in model 3:
All weighted measures are uninformative in this case too!

Particular case: the Weighted Random Graph (WRG) model
See a Mathematica demonstration of the model (by T. Squartini) at:
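A realization of the WRG can be sampled in a few lines. The link-weight formula is not preserved in this transcript, so the sketch assumes the form given in the WRG paper listed in the references: each pair of vertices independently receives an integer weight $w$ with probability $p^w(1-p)$, which means a link exists with probability $p$ and the expected weight is $p/(1-p)$. The function name is hypothetical.

```python
import random

def weighted_random_graph(n, p, rng):
    """Sample integer link weights; pairs with weight 0 are simply absent from the result."""
    weights = {}
    for i in range(n):
        for j in range(i + 1, n):
            w = 0
            while rng.random() < p:      # each success adds one unit of weight
                w += 1
            if w > 0:
                weights[(i, j)] = w
    return weights

n, p = 200, 0.5
w = weighted_random_graph(n, p, random.Random(0))
n_pairs = n * (n - 1) // 2
print("fraction of linked pairs:", len(w) / n_pairs)
print("mean weight per pair:", sum(w.values()) / n_pairs)
```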

The Weighted Random Graph (WRG) model


Largest connected component in the WRG
after weak (+) and strong (-) edge removal

Clustering coefficient in the WRG
after weak (+) and strong (-) edge removal

Food webs

Food webs
Networks of predation relationships among N biological species. A directed link from i to j means that i is eaten by j.

Peculiar (problematic?) aspects of food webs
- Plot of $P_>(k')$ versus $k' = k/\langle k \rangle$: not scale-free!
- Plot of $C/C_{random}$ versus $N$, with $C/C_{random} = 1$ marked: not small-world!
- The connectance $c = L/N^2$ (fraction of directed links out of the total possible ones) varies across different webs
- Only property similar to other networks: small distance $D$

Dunne, Williams, Martinez, Proc. Natl. Acad. Sci. USA 2002

A modest proposal: food webs as transportation networks
Resource transfer along each food chain:
- Flux of matter and energy from prey to predators, in more and more complex forms: directionality
- Species ultimately feed on the abiotic resources (light, water, chemicals): connectedness
- Almost 10% of the resources are transferred from the prey to the predator: energy dispersion

Minimum-energy subgraphs: minimum spanning trees
Minimum spanning trees can be obtained as zero-temperature ensembles, where $l_i$ is the trophic level (shortest distance to abiotic resources) of species $i$

Spanning trees and allometric scaling
Structure minimizing each species' distance from the "environment vertex".
- Trophic level $\ell$ of a species $i$: minimum distance from the environment to $i$
- Spanning tree: all links from a species at level $\ell$ to species at levels $\ell' \le \ell$ are removed
- Allometric relations: $C_i(A_i) \to C(A)$
- Power-law scaling: $C(A) \propto A^{\eta}$

Figure: a spanning tree with each vertex labelled by $\ell$, $A_i$ and $C_i$, and a plot of $C(A)$ versus $A$.
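The construction above can be sketched with a breadth-first search from the environment vertex. The definitions of $A_i$ and $C_i$ are garbled in this transcript, so the code assumes the analogue of the river-network quantities: $A_i$ is the number of vertices in the branch rooted at $i$ (including $i$) and $C_i$ is the sum of $A_j$ over that branch. When a species has several prey one level below it, the search keeps the first one found, which is one possible spanning tree among several. The function name is hypothetical.

```python
from collections import deque

def allometric(prey_to_pred, root=0):
    """prey_to_pred[i] lists the vertices feeding on i; root is the environment vertex."""
    level = {root: 0}
    parent = {}
    order = [root]
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in prey_to_pred.get(u, []):
            if v not in level:               # first visit = shortest chain from the environment
                level[v] = level[u] + 1
                parent[v] = u                # tree link from level l to level l + 1
                order.append(v)
                queue.append(v)
    A = {v: 1 for v in order}
    for v in reversed(order[1:]):            # reversed search order: children before parents
        A[parent[v]] += A[v]
    C = dict(A)
    for v in reversed(order[1:]):
        C[parent[v]] += C[v]
    return level, A, C

n_species = 4
chain = {i: [i + 1] for i in range(n_species)}          # 0 -> 1 -> 2 -> 3 -> 4
star = {0: list(range(1, n_species + 1))}               # every species feeds on the environment
for name, web in (("chain", chain), ("star", star)):
    level, A, C = allometric(web)
    print(name, "A0 =", A[0], "C0 =", C[0])
```

For the chain $C_0 = A_0(A_0+1)/2$ grows like $A_0^2$, while for the star $C_0 = 2A_0 - 1$ grows like $A_0$, the two limiting behaviours between which real webs are expected to fall.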

Allometric scaling in river networks
$C(A) \propto A^{\eta}$, $\eta = 3/2$
- $A_i$ = drainage area of site $i$
- $C_i$ = water in the basin of $i$

Banavar, Maritan, Rinaldo Nature 1999

Allometric scaling in vascular systems
C(A) Aη η = 4/3 Kleiber’s law of metabolism: B(M) M 3/4 A0= metabolic rate (B) C0= nutrient volume (M) General case (dimension d): η = (d+1)/d maximum efficiency West, Brown, Enquist Science 1999; Banavar, Maritan, Rinaldo Nature 1999

Allometric scaling in food webs
The resource transfer is universal and efficient (common organising principle?)
$C(A) \propto A^{\eta}$, $\eta =$ …
Garlaschelli, Caldarelli, Pietronero, Nature 423, (2003)

Transport efficiency in food webs
The constraint limiting the efficiency is not the geometry, but the competition!
- Chain: $C(A) \propto A^2$ (inefficient)
- Competition: $C(A) \propto A^{\eta}$, $1 < \eta < 2$
- Star: $C(A) \propto A$ (efficient)

Summary: food web structure decomposition
Spanning trees and loops: complementary properties and roles
Tree-forming links:
1) Determine the degree of transportation EFFICIENCY
2) Measured by the allometric exponent η
3) η is universal! (Common evolutionary principle?)

Loop-forming links:
1) Determine the STABILITY under species removal
2) Measured by the directed connectance c
3) c varies! (Web-specific organization?)

Figure labels: source, species.

Out-of-equilibrium statistical mechanics of networks

Restoring the feedback
We focus on the case when topology and dynamics evolve over comparable timescales, so that the dynamical process and the topological evolution feed back on each other. As a result, the process is self-organized and a non-equilibrium stationary state is reached, independently of (otherwise arbitrary) initial conditions.
We choose:
- the simplest possible dynamical rule: Bak-Sneppen model
- the simplest possible network formation mechanism: Fitness model

Coupling the Bak-Sneppen and the fitness model
Bak-Sneppen model on fixed graphs (Bak, Sneppen PRL 1993 – Flyvbjerg, Sneppen, Bak PRL 1993 – Kulkarni, Almaas, Stroud cond-mat/ – Moreno, Vazquez EPL – Lee, Kim PRE – Masuda, Goh, Kahng PRE 2005)
1) Specify graph, and keep it fixed;
2) assign each vertex i a fitness $x_i$ drawn uniformly in (0,1);
3) draw anew fitnesses of least fit vertex and its neighbours;
4) evolve fitnesses iterating 3).

Fitness network model with quenched fitnesses (Caldarelli et al. PRL 2002 – Boguñá, Pastor-Satorras PRE 2003)
1) Specify fitness distribution $\rho(x)$;
2) assign each vertex i a fitness $x_i$ drawn from $\rho(x)$, and keep it fixed;
3) draw network by joining i and j with probability $f(x_i, x_j)$;
4) repeat realizations and perform ensemble average.

Coupled (self-organized) model:
1) Assign each vertex i a fitness $x_i$ drawn from any distribution you like;
2) draw network by joining i and j with probability $f(x_i, x_j)$;
3) draw anew fitnesses of least fit vertex and its neighbours, uniformly in (0,1);
4) draw anew links of least fit vertex and its neighbours with probability $f(x_i, x_j)$;
5) repeat from 3).
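The five steps of the coupled model translate almost line by line into a simulation. The sketch below is a minimal illustration: the connection probability $f(x,y) = zxy/(1+zxy)$, the value of z and the function names are assumptions made for the example, and any other $f$ can be passed in.

```python
import random

def connection(z):
    """Illustrative connection probability f(x, y) = z*x*y / (1 + z*x*y) (an assumption)."""
    return lambda a, b: z * a * b / (1.0 + z * a * b)

def update(x, nbrs, f, rng):
    """One iteration: steps 3) and 4) of the coupled model."""
    n = len(x)
    least = min(range(n), key=x.__getitem__)
    updated = {least} | nbrs[least]          # least fit vertex and its neighbours
    for i in updated:
        x[i] = rng.random()                  # step 3: new fitnesses, uniform in (0,1)
    for i in updated:                        # step 4a: erase all links of updated vertices
        for j in nbrs[i]:
            nbrs[j].discard(i)
        nbrs[i] = set()
    for i in updated:                        # step 4b: redraw them with the new fitnesses
        for j in range(n):
            if j == i or (j in updated and j < i):
                continue                     # each pair is drawn exactly once
            if rng.random() < f(x[i], x[j]):
                nbrs[i].add(j)
                nbrs[j].add(i)

def simulate(n, f, steps, rng):
    x = [rng.random() for _ in range(n)]     # step 1 (uniform initial fitnesses chosen here)
    nbrs = [set() for _ in range(n)]
    for i in range(n):                       # step 2: initial network
        for j in range(i + 1, n):
            if rng.random() < f(x[i], x[j]):
                nbrs[i].add(j)
                nbrs[j].add(i)
    for _ in range(steps):                   # step 5: iterate
        update(x, nbrs, f, rng)
    return x, nbrs

x, nbrs = simulate(200, connection(5.0), 2000, random.Random(0))
print("lowest fitness after the run:", min(x))
print("mean degree:", sum(len(s) for s in nbrs) / len(nbrs))
```

Histogramming x after many iterations gives a numerical estimate of the stationary fitness distribution for the chosen $f$.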

Typical iteration of the model:

Analytical solution for arbitrary f(x,y)
- Stationary fitness distribution (annotations on the formula): one part is uniform, as in standard BS; the other is a novel result and depends on x (not uniform)
- Distribution of minimum fitness: uniform
- Critical threshold $\tau$ obtained from normalization condition:

D. Garlaschelli, A. Capocci, G. Caldarelli, Nature Physics 3, (2007)

Analytical solution for arbitrary f(x,y)
- Degree versus fitness:
- Stationary degree distribution:

Similarly, all other topological properties are derived as in the static fitness model

Particular choices of f(x,y)
Null case: random graph ("grandcanonically" equivalent to random-neighbor BS model)
- Stationary fitness distribution: step-like, as in random-neighbor BS model (if sparse)
- Critical threshold:
- Dynamical regimes (subcritical, sparse, dense) rooted in an underlying percolation transition, located at

Particular choices of f(x,y)
Simplest nontrivial (and unbiased) case: configuration model; see Garlaschelli and Loffredo, Phys. Rev. E 78, (R) (2008).
- Stationary fitness distribution: Zipf (but normalizable!)
- Critical threshold:
- Regimes: subcritical, sparse, dense. Conjecture (verified later): underlying percolation transition, located at

Stationary fitness distribution
In the self-organized model, it is no longer step-like (as in the BS model on fitness-independent networks) but power-law:

Theoretical results against simulations
Power-law fitness distribution (above $\tau$):

Check the percolation transition conjecture
Power-law cluster size distribution at the transition


Degree versus fitness
The "saturation" reflects repulsion between large degrees: implies disassortativity and hierarchy (not shown)

Cumulative degree distribution
Scale-free degree distribution (above $\tau$)

Average fitness versus threshold

References

Reciprocity
- D. Garlaschelli, M. I. Loffredo, Phys. Rev. Lett. 93, 268701 (2004)
- D. Garlaschelli, M. I. Loffredo, Phys. Rev. E 73, (R) (2006)

Weighted networks
- D. Garlaschelli, M. I. Loffredo, Phys. Rev. Lett. 102, (2009)
- D. Garlaschelli, New Journal of Physics 11, (2009)

Food web scaling
- D. Garlaschelli, G. Caldarelli, L. Pietronero, Nature 423, (2003)
- D. Garlaschelli, Eur. Phys. J. B 38(2), 277 (2004)

Out-of-equilibrium model
- D. Garlaschelli, A. Capocci, G. Caldarelli, Nature Physics 3, (2007)
- G. Caldarelli, A. Capocci, D. Garlaschelli, Eur. Phys. J. B 64, (2008)