Scaling, renormalization and self-similarity in complex networks. Chaoming Song (CCNY), Lazaros Gallos (CCNY), Shlomo Havlin (Bar-Ilan, Israel), Hernan A. Makse. Levich Institute and Physics Dept., City College of New York. [Figure: protein interaction network]
Are scale-free networks really free of scale? "If you had asked me yesterday, I would have said surely not," said Barabasi (Science News, February 2, 2005). Small world contradicts self-similarity! The small-world effect shows that the distance between nodes grows logarithmically with N (the network size): $\langle \ell \rangle \sim \ln N$. OR: a self-similar (fractal) topology is defined by a power-law relation: $N \sim \ell^{d_B}$. AIM: to determine how the network behaves under a scale transformation. Implications for: 1. Dynamics 2. Modularity 3. Universality
Internet: connectivity map, with selected backbone ISPs (Internet Service Providers) colored separately. Faloutsos et al., SIGCOMM 99.
Yeast Protein-Protein Interaction Map. J. Han et al., Nature (2004). Individual proteins. Physical interactions from the filtered yeast interactome database: 2493 high-confidence interactions observed by at least two methods (yeast two-hybrid). 1379 proteins, $\langle k \rangle = 3.6$. Colored according to protein function in the cell: transcription, translation, transcription control, protein fate, genome maintenance, metabolism, unknown, etc. Modular structure according to function! From the MIPS database, mips.gsf.de
Metabolic network of biochemical reactions in E. coli. Chemical substrates. Biochemical interactions: enzyme-catalyzed reactions that transform one metabolite into another. Modular structure according to the biochemical class of the metabolic products of the organism. Colored according to product class: lipids, essential elements, protein, peptides and amino acids, coenzymes and prosthetic groups, carbohydrates, nucleotides and nucleic acids. H. Jeong et al., Nature 407, 651 (2000)
Biological networks. Protein homology: similarities between sequences of amino acids (BLAST). Network of 5 million proteins; 1.2 TB of data, growing at 50 GB per month. Adai et al., J Mol Biol (2004). Tree of life: complex network of species representing their evolutionary history, ~90,000 species.
Introduction to fractals. In Nature there exist many examples of random fractals: coastlines, rivers, mountains, clouds, lightning, neurons.
How long is the coastline of Norway? It depends on the length of your ruler. Fractals look the same on all scales: they are "scale-invariant". Fractal dimension $d_B$, box covering method: $N_B(\ell_B) \sim \ell_B^{-d_B}$, where $\ell_B$ is the box length and $N_B$ is the total number of boxes.
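The box-covering definition can be illustrated with a short sketch: given box lengths and the corresponding box counts, the fractal dimension is minus the slope of a straight-line fit in log-log coordinates. The function name `fit_box_dimension` and the synthetic data are assumptions made for illustration, not material from the talk.

```python
import math

def fit_box_dimension(box_sizes, box_counts):
    """Least-squares slope of log N_B versus log l_B; returns d_B = -slope."""
    xs = [math.log(l) for l in box_sizes]
    ys = [math.log(n) for n in box_counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx

# Synthetic (hypothetical) data obeying N_B = 1024 * l_B^(-2) exactly.
sizes = [1, 2, 4, 8, 16]
counts = [1024 * l ** -2 for l in sizes]
d_B = fit_box_dimension(sizes, counts)
print(round(d_B, 6))
```

On real data the fit would normally be restricted to the range of box lengths over which the log-log plot is actually straight.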
Boxing in Biology. How to zoom out of a complex network? Generate boxes where all nodes are within a distance $\ell_B$. Calculate the number of boxes, $N_B$, of size $\ell_B$ needed to cover the network. We need the minimum number of boxes: an NP-complete optimization problem!
Most efficient tiling of the network. [Figure: example coverings with 4 boxes and 5 boxes.] 8-node network: easy to solve. 300,000-node network: mapping to the graph colouring problem, which is NP-complete, so a greedy algorithm is used to approximate the minimum number of boxes.
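A minimal sketch of the colouring idea, under stated assumptions: two nodes may not share a box if their distance is at least the box size, so they are treated as joined in an auxiliary graph, and a greedy colouring of that graph assigns box labels. The function names and the 8-node path used as test input are hypothetical; the convention "distance strictly less than l_B inside a box" is also an assumption.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def greedy_box_cover(adj, l_B):
    """Greedy colouring: nodes with the same label are pairwise closer than l_B."""
    nodes = sorted(adj)
    dist = {u: bfs_distances(adj, u) for u in nodes}
    box_of = {}
    for u in nodes:
        # Labels already taken by nodes that are too far from u to share its box.
        forbidden = {box_of[v] for v in box_of
                     if dist[u].get(v, float("inf")) >= l_B}
        box = 0
        while box in forbidden:
            box += 1
        box_of[u] = box
    return box_of

# Hypothetical 8-node path graph 0-1-2-...-7.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j < 8] for i in range(8)}
boxes = greedy_box_cover(path, 3)
n_boxes = len(set(boxes.values()))
print(n_boxes)
```

The greedy result depends on the node ordering and is only an upper bound on the minimum number of boxes; in practice one would repeat it over many orderings and keep the smallest count.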
Burning algorithms. 1. Compact box burning (CBB). 2. Maximum excluded mass burning (MEMB): burning from the hubs with radius r. Song et al., JSTAT (2007). Minimizing the number of boxes is analogous to maximizing the mass of each box: implications for modularity.
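As a rough illustration of the first algorithm, the sketch below follows the usual description of compact box burning: start from all uncovered nodes as candidates, pick one at random, discard every candidate that is too far from it, and repeat until no candidates remain. This is a reading of the method under assumptions (box condition "distance strictly less than l_B", hypothetical function names and test graph), not the authors' code.

```python
import random
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def compact_box_burning(adj, l_B, seed=0):
    """Cover the graph with boxes whose nodes are pairwise closer than l_B."""
    rng = random.Random(seed)
    uncovered = set(adj)
    boxes = []
    while uncovered:
        candidates = set(uncovered)
        box = []
        while candidates:
            p = rng.choice(sorted(candidates))
            candidates.discard(p)
            box.append(p)
            dist = bfs_distances(adj, p)
            # Keep only candidates still compatible with every node in the box.
            candidates = {v for v in candidates
                          if dist.get(v, float("inf")) < l_B}
        boxes.append(box)
        uncovered -= set(box)
    return boxes

# Hypothetical test graph: a ring of 12 nodes.
ring = {i: [(i - 1) % 12, (i + 1) % 12] for i in range(12)}
cbb_boxes = compact_box_burning(ring, 3, seed=1)
print(len(cbb_boxes), cbb_boxes)
```

Because the seeds are chosen at random, the number of boxes varies between runs; the smallest value over many runs is the natural estimate of $N_B$.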
Two universality classes. [Plot: $\log(N_B)$ versus $\log(\ell_B)$, with slope $-d_B$.] Panels: EUCLIDEAN FRACTALS (percolation cluster: holes at all scales); EUCLIDEAN NON-FRACTALS (compact cluster); TOPOLOGICAL FRACTALS; TOPOLOGICAL NON-FRACTALS.
Box covering in yeast: protein interaction network
Many complex networks are fractal. Song, Havlin, Makse, Nature (2005). Biological networks: metabolic, protein interaction. Three domains of life: archaea, bacteria, eukarya. E. coli, H. sapiens, yeast. 43 organisms - all scale. [Panel: yeast]
More topological fractals: WWW (nd.edu domain, 300,000 web pages). 1. Protein homology network 2. Tree of life (taxonomy) 3. Genetic networks (Meyer-Ortmanns, Khang) 4. Neural networks (Yuste)
Internet and social networks are not fractal. INTERNET: router and AS level. The Barabasi-Albert model of preferential attachment does not generate fractal networks. Other models fail too: Erdos-Renyi, hierarchical model, fitness model, JKK model, pseudo-fractal models, etc. All available models fail to predict self-similarity.
Two universality classes. Fractal networks: WWW; biological networks: protein interactions, metabolic, genetic (Meyer-Ortmanns, Khang), taxonomy, tree of life, protein homology network, neural activity network. Non-fractal networks: Internet (routers and AS level); social networks (citations (Khang), IMDB); models based on uncorrelated preferential attachment.
Two ways to calculate fractal dimensions: the box covering method and the cluster growing method. In homogeneous systems (all nodes with similar k) both definitions agree (example: percolation).
Box covering = flat average (power law). Cluster growing = biased (exponential). Different methods yield different results due to the heterogeneous topology. Box covering reveals the self-similarity. Cluster growing reveals the small world. NO CONTRADICTION! THE SAME HUBS ARE USED MANY TIMES IN CG.
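To make the contrast concrete, here is a minimal sketch of the cluster growing measurement: for every seed node, count the nodes within a given number of steps and average over seeds. In a heterogeneous network a hub lies within a few steps of most seeds, so it is counted again and again, which is the bias noted above. The function names and the ring test graph are hypothetical.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def cluster_growing_mass(adj, radius):
    """Average number of nodes within `radius` steps of a seed, over all seeds."""
    masses = []
    for seed in adj:
        dist = bfs_distances(adj, seed)
        masses.append(sum(1 for d in dist.values() if d <= radius))
    return sum(masses) / len(masses)

# Hypothetical homogeneous test graph: a ring of 10 nodes.
ring = {i: [(i - 1) % 10, (i + 1) % 10] for i in range(10)}
mass_r2 = cluster_growing_mass(ring, 2)
print(mass_r2)
```

Box covering, by contrast, assigns every node to exactly one box, so each node contributes once to the average regardless of its degree.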
Renormalization in Complex Networks. NOW, REGARD EACH BOX AS A SINGLE NODE AND ASK: WHAT IS THE DEGREE DISTRIBUTION OF THE NETWORK OF BOXES AT DIFFERENT SCALES?
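One renormalization step can be sketched as follows, assuming a box assignment is already available: every box becomes a super-node, and two super-nodes are linked if any of their members were linked. The function names and the small path example are assumptions made for illustration.

```python
from collections import Counter

def renormalize(edges, box_of):
    """Collapse each box into one super-node; keep one link per connected box pair."""
    coarse_edges = set()
    for u, v in edges:
        a, b = box_of[u], box_of[v]
        if a != b:
            coarse_edges.add((min(a, b), max(a, b)))
    return sorted(coarse_edges)

def degree_histogram(edges):
    """Map degree k to the number of nodes with that degree."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return Counter(degree.values())

# Hypothetical example: a 6-node path covered by three boxes of two nodes each.
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
box_of = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 2}
coarse = renormalize(edges, box_of)
print(coarse, degree_histogram(coarse))
```

Applying `degree_histogram` before and after the step, for several box sizes, is the comparison the slide asks for.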
Statistical properties are invariant under renormalization. [Plot panels: WWW, PIN, E. coli (FRACTALS); Internet (NON-FRACTALS).] Self-similarity: invariant under renormalization. The Internet is not fractal, $d_B \to \infty$, but it is renormalizable.
DYNAMICS: Turning back the time. Repeatedly BOXING the network is the same as going back in time: renormalization runs from the present-day network to a single ancestral node, and time evolution runs the other way. Can we predict the past, if not the future? THE RENORMALIZATION SCHEME
Evolution of complex networks: time evolution corresponds to opening boxes.
How does modularity arise? The boxes have a physical meaning: self-similar nested communities. [Diagram: time evolution from ancestral node to present-day network; renormalization in the reverse direction.] How to identify communities in complex networks?
Is evolution of PIN fractal? Following the phylogenetic tree of life: Ancestral Prokaryote Cell (3.5 billion years ago), Ancestral Eukaryote (1.5 billion years ago), Ancestral Fungus (1 billion years ago), Ancestral yeast (~300 million years ago), Yeast (present day). Side branches: Archaea + Bacteria, Animals + Plants, Other Fungi. COG database, von Mering et al., Nature (2002)
Suggests that present-day networks could have been created following a self-similar, fractal dynamics. Same fractal dimension and scale-free exponent over 3.5 billion years…
Renormalization following the phylogenetic tree P. Uetz, et al. Nature 403 (2000).
Emergence of Modularity in PIN. Boxes are related to the biologically relevant functional modules in the yeast protein interactome. [Diagram: time evolution from ancestral cell to present-day network, renormalization in the reverse direction; module labels: translation, transcription, protein-fate, cellular-fate organization.]
Emergence of modularity in metabolic networks. Appearance of functional modules in the E. coli metabolic network. More robust than non-fractal networks.
Scaling theory of modularity. How are the modules/communities linked? [Diagram: renormalization maps the degree of the nodes, k = 8, to the degree of the communities, k' = 2, i.e. k' = s k with a factor s = 1/4 < 1.] Gallos et al. PNAS (2007)
Theoretical approach to modular networks: scaling theory to the rescue. [Plot: WWW.] The larger the module, the smaller its connectivity. A new exponent describes how modules link.
Scaling relations: a theoretical prediction relating the different exponents. New scaling relation between the exponents for boxes, distance and degree, and the new exponent.
Scaling relations. The communities also follow a self-similar pattern. [Plots: WWW, Metabolic.] The scaling relation works. Labels: fractals, communities/modules, scale-free, prediction.
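The equations on this slide did not survive in the transcript. One relation of this kind, published in Song, Havlin and Makse (Nature, 2005), links the degree exponent $\gamma$ to the box exponent $d_B$ and the degree-rescaling exponent $d_k$; it is assumed here to be the one intended. A sketch of the argument, with $s(\ell_B)$ and $d_k$ named as in that paper rather than in the slide:

```latex
\begin{align*}
  % Renormalizing with boxes of size \ell_B rescales the node count and the degrees:
  N'(\ell_B) &= \ell_B^{-d_B}\, N, &
  k' &= s(\ell_B)\, k, \qquad s(\ell_B) \sim \ell_B^{-d_k}. \\
  % Invariance of P(k) \sim k^{-\gamma}: each box of degree k' corresponds to one hub of degree k.
  N\, P(k)\, dk &= N'\, P(k')\, dk'
  &\Longrightarrow\quad N\, k^{-\gamma} &= N'\, s^{\,1-\gamma}\, k^{-\gamma} \\
  &&\Longrightarrow\quad \ell_B^{\,d_B} &= \ell_B^{\,d_k(\gamma-1)} \\
  &&\Longrightarrow\quad \gamma &= 1 + \frac{d_B}{d_k}.
\end{align*}
```

Read this way, the relation ties together the three properties named on the slides: fractality ($d_B$), how the degree of a module shrinks with scale ($d_k$), and the scale-free exponent ($\gamma$).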
What is the origin of topological fractality? HINT: the key to understanding fractals is in the degree correlations $P(k_1,k_2)$, not in $P(k)$. Can you see the difference? [Maps: Internet map (NON-FRACTAL, compact cluster); yeast protein map and E. coli metabolic map (FRACTAL).]
Quantifying correlations. $P(k_1,k_2)$: probability of finding a node with $k_1$ links connected to a node with $k_2$ links. [Plots of $P(k_1,k_2)$ versus $\log(k_1)$ and $\log(k_2)$, shaded from low to high probability. Internet map (non-fractal): hubs connected with hubs. Metabolic map (fractal): hubs connected with non-hubs.] Gallos et al. (2007)
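A minimal sketch of how such a joint degree distribution could be tabulated from an edge list: compute every node's degree, then count links by the (unordered) pair of degrees at their two ends. The function name and the toy graphs are hypothetical.

```python
from collections import Counter

def joint_degree_distribution(edges):
    """Fraction of links joining a node of degree k1 to a node of degree k2 (k1 <= k2)."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    counts = Counter()
    for u, v in edges:
        k1, k2 = sorted((degree[u], degree[v]))
        counts[(k1, k2)] += 1
    total = len(edges)
    return {pair: c / total for pair, c in counts.items()}

# Hypothetical toy graphs: a star (hub linked only to leaves) and a 4-node path.
star = [(0, 1), (0, 2), (0, 3)]
path = [(0, 1), (1, 2), (2, 3)]
p_star = joint_degree_distribution(star)
p_path = joint_degree_distribution(path)
print(p_star, p_path)
```

Weight concentrated at large k1 and large k2 corresponds to hubs linked to hubs; weight at small k1 and large k2 corresponds to hubs linked to non-hubs.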
Quantify anticorrelation between hubs at all length scales. Renormalize, then count the hubs that are connected directly. Hub-hub correlation function: fraction of hub-hub connections.
Hub-hub correlations are organized in a self-similar way. A larger $d_e$ implies more anticorrelations: larger $d_e$ for fractal networks, smaller for non-fractal networks. Anticorrelations are essential for fractal structures. The exponent $d_e$ determines the joint probability distribution.
What is the origin of fractality? Non-fractal networks: very compact networks; hubs connected with other hubs; strong hub-hub attraction; assortativity. Examples: Internet, social, all available models (BA model, hierarchical random scale-free, JKK, etc.). Fractal networks: less compact networks; hubs connected with non-hubs; strong hub-hub repulsion; disassortativity. Examples: WWW, PIN, metabolic, genetic, neural networks, protein homology, taxonomy.
How to model it? Renormalization reverses time evolution. [Diagram: Mode I and Mode II growth over time.] Both mass and degree increase exponentially with time. Scale-free: offspring nodes are attached to their parents (m = 2 in this case). Song, Havlin, Makse, Nature Physics, 2006
How does the length increase with time? Mode I: NON-FRACTAL, SMALL WORLD. Mode II: FRACTAL.
Combine the two modes: Mode I with probability e, Mode II with probability 1-e. [Diagram: growth over time for e = 0.5, with renormalization as the reverse step.]
Model (m = 2): a multiplicative growth process of the number of nodes and links. With probability e, hubs always stay connected: strong hub attraction, which should lead to a non-fractal network. With probability 1-e, hubs are never connected: strong hub repulsion, which should lead to a fractal network. Analogous to the duplication/divergence mechanism in proteins?
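The growth rule can be sketched in a few lines, under assumptions that go beyond what the slides state: each end of every existing link receives m offspring nodes; with probability e the link between the two parents is kept (attraction), otherwise it is replaced by a link between one offspring of each parent (repulsion). This replacement rule is one reading of the published model, and it triples the distance between existing nodes, as in the rule L(t+1) = 3L(t). All function names are hypothetical.

```python
import random
from collections import deque

def grow(edges, next_id, m=2, e=0.5, rng=None):
    """One growth step; returns the new edge list and the next unused node id."""
    if rng is None:
        rng = random.Random(0)
    new_edges = []
    for u, v in edges:
        kids_u = list(range(next_id, next_id + m))
        kids_v = list(range(next_id + m, next_id + 2 * m))
        next_id += 2 * m
        new_edges += [(u, c) for c in kids_u]   # offspring attached to parent u
        new_edges += [(v, c) for c in kids_v]   # offspring attached to parent v
        if rng.random() < e:
            new_edges.append((u, v))                  # attraction: parents stay linked
        else:
            new_edges.append((kids_u[0], kids_v[0]))  # repulsion: offspring linked instead
    return new_edges, next_id

def distance(edges, source, target):
    """Shortest-path distance between two nodes of a connected graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist[target]

def run(steps, e, m=2, seed=0):
    """Grow from a single link (nodes 0 and 1) for the given number of steps."""
    rng = random.Random(seed)
    edges, next_id = [(0, 1)], 2
    for _ in range(steps):
        edges, next_id = grow(edges, next_id, m, e, rng)
    return edges, next_id

edges_attract, n_attract = run(2, e=1.0)
edges_repel, n_repel = run(2, e=0.0)
print(n_attract, distance(edges_attract, 0, 1))
print(n_repel, distance(edges_repel, 0, 1))
```

Both limits add the same number of nodes per step, but the two original nodes stay adjacent when e = 1 and drift apart multiplicatively when e = 0, which is the difference between the small-world and fractal regimes.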
For both models, at each step the total number of nodes scales as n = 2m + 1 (N(t+1) = nN(t)). Now we investigate how the lengths transform. The modes behave quite differently: Mode I: L(t+1) = L(t) + 2. Mode II: L(t+1) = 2L(t) + 1. Mode III: L(t+1) = 3L(t). This leads to different scaling laws between N and L. Different growth modes lead to different topologies.
Dynamical model. Suppose Mode I occurs with probability e, and Mode II and Mode III with probability 1-e. Then we have: or
The model predicts all exponents in terms of growth rates. At each step the total mass scales with a constant n, all the degrees scale with a constant s, and the length scales with a constant a. From these three rates we predict the fractal exponents.
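A sketch of how the exponents follow from the three growth rates, assuming purely multiplicative growth of mass, degree and length. The symbol $d_k$ for the degree exponent and the relation $\gamma = 1 + d_B/d_k$ are taken from the authors' published papers, not from the surviving slide text.

```latex
\begin{align*}
  N(t+1) &= n\,N(t), & k(t+1) &= s\,k(t), & L(t+1) &= a\,L(t). \\[4pt]
  N \sim L^{d_B} &\;\Longrightarrow\; n = a^{d_B}
     &&\Longrightarrow\quad d_B = \frac{\ln n}{\ln a}, \\
  k \sim L^{d_k} &\;\Longrightarrow\; s = a^{d_k}
     &&\Longrightarrow\quad d_k = \frac{\ln s}{\ln a}, \\
  \gamma &= 1 + \frac{d_B}{d_k} = 1 + \frac{\ln n}{\ln s}.
\end{align*}
% Example with values quoted in the talk: n = 2m + 1 = 5 (m = 2) and a = 3
% give d_B = \ln 5 / \ln 3 \approx 1.46.
```

When the length grows only additively, as in L(t+1) = L(t) + 2, there is no constant factor a > 1, and the mass grows exponentially in the length, so no finite $d_B$ exists; this is the non-fractal, small-world case.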
Predictions. The model reproduces local small world, scale-free and fractality. NON-FRACTAL: attraction between hubs leads to a non-fractal topology, small world globally. FRACTAL: repulsion between hubs leads to a fractal topology, small world locally inside well-defined communities. [Panel: yeast]
The model reproduces the main features of real networks Case 1: e = 0.8: FRACTALS Case 2: e = 1.0: NON-FRACTALS
Summary of scaling exponents and scaling relations, listed for: mass; links; hub-hub correlations (number of hub-hub links); modularity ratio (number of links inside modules relative to the number of links outside modules); modularity exponent.
Modularity is also scale-invariant. Protein homology: similarities between sequences of amino acids (BLAST); network of 5 million proteins; 1.2 TB of data, growing at 50 GB per month. Adai et al., J Mol Biol (2004). Yeast protein interaction. Large modularity. Ultramodularity.
Multiplicative and exponential growth in yeast PIN: length scales, number of conserved proteins and degree.
Self-similar learning dynamics of the brain. Calcium imaging of spontaneous action potentials in large neuronal populations in a slice of the medial prefrontal cortex of a mouse brain. Rafael Yuste and Ikegaya. John Cage minimalist avant-garde music.
Time evolution of the network. Snapshots at t = 15, 30, 45, 60, 75, 90, 105 and 120 sec.
The degree distribution P(k) is invariant under evolution; the plots go from 30 sec to 120 sec. The fractal dimension is also invariant under evolution from 30 sec to 120 sec. Both the degree exponent and the fractal dimension are therefore invariant under the time evolution.
Scale transformation of degree. We verify the formula k(t1) = S(t1|t2) k(t2). Here we fix t2 = 120 sec and take t1 from 30 sec to 105 sec. The linear dependence is verified for the different times t1.
From the theory: N(t) = s(t). The inset shows that both N(t) and s(t) increase exponentially: N(t) ~ exp(0.014t), s(t) ~ exp(0.021t). This gives rise to the following scaling relation: confirmation of the scaling formula for the degree exponent as a function of the fractal exponents.
Tolerance of the network under random failure and intentional attack We plot the largest cluster size as a function of the fraction p of nodes removed
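The measurement described here can be sketched as follows: remove nodes one at a time, either in random order (failure) or in decreasing order of degree (attack), and record the relative size of the largest connected cluster after each removal. The function names, the use of initial degrees for the attack order, and the star-graph example are assumptions made for illustration.

```python
import random
from collections import Counter

def largest_cluster(nodes, edges):
    """Size of the largest connected component among the surviving nodes."""
    parent = {u: u for u in nodes}

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    for u, v in edges:
        if u in parent and v in parent:
            parent[find(u)] = find(v)
    sizes = Counter(find(u) for u in nodes)
    return max(sizes.values(), default=0)

def removal_curve(edges, order):
    """List of (fraction removed p, largest cluster size / original size)."""
    nodes = {x for edge in edges for x in edge}
    n = len(nodes)
    curve = []
    for i, u in enumerate(order, start=1):
        nodes.discard(u)
        curve.append((i / n, largest_cluster(nodes, edges) / n))
    return curve

def attack_order(edges):
    """Nodes sorted by decreasing initial degree (ties broken by node id)."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(degree, key=lambda u: (-degree[u], u))

def failure_order(edges, seed=0):
    """Nodes in a random order."""
    nodes = sorted({x for edge in edges for x in edge})
    random.Random(seed).shuffle(nodes)
    return nodes

# Hypothetical example: a star with one hub (node 0) and five leaves.
star = [(0, i) for i in range(1, 6)]
attack = removal_curve(star, attack_order(star))
leaf_first = removal_curve(star, [1])
print(attack[0], leaf_first[0])
```

On the star, removing the hub first shatters the network at once, while removing a leaf barely changes the largest cluster; comparing the two curves on a real network gives the plot described on the slide.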
A new principle of network dynamics. 1930: solid-state physics, big world. 1960: Erdos-Renyi model, small world; democracy = socialism. 1999: BA model; rich-get-richer = capitalism. 2005: fractal model of modularity; rich-get-richer at the expense of the poor = globalization. Less vulnerable to intentional attacks: designed by evolutionary pressure.
Summary. In contrast to common belief, many real-world networks are self-similar. FRACTALS: WWW, protein interactions, metabolic networks, neural networks, homology networks, tree of life. NON-FRACTALS: Internet, social, all models. Communities/modules are self-similar as well. Scaling theory describes the dynamical evolution. Boxes are related to the functional modules in metabolic and protein networks. Origin of self-similarity: anticorrelation between hubs. Fractal networks are less vulnerable than non-fractal networks.
Graph theoretical representation of a metabolic network. (a) A pathway (catalyzed by Mg$^{2+}$-dependent enzymes). (b) All interacting metabolites are considered equally. (c) For many biological applications it is useful to ignore co-factors, such as the high-energy phosphate donor ATP, which results in a second type of mapping that connects only the main source metabolites to the main products.
More topological fractals: WWW (nd.edu domain, 300,000 web pages); Hollywood film actors (212,000 actors).
Burning algorithms. Compact box burning (CBB). Maximum excluded mass burning (MEMB): burning from the hubs with radius r. Song et al., JSTAT (2007). Minimizing the number of boxes is analogous to maximizing the mass of each box: modularity.