Biological Networks Feng Luo.


1 Biological Networks Feng Luo

2 Copyright notice Many of the images in this PowerPoint presentation come from other people. The copyright belongs to the original authors. Thanks!

3 Biological Networks
Biological systems: made of many non-identical elements that interact with each other in diverse ways. Biological networks: a framework for the study of biological systems.

4 Why Study Networks? It is increasingly recognized that complex systems cannot be described in a reductionist view. Understanding the behavior of such systems starts with understanding the topology of the corresponding network. Topological information is fundamental in constructing realistic models for the function of the network. We have seen the complexity and volume of the data sets we are dealing with. It is impossible to analyze or manage their properties and underlying principles due to their complexity.

5 Graph Terminology
Node; Edge; Directed/Undirected; Degree; Shortest Path/Geodesic distance; Neighborhood; Subgraph; Complete Graph; Clique; Degree Distribution; Hubs
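Several of these terms can be made concrete with a small undirected graph stored as an adjacency dictionary. The sketch below is only an illustration: the node names and helper functions are hypothetical and do not come from the slides.

```python
from collections import deque

# Toy undirected graph: node -> set of neighbors (hypothetical example data).
graph = {
    "A": {"B", "C"},
    "B": {"A", "C", "D"},
    "C": {"A", "B"},
    "D": {"B", "E"},
    "E": {"D"},
}

def degree(g, node):
    """Degree: the number of edges attached to a node."""
    return len(g[node])

def neighborhood(g, node):
    """Neighborhood: the set of nodes adjacent to a node."""
    return set(g[node])

def shortest_path_length(g, source, target):
    """Geodesic distance found by breadth-first search; None if unreachable."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return dist[u]
        for v in g[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return None

def is_clique(g, nodes):
    """Clique: every pair of the given nodes is connected by an edge."""
    nodes = list(nodes)
    return all(v in g[u] for i, u in enumerate(nodes) for v in nodes[i + 1:])
```

In this toy graph, B has the highest degree and would be the closest thing to a hub; A, B and C form a clique.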

6 Types of Biological Networks
Protein interaction networks; Gene regulatory networks; Metabolic networks; Gene co-expression networks; Signal transduction networks; Genetic interaction networks

7 Protein Interactions
Yeast two-hybrid (Y2H) method to detect interactions, using a bait protein and a prey protein. P. Uetz et al., Nature, 2000; Ito et al., PNAS, 2001; …

8 Protein Interaction Network
Nodes: proteins. Links: physical interactions (Jeong et al., 2001).

9

10 Metabolic network (KEGG)
Graph. Node: an object, e.g. a chemical compound. Edge: a relation between objects, e.g. a chemical reaction.

11 Metabolic Network
Nodes: chemicals (substrates). Links: chemical reactions.

12 Metabolic Network Nodes: chemicals (substrates). Links: chemical reactions (Ravasz et al., 2002).

13 Gene Regulation
Proteins are encoded by the DNA of the organism. Proteins regulate the expression of other proteins by interacting with the DNA. [Figure: proteins and an inducer (external signal) acting on DNA; labels: promoter region (ACCGTTGCAT), coding region]

14 Activators increase gene production
[Figure: activator X, its signal Sx, and gene Y with an X binding site. Labels: no transcription; X* (bound activator); INCREASED TRANSCRIPTION of Y]

15 Repressors decrease gene production
[Figure: repressor X, its signal Sx, and gene Y. Labels: unbound repressor; X* (bound repressor); no transcription]

16 Gene Regulatory Networks
Nodes are proteins (or the genes that encode them). [Figure: nodes X and Y]

17 The gene regulatory network of E. coli
Shen-Orr et al., Nature Genetics, 2002. Shallow network with few long cascades. Modular. Compact in-degree, scale-free out-degree (promoter size limitation).

18 Gene regulatory networks

19 CoExpression Network Revealed from Yeast Cell Cycle Data
Functional groups labeled in the figure: 1. Protein fate; 2. Amino acid synthesis; 3. Galactose metabolism; 4. Protein glycosylation and transport; 5. Amino acid metabolism; 6. Mating; 7. Glucogenesis; 8. Unknown; 9. Cell cycle regulation; 10. Stress response; 11. Cell differentiation; 12. Protein synthesis; 13. Cell wall; 14. Energy transport; 15. Ribosomal biogenesis. Unnumbered labels: Cell wall organization; Y’-cluster; Histone; Ribosomal proteins; Mitochondrion; Protein degradation. Data: yeast cell cycle microarray data (Spellman et al., 1998).

20 Signal transduction networks
(BD BioScience) Elements inside the same module are often involved in the same biological process. This separation can originate from spatial localization or from chemical specificity. Insulation allows the cell to carry out many diverse reactions without cross-talk that would harm the cell. Connectivity allows one function to influence another. Functional modules reflect the critical level of biological organization. A modular system can reuse existing, well-tested modules.

21 Properties of Biological Networks
Scale-free; Small-world; Hierarchical; Modular; Robust; Motifs

22 Scale-Free Network
Degree of a node: the number of adjacent nodes. Degree distribution P(k): the frequency of nodes with degree k. Scale-free network: P(k) follows a power law, which is different from random networks.
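As a minimal sketch of the definition of P(k), the function below tabulates the fraction of nodes with each degree for a graph stored as an adjacency dictionary. The star-shaped example graph is hypothetical and only serves to show a single hub next to many low-degree nodes.

```python
from collections import Counter

def degree_distribution(g):
    """Return {k: P(k)}, the fraction of nodes that have degree k."""
    n = len(g)
    counts = Counter(len(neighbors) for neighbors in g.values())
    return {k: c / n for k, c in sorted(counts.items())}

# Hypothetical example: one hub (node 0) connected to four leaves.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
distribution = degree_distribution(star)
```

To look for a power law in a larger network, one would typically plot P(k) against k on log-log axes and check for a roughly straight line.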

23 Erdős-Rényi model (1960)
Connect each pair of nodes with probability p. Example: p = 1/6, N = 10, ⟨k⟩ ~ 1.5. Degree distribution: Poisson. Democratic, random. (Pál Erdős)
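A minimal sketch of the "connect with probability p" rule follows, assuming an undirected graph without self-loops. The function names and the `seed` parameter are illustrative choices, not part of the slides. The expected mean degree is p(N − 1), which gives 1.5 for the slide's values p = 1/6 and N = 10.

```python
import random
from itertools import combinations

def erdos_renyi(n, p, seed=None):
    """Random graph: each of the n*(n-1)/2 node pairs is linked with probability p."""
    rng = random.Random(seed)
    g = {i: set() for i in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            g[u].add(v)
            g[v].add(u)
    return g

def mean_degree(g):
    """Average number of neighbors per node."""
    return sum(len(neighbors) for neighbors in g.values()) / len(g)

def expected_mean_degree(n, p):
    """Expected mean degree of the model: p * (n - 1)."""
    return p * (n - 1)
```

A single 10-node sample can deviate noticeably from 1.5; the agreement with p(N − 1) becomes tight only for larger N or when averaging over many samples.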

24 SCALE-FREE NETWORKS
(1) The number of nodes (N) is NOT fixed. Networks continuously expand by the addition of new nodes. Examples: WWW: addition of new documents. Citation: publication of new papers.
(2) The attachment is NOT uniform. A node is linked with higher probability to a node that already has a large number of links. Examples: WWW: new documents link to well-known sites (CNN, YAHOO, New York Times, etc.). Citation: well-cited papers are more likely to be cited again.

25 Scale-free model
(1) GROWTH: At every timestep we add a new node with m edges (connected to the nodes already present in the system). (2) PREFERENTIAL ATTACHMENT: The probability Π that a new node will be connected to node i depends on the connectivity k_i of that node. Result: P(k) ~ k^{-3}. A.-L. Barabási & R. Albert, Science, 1999
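The two ingredients, growth and preferential attachment, can be sketched in a few lines. This is a hedged illustration rather than the authors' code: it assumes the linear rule Π(k_i) = k_i / Σ_j k_j from the Barabási-Albert paper, starts from a small complete core of m + 1 nodes (an arbitrary choice), and the function name is hypothetical.

```python
import random

def barabasi_albert(n, m, seed=None):
    """Grow a graph to n nodes; each new node attaches to m existing nodes,
    chosen with probability proportional to their current degree."""
    if m < 1 or n <= m:
        raise ValueError("need m >= 1 and n > m")
    rng = random.Random(seed)
    # Assumed starting point: a complete graph on m + 1 nodes.
    g = {i: {j for j in range(m + 1) if j != i} for i in range(m + 1)}
    # Each node appears in 'stubs' once per edge end, so a uniform draw
    # from this list picks a node with probability k_i / sum_j k_j.
    stubs = [u for u, neighbors in g.items() for _ in neighbors]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:  # m distinct, degree-biased targets
            targets.add(rng.choice(stubs))
        g[new] = set()
        for t in targets:
            g[new].add(t)
            g[t].add(new)
            stubs.extend((new, t))
    return g
```

Early nodes accumulate links fastest under this rule, which is what produces the hubs of a scale-free network.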

26 Metabolic network
[Figure panels: Archaea; Bacteria; Eukaryotes] The metabolic networks of organisms from all three domains of life are scale-free networks! H. Jeong, B. Tombor, R. Albert, Z.N. Oltvai, and A.L. Barabasi, Nature, 2000

27 Topology of the protein network
H. Jeong, S.P. Mason, A.-L. Barabasi & Z.N. Oltvai, Nature, 2001

28 Nature (2000)

29 p53 network (mammals)

30 Local clustering
Clustering: my friends will likely know each other! Networks are clustered [large C].

31 Clustering Coefficient
The density of the network surrounding node I, characterized as the number of triangles through I. Related to network modularity. With k the number of neighbors of I and n_I the number of edges between node I's neighbors, C_I = 2 n_I / (k (k - 1)). Example: the center node has 8 (grey) neighbors and there are 4 edges between the neighbors, so C = 4 / ((8 * (8 - 1)) / 2) = 4/28 = 1/7.
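The worked example on this slide can be checked with a short function. The sketch below assumes an undirected graph stored as an adjacency dictionary; which 4 pairs of neighbors are linked is not recoverable from the transcript, so the pairs chosen here are arbitrary.

```python
def clustering_coefficient(g, node):
    """C = (edges among the node's neighbors) / (k * (k - 1) / 2)."""
    neighbors = list(g[node])
    k = len(neighbors)
    if k < 2:
        return 0.0  # convention for nodes with fewer than two neighbors
    links = sum(
        1
        for i, u in enumerate(neighbors)
        for v in neighbors[i + 1:]
        if v in g[u]
    )
    return 2 * links / (k * (k - 1))

# Center node 0 with 8 neighbors (1..8) and 4 edges among those neighbors.
example = {0: set(range(1, 9))}
for i in range(1, 9):
    example[i] = {0}
for u, v in [(1, 2), (3, 4), (5, 6), (7, 8)]:  # arbitrary choice of 4 pairs
    example[u].add(v)
    example[v].add(u)

c_center = clustering_coefficient(example, 0)  # 4 / 28 = 1/7
```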

32 Shortest-Path between nodes

33 Shortest-Path between nodes

34 Small-world Network Every node can be reached from every other by a small number of hops or steps. High clustering coefficient and low mean shortest-path length. Random graphs don’t necessarily have high clustering coefficients. Social networks, the Internet, and biological networks all exhibit small-world network characteristics.
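The two quantities that define a small world, average clustering and mean shortest-path length, can be computed directly. The sketch below uses a ring lattice with two added long-range shortcuts as a toy example. This construction is an assumption borrowed from the Watts-Strogatz picture and is not taken from the slides; all function names are illustrative.

```python
from collections import deque

def ring_lattice(n, k):
    """Each node is linked to its k nearest neighbors on a ring (k even)."""
    g = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            g[i].add(j)
            g[j].add(i)
    return g

def average_clustering(g):
    """Mean of the local clustering coefficients over all nodes."""
    total = 0.0
    for neighbors in g.values():
        nb = list(neighbors)
        k = len(nb)
        if k >= 2:
            links = sum(1 for a in range(k) for b in range(a + 1, k) if nb[b] in g[nb[a]])
            total += 2 * links / (k * (k - 1))
    return total / len(g)

def mean_shortest_path(g):
    """Mean BFS distance over all ordered pairs of mutually reachable nodes."""
    total, pairs = 0, 0
    for source in g:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in g[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

ring = ring_lattice(20, 4)
shortcut = {u: set(vs) for u, vs in ring.items()}
for u, v in [(0, 10), (5, 15)]:  # two long-range links across the ring
    shortcut[u].add(v)
    shortcut[v].add(u)
```

Comparing `ring` with `shortcut` shows the qualitative effect: a couple of long-range links shorten the mean path length while the clustering stays close to its lattice value.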

35 Modularity in Cellular Networks
Hypothesis: Biological functions are carried out by discrete functional modules. Hartwell, L.-H., Hopfield, J. J., Leibler, S., & Murray, A. W., Nature, 1999. [Figure: traditional view of modularity]

36 Modular vs. Scale-free Topology

37 How do we know that metabolic networks are modular?
The clustering coefficient is the same across metabolic networks in different species with the same substrate. Corresponding randomized scale-free network: C(N) ~ N^{-0.75} (simulation, no analytical result). [Figure legend: bacteria; archaea (extreme-environment single-cell organisms); eukaryotes (plants, animals, fungi, protists); scale-free network of the same size]

38 Real Networks Have a Hierarchical Topology
What does it mean? Many highly connected small clusters combine into a few larger but less connected clusters, which combine into even larger and even less connected clusters. The degree of clustering follows: C(k) ~ k^{-1}.

39 Properties of hierarchical networks
1. Scale-free. 2. Clustering coefficient independent of N. 3. Clustering coefficient scales with node degree: C(k) ~ k^{-1}.

40 Hierarchy in biological systems
Metabolic networks Protein networks

41 Can we identify the modules?
Topological overlap: O_T(i,j) = J(i,j) / min(k_i, k_j), where J(i,j) is the number of nodes that both i and j link to, plus 1 if there is a direct (i,j) link.
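A minimal sketch of the topological overlap computation is given below. It assumes the normalization by min(k_i, k_j) used in Ravasz et al. (2002); the toy graph and the function name are hypothetical.

```python
def topological_overlap(g, i, j):
    """J(i, j) / min(k_i, k_j), where J counts the nodes that both i and j
    link to, plus 1 if i and j are directly linked."""
    smaller_degree = min(len(g[i]), len(g[j]))
    if smaller_degree == 0:
        return 0.0  # isolated node: no overlap by convention
    common = len(g[i] & g[j])
    direct = 1 if j in g[i] else 0
    return (common + direct) / smaller_degree

# Toy graph: a triangle A-B-C with a tail C-D-E.
toy = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
```

Pairs inside the triangle get the maximal overlap of 1, while distant pairs such as A and E get 0. Modules can then be sought by clustering the resulting overlap matrix.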

42 Modules in the E. coli metabolism
E. Ravasz et al., Science, 2002

43 Robustness
Complex systems maintain their basic functions even under errors and failures (cell → mutations; Internet → router breakdowns). [Figure: S versus the fraction of removed nodes, f, under node failure; the threshold fc is marked on the f axis, which runs to 1]

44 Robustness of scale-free networks
Topological error tolerance. Failures: γ ≤ 3: fc = 1 (R. Cohen et al., PRL, 2000). Attacks: R. Albert et al., Nature, 2000. [Figure: S versus f for failures and attacks, with fc marked]

45 Attack Tolerance
[Figure panels: max cluster size; path length] The max cluster size changes as nodes are removed. A random network is vulnerable to random attacks; a power-law network is not. Random attack vs. targeted attack.
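The comparison between random and targeted removal can be sketched by tracking the largest connected cluster as nodes are deleted. The code below is an illustration under simple assumptions: the targeted order ranks nodes by their initial degree without recomputing after each removal, and the helper names are hypothetical.

```python
import random
from collections import deque

def largest_component_size(g, removed):
    """Size of the largest connected cluster once 'removed' nodes are deleted."""
    seen = set(removed)
    best = 0
    for start in g:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in g[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best

def removal_curve(g, order):
    """Largest cluster size after removing each node of 'order' in turn."""
    removed, sizes = set(), []
    for node in order:
        removed.add(node)
        sizes.append(largest_component_size(g, removed))
    return sizes

def targeted_order(g):
    """Attack: highest initial degree first."""
    return sorted(g, key=lambda u: len(g[u]), reverse=True)

def random_order(g, seed=None):
    """Failure: nodes in a random order."""
    nodes = sorted(g)
    random.Random(seed).shuffle(nodes)
    return nodes

# Extreme toy case: one hub linked to nine leaves.
star = {0: set(range(1, 10))}
for leaf in range(1, 10):
    star[leaf] = {0}
```

On the star, removing a single leaf barely matters while removing the hub shatters the network; this is the same asymmetry between failures and attacks, in miniature.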

46 Yeast protein network: lethality and topological position
Highly connected proteins are more essential (lethal)... H. Jeong, S.P. Mason, A.-L. Barabasi & Z.N. Oltvai, Nature, 2001

47 Network Motifs

48 Network motifs Comparable to electronic circuit types (i.e., logic gates). The notion of motif, widely used for sequence analysis, is generalizable to the level of networks. Network motifs are defined as recurring patterns of interconnections found within networks at frequencies much higher than those found in randomized networks.

49 Random vs designed/evolved features
Large networks may contain information about design principles and/or evolution of the complex system. Which features are there for a reason? Design principles (e.g. feed-forward loops). Constraints (e.g. all nodes on the Internet must be connected to each other). Evolution and growth dynamics (e.g. network growth is mainly due to gene duplication). All proteins are probably interconnected through at least one part of the DNA-protein-metabolite network.

50 Network motifs Uri Alon et al.: “Network Motifs: Simple Building Blocks of Complex Networks”, Science, 2002. Different networks were found to have different motif abundances. The motifs reflect the underlying processes that generate each type of network.

51 Motifs in the network
[Figure: motif to be found; graph; motif matches in the target graph]

52 Detecting network motifs
There are three main tasks in detecting network motifs: (1) generating an ensemble of proper random networks; (2) counting the subgraphs in the real network and in the random networks; (3) searching for subgraphs that appear disproportionately in one list vs. the other.
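The three tasks can be sketched for a single candidate motif, the feed-forward loop (X→Y, Y→Z, X→Z). This is a simplified illustration, not the published procedure. It assumes a directed edge list without self-loops, it randomizes by degree-preserving edge swaps, and it scores the difference with a z-score. The function names and default parameters are hypothetical.

```python
import random
from statistics import mean, pstdev

def count_ffl(edges):
    """Task 2: count feed-forward loops X->Y, Y->Z, X->Z."""
    succ = {}
    for u, v in edges:
        succ.setdefault(u, set()).add(v)
    count = 0
    for x, targets in succ.items():
        for y in targets:
            for z in succ.get(y, ()):
                if z != x and z in targets:
                    count += 1
    return count

def randomize(edges, n_swaps, rng):
    """Task 1: swap edge targets, (a,b),(c,d) -> (a,d),(c,b), so that every
    node keeps its in-degree and out-degree."""
    edges = list(edges)
    present = set(edges)
    for _ in range(n_swaps):
        i, j = rng.randrange(len(edges)), rng.randrange(len(edges))
        (a, b), (c, d) = edges[i], edges[j]
        # Skip swaps that would create a self-loop or a duplicate edge.
        if a == d or c == b or (a, d) in present or (c, b) in present:
            continue
        present -= {(a, b), (c, d)}
        present |= {(a, d), (c, b)}
        edges[i], edges[j] = (a, d), (c, b)
    return edges

def motif_z_score(edges, n_random=100, seed=0):
    """Task 3: compare the real count with the randomized ensemble."""
    rng = random.Random(seed)
    real = count_ffl(edges)
    counts = [count_ffl(randomize(edges, 10 * len(edges), rng)) for _ in range(n_random)]
    spread = pstdev(counts)
    z = (real - mean(counts)) / spread if spread > 0 else None
    return real, mean(counts), z
```

A large positive z-score suggests the pattern is over-represented relative to networks with the same degree sequence. In practice the same comparison is repeated for every subgraph type of the chosen size.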

53 All 3-node connected subgraphs
There are many isomorphic types of subgraphs with a given number of nodes: 13 different types of 3-node connected subgraph, 199 4-node subgraphs, 9,364 5-node subgraphs, etc. In order to detect network motifs, one needs to count the number of appearances of all types of n-node subgraphs in the network as well as in an ensemble of randomized networks.
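The count of 13 types for 3 nodes can be reproduced by brute force: enumerate every directed graph on n labeled nodes, keep the weakly connected ones, and merge graphs that differ only by a relabeling of nodes. The sketch below assumes no self-loops and allows edges in both directions between a pair of nodes; the function names are illustrative.

```python
from itertools import permutations, product

def weakly_connected(n, edges):
    """True if the graph is connected when edge directions are ignored."""
    adj = {i: set() for i in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, stack = {0}, [0]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == n

def count_connected_digraph_types(n):
    """Number of non-isomorphic weakly connected directed graphs on n nodes."""
    nodes = range(n)
    pairs = [(u, v) for u in nodes for v in nodes if u != v]
    perms = list(permutations(nodes))
    canonical_forms = set()
    for mask in product((0, 1), repeat=len(pairs)):
        edges = [p for p, bit in zip(pairs, mask) if bit]
        if not weakly_connected(n, edges):
            continue
        # Canonical form: the smallest relabeled edge list over all permutations.
        canonical_forms.add(
            min(tuple(sorted((perm[u], perm[v]) for u, v in edges)) for perm in perms)
        )
    return len(canonical_forms)

three_node_types = count_connected_digraph_types(3)  # 13
```

The same function works for n = 4, but the cost grows as 2^(n(n-1)) graphs times n! relabelings, so it is practical only for very small n.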

54 Motifs detected Two significant motifs appearing numerous times in non-homologous gene systems that perform diverse biological functions

55 Motifs II
S. Wuchty, Z. Oltvai & A.-L. Barabasi, Nature Genetics, 2003

56 Probabilistic algorithm for subgraph sampling
The problem: the complexity of exhaustive subgraph enumeration scales as the number of subgraphs, which is exponential in subgraph size and infeasible for large networks with hubs. Solution: an efficient sampling algorithm.

57 Probabilistic algorithm for subgraph sampling
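Instead of examining absolute subgraph counts, we define the subgraph concentration: the number of appearances of a subgraph type divided by the total number of n-node subgraphs. Sampling algorithm: [figure]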

58 Different probabilities of sampling different subgraphs

59 Weight of each sample corrects for its sampling probability
[Figure: example graph with nodes 1 to 7; sample weights W ≈ 1/P: P = 0.14, W = 7; P = 0.33, W = 3]
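The reweighting idea can be shown in isolation. If each sampled subgraph is recorded together with the probability P with which it was drawn, then a weight W = 1/P corrects for subgraphs that are easier to sample. The sketch below covers only this correction step, not how the sampling probabilities themselves are obtained; the subgraph type names and probabilities are hypothetical.

```python
from collections import defaultdict

def estimate_concentrations(samples):
    """samples: iterable of (subgraph_type, sampling_probability).
    Each sample contributes weight 1 / P to its type; concentrations
    are the normalized weighted scores."""
    scores = defaultdict(float)
    for kind, probability in samples:
        scores[kind] += 1.0 / probability
    total = sum(scores.values())
    return {kind: score / total for kind, score in scores.items()}

# Hypothetical samples: the chain was twice as easy to draw as the triangle.
samples = [("chain", 0.5), ("chain", 0.5), ("triangle", 0.25)]
concentrations = estimate_concentrations(samples)
```

Here the chain appears twice as often in the raw samples, but after weighting both types receive the same estimated concentration. The slide's values follow the same rule: 1/0.14 ≈ 7 and 1/0.33 ≈ 3.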

60 Rapid convergence to real concentration
Kashtan et al., Bioinformatics, 2004

61 Runtime almost independent of network size
Kashtan et al., Bioinformatics, 2004

