Networks and Network Topology
Thomas Skøt Jensen, Technical University of Denmark. Systems Biology, April 25th 2007.


1 Systems Biology, April 25th 2007, Thomas Skøt Jensen, Technical University of Denmark
Networks and Network Topology
Thomas Skøt Jensen, Center for Biological Sequence Analysis, Technical University of Denmark

2 Network Example - The Internet
http://www.jeffkennedyassociates.com:16080/connections/concept/image.html

3 Co-authorship at Max Planck
http://www.jeffkennedyassociates.com:16080/connections/concept/image.html

4 Network Measures
Degree k_i
Degree distribution P(k)
Mean path length
Network diameter
Clustering coefficient

5 Network Analysis
Paths: metabolic, signaling pathways
Cliques: protein complexes
Hubs: regulatory modules
Subgraphs: maximally weighted

6 Graphs
A graph G = (V, E) is a set of vertices V and a set of edges E.
A subgraph G' of G is induced by some V' ⊆ V and E' ⊆ E.
Graph properties:
- Connectivity (node degree, paths)
- Cyclic vs. acyclic
- Directed vs. undirected

7 Sparse vs Dense
G(V, E), where |V| = n and |E| = m are the numbers of vertices and edges.
A graph is sparse if m ~ n.
A graph is dense if m ~ n^2.
A simple undirected graph is complete when m = n(n-1)/2.
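The sparse/dense distinction can be made concrete by computing the fraction of possible edges that are present. The following is a minimal sketch, assuming a simple undirected graph without self-loops; the function names are illustrative and not from the slides.

```python
def complete_graph_edges(n):
    """Number of edges in a complete simple undirected graph on n vertices."""
    return n * (n - 1) // 2


def edge_density(n, m):
    """Fraction of all possible edges that are present (0 = empty, 1 = complete)."""
    if n < 2:
        return 0.0
    return m / complete_graph_edges(n)


# The example graph used on the connected-components slides: |V| = 69, |E| = 71.
# Here m is close to n, so the graph is sparse.
example_density = edge_density(69, 71)
```

A density near 0 corresponds to m ~ n (sparse), and a density near 1 corresponds to m ~ n^2 (dense).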

8 Connected Components
G(V,E), |V| = 69, |E| = 71

9 Connected Components
G(V,E), |V| = 69, |E| = 71
6 connected components
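Connected components can be found by repeated breadth-first search. This is a minimal sketch, assuming an undirected graph stored as a dictionary that maps each node to the set of its neighbors; the toy graph is hypothetical and is not the 69-node graph from the slide.

```python
from collections import deque


def connected_components(adj):
    """Return a list of components, each a list of nodes, using BFS."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbor in adj[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components


# Hypothetical toy graph: a chain a-b-c, a pair d-e, and an isolated node f.
toy = {
    "a": {"b"}, "b": {"a", "c"}, "c": {"b"},
    "d": {"e"}, "e": {"d"},
    "f": set(),
}
toy_components = connected_components(toy)
```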

10 Paths
A path is a sequence {x_1, x_2, ..., x_n} such that (x_1, x_2), (x_2, x_3), ..., (x_{n-1}, x_n) are edges of the graph.
A closed path (x_n = x_1) on a graph is called a graph cycle or circuit.
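The path definition translates directly into a check that every consecutive pair of nodes is an edge. This is a small sketch, assuming the same dictionary-of-neighbor-sets representation; the helper names are illustrative.

```python
def is_path(adj, nodes):
    """True if every consecutive pair in the sequence is an edge of the graph."""
    if len(nodes) < 2:
        return False
    return all(b in adj.get(a, ()) for a, b in zip(nodes, nodes[1:]))


def is_closed_path(adj, nodes):
    """True if the sequence is a path that ends where it started (x_n = x_1)."""
    return is_path(adj, nodes) and nodes[0] == nodes[-1]


# Hypothetical triangle graph on nodes 1, 2, 3.
triangle = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}
open_path = is_path(triangle, [1, 2, 3])
closed_path = is_closed_path(triangle, [1, 2, 3, 1])
```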

11 Shortest-Path Between Nodes

12 Shortest-Path Between Nodes

13 Longest Shortest-Path

14 Small-world Network
Every node can be reached from every other node in a small number of hops or steps.
High clustering coefficient and low mean shortest-path length
- Random graphs don’t necessarily have high clustering coefficients
Social networks, the Internet, and biological networks all exhibit small-world network characteristics.

15 Network Representation
gene A regulates gene B: regulatory interactions (protein-DNA)
gene A binds gene B: functional complex; B is a substrate of A (protein-protein)
gene A's reaction product is a substrate for gene B: metabolic pathways

16 Representation of Metabolic Reactions

17 Network Measures: Degree

18 Degree Distribution
P(k) is the probability of each degree k, i.e. the fraction of nodes having that degree.
For random networks, P(k) is peaked around the mean degree (Poisson, approximately normal).
For real networks the distribution is often a power law: P(k) ~ k^(-γ)
Such networks are said to be scale-free.
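P(k) can be estimated from a graph by counting how many nodes have each degree. This is a minimal sketch, assuming an undirected graph stored as a dictionary of neighbor sets; the star graph is a hypothetical example with one hub.

```python
from collections import Counter


def degree_distribution(adj):
    """Return {k: fraction of nodes with degree k}, sorted by k."""
    degrees = [len(neighbors) for neighbors in adj.values()]
    counts = Counter(degrees)
    n = len(degrees)
    return {k: counts[k] / n for k in sorted(counts)}


# Hypothetical star graph: one hub of degree 4 and four leaves of degree 1.
star = {
    "hub": {"a", "b", "c", "d"},
    "a": {"hub"}, "b": {"hub"}, "c": {"hub"}, "d": {"hub"},
}
star_pk = degree_distribution(star)
```

Plotting P(k) against k on log-log axes is a common way to look for power-law behavior, since P(k) ~ k^(-γ) appears as a straight line with slope -γ.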

19 Interconnected Regions: Modules

20 Clustering Coefficient
k: number of neighbors of node I
n_I: number of edges between node I's neighbors
C_I = 2 n_I / (k (k-1))
The density of the network surrounding node I, characterized as the number of triangles through I. Related to network modularity.
Example: the center node has 8 (grey) neighbors and there are 4 edges between the neighbors, so C = 2*4 / (8*(8-1)) = 8/56 = 1/7
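The worked example can be reproduced in a few lines. This is a sketch, assuming an undirected graph as a dictionary of neighbor sets; the placement of the 4 edges among the 8 neighbors is hypothetical, since the slide figure is not available here, but the count matches the slide.

```python
def build_graph(edges):
    """Build an undirected adjacency dictionary from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def clustering_coefficient(adj, node):
    """C = 2 * n / (k * (k - 1)), with n the number of edges among the k neighbors."""
    neighbors = list(adj[node])
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i, u in enumerate(neighbors)
        for v in neighbors[i + 1:]
        if v in adj[u]
    )
    return 2 * links / (k * (k - 1))


# Center node "c" with 8 neighbors n0..n7 and 4 edges between the neighbors.
edges = [("c", f"n{i}") for i in range(8)]
edges += [("n0", "n1"), ("n2", "n3"), ("n4", "n5"), ("n6", "n7")]
center_cc = clustering_coefficient(build_graph(edges), "c")  # 8/56 = 1/7
```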

21 Hierarchical Networks

22 Detecting Hierarchical Organization

23 Knock-out Lethality and Connectivity

24 Target the hubs to have an efficient safe sex education campaign
Lewin Bo, et al., Sex i Sverige; Om sexuallivet i Sverige 1996, Folkhälsoinstitutet, 1998

25 Scale-Free Networks are Robust
Complex systems (cell, Internet, social networks) are resilient to component failure.
Network topology plays an important role in this robustness
- Even if ~80% of nodes fail at random, the remaining ~20% still maintain network connectivity
Attack vulnerability if hubs are selectively targeted
In yeast, only ~20% of proteins are lethal when deleted, and these are 5 times more likely to have degree k > 15 than k < 5.

26 Other Interesting Features
Cellular networks are disassortative: hubs tend not to interact directly with other hubs.
Hubs tend to be “older” proteins (so far claimed for protein-protein interaction networks only).
Hubs also seem to be under more evolutionary pressure: their protein sequences are more conserved than average between species (shown in yeast vs. worm).
Experimentally determined protein complexes tend to contain solely essential or solely non-essential proteins, which is further evidence for modularity.

27 Summary: Network Measures
Degree k_i
- The number of edges involving node i
Degree distribution P(k)
- The probability (frequency) of nodes of degree k
Mean path length
- The average shortest path length over all node pairs
Network diameter
- i.e. the longest shortest path
Clustering coefficient
- A high CC is found for modules
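Mean path length and diameter both follow from all-pairs shortest paths, which breadth-first search gives for unweighted graphs. This is a minimal sketch, assuming an undirected graph as a dictionary of neighbor sets; unreachable pairs are simply skipped here, which is one of several possible conventions.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def path_length_stats(adj):
    """Return (mean shortest path length, diameter) over reachable node pairs."""
    lengths = []
    for source in adj:
        for target, d in bfs_distances(adj, source).items():
            if target != source:
                lengths.append(d)
    if not lengths:
        return 0.0, 0
    return sum(lengths) / len(lengths), max(lengths)


# Hypothetical chain a-b-c-d: pair distances are 1, 2, 3, 1, 2, 1.
chain = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
chain_mean, chain_diameter = path_length_stats(chain)
```

Each unordered pair is counted twice (once from each end), which leaves the mean and the maximum unchanged.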

28 Finding Overrepresented Motifs

29 Metabolic and Transcription Factor Networks

30 Overrepresented Motifs

31 Identifying protein complexes in protein-protein interaction networks

32 Identifying protein complexes from PPI data
Identifying protein complexes from protein-protein interaction data requires computational tools.
Barabasi & Oltvai, Nature Reviews, 2004

33 The MCODE algorithm
MCODE: Molecular Complex Detection
The three steps of MCODE
1. Vertex weighting
2. Complex prediction
3. Post-processing

34 Vertex (node) weighting
1. Find neighbors

35 Vertex (node) weighting
1. Find neighbors
2. Get highest k-core graph
K-core graph: a graph of minimal degree k, i.e. all nodes must have at least k connections
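A k-core can be computed by repeatedly peeling off nodes that have fewer than k connections. This is a sketch, assuming an undirected graph without self-loops stored as a dictionary of neighbor sets; the function names and the toy graph are illustrative.

```python
def build_graph(edges):
    """Build an undirected adjacency dictionary from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def k_core(adj, k):
    """Peel nodes of degree < k until every remaining node has degree >= k."""
    core = {node: set(neighbors) for node, neighbors in adj.items()}
    changed = True
    while changed:
        changed = False
        for node in list(core):
            if len(core[node]) < k:
                for neighbor in core.pop(node):
                    core[neighbor].discard(node)
                changed = True
    return core


def highest_k_core(adj):
    """Return (k_max, core) for the largest k with a non-empty k-core."""
    k = 0
    best = {node: set(neighbors) for node, neighbors in adj.items()}
    while True:
        candidate = k_core(adj, k + 1)
        if not candidate:
            return k, best
        k, best = k + 1, candidate


# Hypothetical graph: a triangle a-b-c with a pendant node d attached to a.
g = build_graph([("a", "b"), ("b", "c"), ("a", "c"), ("a", "d")])
k_max, core = highest_k_core(g)  # the triangle is the 2-core; d is peeled off
```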

36 Vertex (node) weighting
1. Find neighbors
2. Get highest k-core graph
3. Calculate density of k-core graph
Density: number of observed edges, E, divided by the total number of possible edges, E_max
E_max = V (V-1)/2 (networks without loops)

37 Vertex (node) weighting
1. Find neighbors
2. Get highest k-core graph
3. Calculate density of k-core graph
4. Calculate vertex (node) weight: Density * k_max
Density: number of observed edges, E, divided by the total number of possible edges, E_max
E_max = V (V-1)/2 (networks without loops)
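The four weighting steps can be chained into one function. This is a sketch under stated assumptions, not the MCODE reference implementation: the graph is undirected without self-loops, stored as a dictionary of neighbor sets, and the neighborhood of a node is assumed to include the node itself. All function names are illustrative.

```python
def induced_subgraph(adj, nodes):
    """Restrict the graph to the given nodes and the edges between them."""
    nodes = set(nodes)
    return {node: adj[node] & nodes for node in nodes}


def k_core(adj, k):
    """Peel nodes of degree < k until every remaining node has degree >= k."""
    core = {node: set(neighbors) for node, neighbors in adj.items()}
    changed = True
    while changed:
        changed = False
        for node in list(core):
            if len(core[node]) < k:
                for neighbor in core.pop(node):
                    core[neighbor].discard(node)
                changed = True
    return core


def highest_k_core(adj):
    """Return (k_max, core) for the largest k with a non-empty k-core."""
    k = 0
    best = {node: set(neighbors) for node, neighbors in adj.items()}
    while True:
        candidate = k_core(adj, k + 1)
        if not candidate:
            return k, best
        k, best = k + 1, candidate


def density(adj):
    """Observed edges E divided by E_max = V (V - 1) / 2."""
    v = len(adj)
    if v < 2:
        return 0.0
    e = sum(len(neighbors) for neighbors in adj.values()) / 2
    return e / (v * (v - 1) / 2)


def vertex_weight(adj, node):
    """Steps 1-4: neighborhood, highest k-core, its density, density * k_max."""
    neighborhood = induced_subgraph(adj, adj[node] | {node})  # step 1
    k_max, core = highest_k_core(neighborhood)                # step 2
    return density(core) * k_max                              # steps 3 and 4


# Hypothetical graph: a triangle a-b-c with a pendant node d attached to a.
g = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
weights = {node: vertex_weight(g, node) for node in g}
```

In this toy graph the triangle nodes get weight 2.0 (a complete 2-core) and the pendant node gets weight 1.0, so nodes inside dense regions score higher.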

38 Molecular complex prediction
Complex prediction
1. Seed complex by nodes with highest weight
2. Include neighbors if the vertex weight is above threshold (VWP)
3. Repeat step 2 until no more nodes can be included
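The seed-and-grow step can be sketched as follows. The slide does not spell out how the VWP threshold is computed; this sketch assumes the threshold is the seed weight times (1 - vwp), and that each node joins at most one complex. Both are assumptions, and the names and default value are illustrative.

```python
def predict_complexes(adj, weights, vwp=0.2):
    """Grow complexes outward from the highest-weighted unused nodes."""
    seen = set()
    complexes = []
    # Step 1: visit candidate seeds from highest to lowest weight.
    for seed in sorted(weights, key=weights.get, reverse=True):
        if seed in seen:
            continue
        threshold = weights[seed] * (1 - vwp)  # assumed form of the VWP cutoff
        members = {seed}
        seen.add(seed)
        frontier = [seed]
        # Steps 2 and 3: keep including neighbors above the threshold.
        while frontier:
            node = frontier.pop()
            for neighbor in adj[node]:
                if neighbor not in seen and weights[neighbor] > threshold:
                    seen.add(neighbor)
                    members.add(neighbor)
                    frontier.append(neighbor)
        complexes.append(members)
    return complexes


# Hypothetical graph: a triangle a-b-c with a pendant node d attached to a,
# using precomputed vertex weights.
g = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
w = {"a": 2.0, "b": 2.0, "c": 2.0, "d": 1.0}
predicted = predict_complexes(g, w)
```

Here the triangle becomes one complex and the pendant node is left as a singleton, which the post-processing step would discard.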

39 Post-processing
Complex post-processing
1. Complexes must contain at least a 2-core graph
2. Include neighbors if the vertex weight is above the fluff parameter (optional)
3. Haircut: remove nodes with a degree less than two (optional)
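The 2-core filter and the haircut can be sketched together, since both rely on peeling nodes with fewer than two connections inside the complex. This is an illustrative sketch rather than the MCODE implementation: the haircut is assumed to be applied repeatedly until no such node remains, and the optional fluff step is omitted.

```python
def two_core_members(adj, members):
    """Nodes of the complex that survive repeated removal of degree < 2 nodes."""
    sub = {node: adj[node] & members for node in members}
    changed = True
    while changed:
        changed = False
        for node in list(sub):
            if len(sub[node]) < 2:
                for neighbor in sub.pop(node):
                    sub[neighbor].discard(node)
                changed = True
    return set(sub)


def post_process(adj, complexes, haircut=True):
    """Drop complexes without a 2-core; optionally trim them to that core."""
    kept = []
    for members in complexes:
        core = two_core_members(adj, set(members))
        if not core:
            continue  # step 1: no 2-core, so the complex is discarded
        kept.append(core if haircut else set(members))  # step 3 (optional)
    return kept


# Hypothetical graph: triangle a-b-c, pendant d on a, and a separate pair e-f.
g = {
    "a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"},
    "e": {"f"}, "f": {"e"},
}
candidates = [{"a", "b", "c", "d"}, {"e", "f"}]
trimmed = post_process(g, candidates)
untrimmed = post_process(g, candidates, haircut=False)
```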

40 Identifying active subgraphs

41 Active Subgraphs
Find high-scoring subnetwork based on data integration
Ideker T, Ozier O, Schwikowski B, Siegel AF. Discovering regulatory and signaling circuits in molecular interaction networks. Bioinformatics. 2002;18 Suppl 1:S233-40.

42 Scoring a Sub-graph
Ideker T, Ozier O, Schwikowski B, Siegel AF. Discovering regulatory and signaling circuits in molecular interaction networks. Bioinformatics. 2002;18 Suppl 1:S233-40.

43 Significance Assessment of Active Module
Score distributions for the 1st to 5th best-scoring modules before (blue) and after (red) randomizing Z-scores (“states”). Randomization disrupts the correlation between gene expression and network location.
Ideker T, Ozier O, Schwikowski B, Siegel AF. Discovering regulatory and signalling circuits in molecular interaction networks. Bioinformatics. 2002;18 Suppl 1:S233-40.

44 Finding “Active” Pathways in a Large Network is Hard
Finding the highest-scoring subnetwork is NP-hard, so we use heuristic search algorithms to identify a collection of high-scoring subnetworks (local optima).
Simulated annealing and/or greedy search starting from an initial subnetwork “seed”
Considerations: local topology, subnetwork score significance (is the score higher than would be expected at random?), multiple states (conditions)
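A greedy search from a seed can be sketched as below. The scoring function is an assumption, since the slide's formula is not in the transcript: the sketch uses an aggregate z-score, the sum of member z-scores divided by the square root of the subnetwork size. The graph, the z-scores, and the function names are hypothetical.

```python
import math


def subnetwork_score(z, members):
    """Aggregate z-score: sum of member z-scores divided by sqrt(size)."""
    return sum(z[node] for node in members) / math.sqrt(len(members))


def greedy_active_subnetwork(adj, z, seed):
    """Grow from the seed, adding the best-improving neighbor each round."""
    members = {seed}
    score = subnetwork_score(z, members)
    while True:
        frontier = {nb for node in members for nb in adj[node]} - members
        best_node, best_score = None, score
        for candidate in sorted(frontier):
            candidate_score = subnetwork_score(z, members | {candidate})
            if candidate_score > best_score:
                best_node, best_score = candidate, candidate_score
        if best_node is None:
            return members, score  # no neighbor improves the score
        members.add(best_node)
        score = best_score


# Hypothetical chain a-b-c-d with per-gene z-scores.
chain = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
z = {"a": 3.0, "b": 2.5, "c": -1.0, "d": 4.0}
found, found_score = greedy_active_subnetwork(chain, z, "a")
full_score = subnetwork_score(z, set(chain))
```

In this toy case the greedy search stops at {a, b} because adding the low-scoring node c lowers the score, even though the whole chain scores higher. That is a local optimum of the kind the slide mentions, and it is one reason to also consider simulated annealing, which can accept temporarily worse moves.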

45 Summary
Network measures
- degree, network diameter, degree distributions, clustering coefficient
Network modularity and robustness from hubs
Analyzing networks
- Finding motifs, identifying modules (complexes)
Data integration
- Finding active subnetworks


