Detecting topological patterns in protein networks Sergei Maslov Brookhaven National Laboratory.

Slides:



Advertisements
Similar presentations
Network analysis Sushmita Roy BMI/CS 576
Advertisements

Biological Networks Analysis Degree Distribution and Network Motifs
Network biology Wang Jie Shanghai Institutes of Biological Sciences.
The Architecture of Complexity: Structure and Modularity in Cellular Networks Albert-László Barabási University of Notre Dame title.
An Intro To Systems Biology: Design Principles of Biological Circuits Uri Alon Presented by: Sharon Harel.
The multi-layered organization of information in living systems
CSE Fall. Summary Goal: infer models of transcriptional regulation with annotated molecular interaction graphs The attributes in the model.
VL Netzwerke, WS 2007/08 Edda Klipp 1 Max Planck Institute Molecular Genetics Humboldt University Berlin Theoretical Biophysics Networks in Metabolism.
School of Information University of Michigan SI 614 Random graphs & power law networks preferential attachment Lecture 7 Instructor: Lada Adamic.
UC Davis, May 18 th 2006 Introduction to Biological Networks Eivind Almaas Microbial Systems Division.
Detecting topological patterns in complex networks Sergei Maslov Brookhaven National Laboratory.
Extracting hidden information from knowledge networks Sergei Maslov Brookhaven National Laboratory, New York, USA.
A Real-life Application of Barabasi’s Scale-Free Power-Law Presentation for ENGS 112 Doug Madory Wed, 1 JUN 05 Fri, 27 MAY 05.
Statistical physics of complex networks
Biological Networks Feng Luo.
Regulatory networks 10/29/07. Definition of a module Module here has broader meanings than before. A functional module is a discrete entity whose function.
Protein Binding Networks: from Topology to Kinetics Sergei Maslov Brookhaven National Laboratory.
Sedgewick & Wayne (2004); Chazelle (2005) Sedgewick & Wayne (2004); Chazelle (2005)
Modularity in Biological networks.  Hypothesis: Biological function are carried by discrete functional modules.  Hartwell, L.-H., Hopfield, J. J., Leibler,
Global topological properties of biological networks.
Network Motifs: simple Building Blocks of Complex Networks R. Milo et. al. Science 298, 824 (2002) Y. Lahini.
BACKGROUND E. coli is a free living, gram negative bacterium which colonizes the lower gut of animals. Since it is a model organism, a lot of experimental.
Graph, Search Algorithms Ka-Lok Ng Department of Bioinformatics Asia University.
Biological networks Construction and Analysis. Recap Gene regulatory networks –Transcription Factors: special proteins that function as “keys” to the.
Biological networks: Types and origin
Introduction to molecular networks Sushmita Roy BMI/CS 576 Nov 6 th, 2014.
Network analysis and applications Sushmita Roy BMI/CS 576 Dec 2 nd, 2014.
Bryan Heck Tong Ihn Lee et al Transcriptional Regulatory Networks in Saccharomyces cerevisiae.
Systems Biology, April 25 th 2007Thomas Skøt Jensen Technical University of Denmark Networks and Network Topology Thomas Skøt Jensen Center for Biological.
Comparative Expression Moran Yassour +=. Goal Build a multi-species gene-coexpression network Find functions of unknown genes Discover how the genes.
Protein Classification A comparison of function inference techniques.
Systematic Analysis of Interactome: A New Trend in Bioinformatics KOCSEA Technical Symposium 2010 Young-Rae Cho, Ph.D. Assistant Professor Department of.
Lecture 3 Protein binding networks. C. elegans PPI from Li et al. (Vidal’s lab), Science (2004) Genome-wide protein binding networks Nodes - proteins.
Large-scale organization of metabolic networks Jeong et al. CS 466 Saurabh Sinha.
Optimization Based Modeling of Social Network Yong-Yeol Ahn, Hawoong Jeong.
(Social) Networks Analysis III Prof. Dr. Daning Hu Department of Informatics University of Zurich Oct 16th, 2012.
九大数理集中講義 Comparison, Analysis, and Control of Biological Networks (3) Domain-Based Mathematical Models for Protein Evolution Tatsuya Akutsu Bioinformatics.
Network Analysis and Application Yao Fu
Interactions and more interactions
Network Biology Presentation by: Ansuman sahoo 10th semester
School of Information University of Michigan SI 614 Network subgraphs (motifs) Biological networks Lecture 11 Instructor: Lada Adamic.
Clustering of protein networks: Graph theory and terminology Scale-free architecture Modularity Robustness Reading: Barabasi and Oltvai 2004, Milo et al.
Reconstructing gene networks Analysing the properties of gene networks Gene Networks Using gene expression data to reconstruct gene networks.
1 Having genome data allows collection of other ‘omic’ datasets Systems biology takes a different perspective on the entire dataset, often from a Network.
Systems Biology ___ Toward System-level Understanding of Biological Systems Hou-Haifeng.
Network Evolution Statistics of Networks Comparing Networks Networks in Cellular Biology A. Metabolic Pathways B. Regulatory Networks C. Signaling Pathways.
Genome Biology and Biotechnology The next frontier: Systems biology Prof. M. Zabeau Department of Plant Systems Biology Flanders Interuniversity Institute.
LECTURE 2 1.Complex Network Models 2.Properties of Protein-Protein Interaction Networks.
Introduction to biological molecular networks
Bioinformatics Center Institute for Chemical Research Kyoto University
Extracting information from complex networks From the metabolism to collaboration networks Roger Guimerà Department of Chemical and Biological Engineering.
Discovering functional interaction patterns in Protein-Protein Interactions Networks   Authors: Mehmet E Turnalp Tolga Can Presented By: Sandeep Kumar.
Constructing and Analyzing a Gene Regulatory Network Siobhan Brady UC Davis.
Robustness, clustering & evolutionary conservation Stefan Wuchty Center of Network Research Department of Physics University of Notre Dame title.
1 Lesson 12 Networks / Systems Biology. 2 Systems biology  Not only understanding components! 1.System structures: the network of gene interactions and.
Network Analysis Goal: to turn a list of genes/proteins/metabolites into a network to capture insights about the biological system 1.Types of high-throughput.
Hierarchical Organization in Complex Networks by Ravasz and Barabasi İlhan Kaya Boğaziçi University.
Network Motifs See some examples of motifs and their functionality Discuss a study that showed how a miRNA also can be integrated into motifs Today’s plan.
Response network emerging from simple perturbation Seung-Woo Son Complex System and Statistical Physics Lab., Dept. Physics, KAIST, Daejeon , Korea.
Algorithms and Computational Biology Lab, Department of Computer Science and & Information Engineering, National Taiwan University, Taiwan Network Biology.
Comparative Network Analysis BMI/CS 776 Spring 2013 Colin Dewey
Cmpe 588- Modeling of Internet Emergence of Scale-Free Network with Chaotic Units Pulin Gong, Cees van Leeuwen by Oya Ünlü Instructor: Haluk Bingöl.
Cohesive Subgraph Computation over Large Graphs
Structures of Networks
Lecture 1: Introduction CS 765: Complex Networks
Biological networks CS 5263 Bioinformatics.
Applications of graph theory in complex systems research
Biological Networks Analysis Degree Distribution and Network Motifs
Building and Analyzing Genome-Wide Gene Disruption Networks
CSCI2950-C Lecture 13 Network Motifs; Network Integration
Presentation transcript:

Detecting topological patterns in protein networks Sergei Maslov Brookhaven National Laboratory

What defines a complex system? Complex system has many interacting components (10 11 neurons, 10 4 types of proteins, 10 6 routers, 10 9 web pages) All components are different from each other Systems traditionally studied by physics also have many interacting components (10 23 electrons in a superconductor) But they are all the same!

Networks in complex systems The simplest question about a complex system: who interacts with whom? The answer can be visualized as a network Network is the backbone of a complex system

Why study the topology of complex networks? Lots of easily available data: that’s where the state of the art information is (at least in biology) Large networks may contain information about basic design principles and/or evolutionary history of the complex system This is similar to paleontology: learning about an animal from its backbone

Complex networks are the right description when things are interconnected

Internet From Y. Tu, “How robust is the Internet?”, Nature 406, 353 (2000)

Hierarchy of bio-networks Metabolic network: production of necessary chemical compounds Binding network: enzymes bind to their substrates in a metabolic network and to other proteins to form complexes Regulatory network: turns on and off particular groups of proteins in response to signals HIGHER LEVELS: cell-to cell communication (e.g. neurons in a brain), food webs, social networks, etc.

Protein binding network

Transcription regulatory networks Prokaryotic bacterium: E. coli Single-celled eukaryote: S. cerevisiae

General properties Densely interconnected Not very modular: functional modules talk to each other Have many random features Few proteins (hubs) interact with a lot of neighbors: but most – with just one

in- and out-degree of nodes Out-degree K out =5 In-degree K in =2

How many transcriptional regulators are out there?

Fraction of transcriptional regulators in bacteria from Stover et al., Nature (2000)

From E. van Nimwegen, Trends in Genetics, 2003

Complexity of regulation grows with complexity of organism N R =N =number of edges N R /N= / increases with N grows with N In bacteria N R ~N 2 (Stover, et al. 2000) In eucaryots N R ~N 1.3 (van Nimwengen, 2002) Networks in more complex organisms are more interconnected then in simpler ones Life is not just a bunch of independent modules!

Complexity is manifested in K in distribution E. coli vs. S. cerevisiae vs. H. sapiens

Beyond degree distributions: How is it all wired together?

Central vs peripheral network architecture central (hierarchical) peripheral (anti-hierarchical) From A. Trusina, P. Minnhagen, SM, K. Sneppen, Phys. Rev. Lett. (2004) random

Correlation profile Count N(k 0,k 1 ) – the number of links between nodes with connectivities k 0 and k 1 Compare it to N r (k 0,k 1 ) – the same property in a random network Qualitative features are very noise- tolerant with respect to both false positives and false negatives

Correlation profile of the protein interaction network R(k 0,k 1 )=N(k 0,k 1 )/N r (k 0,k 1 ) Z(k 0,k 1 ) =(N(k 0,k 1 )-N r (k 0,k 1 ))/  N r (k 0,k 1 ) Similar profile is seen in the yeast regulatory network

Some scale-free networks may appear similar In both networks the degree distribution is scale-free P(k)~ k -  with  ~

But: correlation profiles give them unique identities InternetProtein interactions

How to construct a proper random network?

Null-model of a network Distribution of degrees is non-random: the degree of every node has to be conserved in a random network Other topological properties may be also conserved as well: The extent of modularity (by function, sub-cellular localization, etc.) Small motifs (e.g feed-forward loops)

Randomization given complex network random

Edge swapping (rewiring) algorithm Randomly select and rewire two edges Repeat many times SM, K. Sneppen, Science (2002)

Metropolis rewiring algorithm Randomly select two edges Calculate change  E in “energy function” E=(N actual -N desired ) 2 /N desired Rewire with probability p=exp(-  E/T) “energy” E “energy” E+  E SM, K. Sneppen: preprint (2002), Physica A (2004)

How do protein networks evolve?

Gene duplication Pair of duplicated proteins Shared interactions Pair of duplicated proteins Shared interactions Right after duplicationAfter some time

Yeast regulatory network SM, K. Sneppen, K. Eriksen, K-K. Yan 2003

100 million years ago

Network properties of self-binding proteins AKA homodimers

There are just TOO MANY homodimers Null-model P self ~ /N N dimer =N  P self = Not surprising as homodimers have many functional roles

Network properties around homodimers

Fly: two-hybrid dataHuman: database data Likelihood to self-interact vs. K P self ~0.05, P others ~0.0002P self ~0.003, P others ~0.0002

What we think it means? In random networks p dimer (K)~K 2 not ~K like our empirical observation K is proportional to the “stickiness” of the protein which in its turn scales with the area of hydrophobic residues on the surface # copies/cell its’ popularity (in datasets taken from databases) etc. Real interacting pair consists of an “active” and “passive” protein and binding probability scales only with the “stickiness” of the active protein “Stickiness” fully accounts for higher than average connectivity of homodimers

Summary Living cells contain many complex protein networks Networks in more complex organisms are more interconnected Most have hubs – highly connected proteins Hubs often avoid each other (networks are anti- hierarchical) Networks evolve by gene duplications There are many self-interacting proteins. Probability to self-interact linearly scales with the degree K.

Collaborators: Kim Sneppen – U. of Copenhagen Kasper Eriksen – U. of Lund Koon-Kiu Yan – Stony Brook Ilya Mazo, Jaroslav Ispolatov, Anton Yuryev – Ariadne Genomics

THE END

Protective effect of duplicates Gu, et al 2003 Maslov, Sneppen, Eriksen, Yan 2003 YeastWorm Maslov, Sneppen, Eriksen, Yan 2003

Protein interaction networks SM, K. Sneppen, K. Eriksen, K-K. Yan 2003

What shapes the topology of protein networks? Duplication-divergence models CAN account for their basic topological features BUT: functional organization dominates: Most pairs with many shared interactions are NOT homologs Hubs are not caused by mult. duplications Functional organization is more important than duplication-divergence

Genome-wide protein networks Nodes - proteins Edges – interactions between proteins Bindings (physical interactions) Regulations (transcriptional, protein modifications, etc.) Etc, etc, etc.

Correlation profile of the yeast regulatory network R(k out, k in )=N(k out, k in )/N r (k out,k in )Z(k out,k in )=(N(k out,k in )-N r (k out,k in ))/  N r (k out,k in )

YPD full regulatory network O <-2 standard deviations O > 2 standard deviations

Regulators: Workhorse:

Amplification ratios A (dir) : E. Coli, Yeast A (undir) : E. Coli, 13.4 – Yeast A (PPI) : ? - E. Coli, Yeast

Two-hybrid experiment To test if A interacts with B create two hybrids A* (with Gal4p DNA-binding domain) and B* (with Gal4p activation domain) High-throughput: all pairs among 6300 yeast proteins are tested: Uetz, et al. Nature (2000), Ito, et al., PNAS (2001). Gal4-activated reporter gene, say GAL2::ADE2Gal4-binding domain A*B*

DIP core correlation profile

Protein binding networks Two-hybrid nuclearDatabase (DIP) core set S. cerevisiae

SM, K. Sneppen, K. Eriksen, K-K. Yan 2003

Pathway vs. network paradigm Pathway Network

How to deal with feedback loops?

Propagation of signals often involves feedback loops Feedback could be used as a checkpoint that the signal reached its destination: The main direction of information flow A  B  C “Weak” feedback signal C  A Some closed loops like A  B  C  A are not used to send feedback but to regulate periodic processes: Cell cycle Circadian rhythms

Finding the feedback links Feedback links go against the main direction of the information flow Feedback edge

Algorithm Goal: remove the smallest number of edges to make the network feedback-free Solution: Randomly assign hierarchy weights H i to all nodes. Links i  j such that H i <H j are candidate feedback links Try to switch H m and H n on randomly selected nodes m,n. Accept if this switch reduces the number of feedback links. If increases – accept with a small probability (a la Metropolis) The final network obtained by removing feedback links is feedback-free

Bow-tie diagram SCC: 93 nodes In: 96 nodes wormholes dangling ends and isolated components Out: 553 nodes Human protein modification network: 1100 nodes

Feedback loops in human protein-modification network Removal of only 48 out of 2200 edges makes human protein modification network feedback- free Most links are “reproducible”: rerunning experiment gives almost the same answer Run algorithm 10 times. Each time record all upstream-downstream relations. Look for reproducible relationships: the correspond to main direction of the information flow

MAPK signalingInhibition of apoptosis

Gene disruptions in yeast

Repercussions of gene deletions Number of proteins affected by a single gene deletion Total number of proteins in yeast genome

Data from Ariadne Genomics Homo sapiens Total: 120,000 interacting protein pairs extracted from PubMed as of 8/2004