Network analysis and applications Sushmita Roy BMI/CS 576 Dec 2 nd, 2014.

Slides:



Advertisements
Similar presentations
Complex Networks Advanced Computer Networks: Part1.
Advertisements

Network analysis Sushmita Roy BMI/CS 576
Scale Free Networks.
Social network partition Presenter: Xiaofei Cao Partick Berg.
Network biology Wang Jie Shanghai Institutes of Biological Sciences.
The Architecture of Complexity: Structure and Modularity in Cellular Networks Albert-László Barabási University of Notre Dame title.
Emergence of Scaling in Random Networks Albert-Laszlo Barabsi & Reka Albert.
Analysis and Modeling of Social Networks Foudalis Ilias.
The multi-layered organization of information in living systems
VL Netzwerke, WS 2007/08 Edda Klipp 1 Max Planck Institute Molecular Genetics Humboldt University Berlin Theoretical Biophysics Networks in Metabolism.
Synopsis of “Emergence of Scaling in Random Networks”* *Albert-Laszlo Barabasi and Reka Albert, Science, Vol 286, 15 October 1999 Presentation for ENGS.
Advanced Topics in Data Mining Special focus: Social Networks.
4. PREFERENTIAL ATTACHMENT The rich gets richer. Empirical evidences Many large networks are scale free The degree distribution has a power-law behavior.
1 Evolution of Networks Notes from Lectures of J.Mendes CNR, Pisa, Italy, December 2007 Eva Jaho Advanced Networking Research Group National and Kapodistrian.
Emergence of Scaling in Random Networks Barabasi & Albert Science, 1999 Routing map of the internet
Biological Networks Feng Luo.
Regulatory networks 10/29/07. Definition of a module Module here has broader meanings than before. A functional module is a discrete entity whose function.
Peer-to-Peer and Grid Computing Exercise Session 3 (TUD Student Use Only) ‏
Sedgewick & Wayne (2004); Chazelle (2005) Sedgewick & Wayne (2004); Chazelle (2005)
Modularity in Biological networks.  Hypothesis: Biological function are carried by discrete functional modules.  Hartwell, L.-H., Hopfield, J. J., Leibler,
Global topological properties of biological networks.
Advanced Topics in Data Mining Special focus: Social Networks.
Graph, Search Algorithms Ka-Lok Ng Department of Bioinformatics Asia University.
Graphs and Topology Yao Zhao. Background of Graph A graph is a pair G =(V,E) –Undirected graph and directed graph –Weighted graph and unweighted graph.
Introduction to molecular networks Sushmita Roy BMI/CS 576 Nov 6 th, 2014.
Systems Biology, April 25 th 2007Thomas Skøt Jensen Technical University of Denmark Networks and Network Topology Thomas Skøt Jensen Center for Biological.
Systematic Analysis of Interactome: A New Trend in Bioinformatics KOCSEA Technical Symposium 2010 Young-Rae Cho, Ph.D. Assistant Professor Department of.
Large-scale organization of metabolic networks Jeong et al. CS 466 Saurabh Sinha.
Optimization Based Modeling of Social Network Yong-Yeol Ahn, Hawoong Jeong.
Information Networks Power Laws and Network Models Lecture 3.
(Social) Networks Analysis III Prof. Dr. Daning Hu Department of Informatics University of Zurich Oct 16th, 2012.
Models and Algorithms for Complex Networks Networks and Measurements Lecture 3.
Biological Networks Lectures 6-7 : February 02, 2010 Graph Algorithms Review Global Network Properties Local Network Properties 1.
Author: M.E.J. Newman Presenter: Guoliang Liu Date:5/4/2012.
Clustering of protein networks: Graph theory and terminology Scale-free architecture Modularity Robustness Reading: Barabasi and Oltvai 2004, Milo et al.
Part 1: Biological Networks 1.Protein-protein interaction networks 2.Regulatory networks 3.Expression networks 4.Metabolic networks 5.… more biological.
Self-Similarity of Complex Networks Maksim Kitsak Advisor: H. Eugene Stanley Collaborators: Shlomo Havlin Gerald Paul Zhenhua Wu Yiping Chen Guanliang.
Social Network Analysis Prof. Dr. Daning Hu Department of Informatics University of Zurich Mar 5th, 2013.
Soon-Hyung Yook, Sungmin Lee, Yup Kim Kyung Hee University NSPCS 08 Unified centrality measure of complex networks: a dynamical approach to a topological.
Biological Networks & Network Evolution Eric Xing
Class 9: Barabasi-Albert Model-Part I
Lecture 10: Network models CS 765: Complex Networks Slides are modified from Networks: Theory and Application by Lada Adamic.
Network Evolution Statistics of Networks Comparing Networks Networks in Cellular Biology A. Metabolic Pathways B. Regulatory Networks C. Signaling Pathways.
LECTURE 2 1.Complex Network Models 2.Properties of Protein-Protein Interaction Networks.
Most of contents are provided by the website Network Models TJTSD66: Advanced Topics in Social Media (Social.
Introduction to biological molecular networks
Bioinformatics Center Institute for Chemical Research Kyoto University
1 CIS 4930/6930 – Recent Advances in Bioinformatics Spring 2014 Network models Tamer Kahveci.
Community structure in graphs Santo Fortunato. More links “inside” than “outside” Graphs are “sparse” “Communities”
1 Lesson 12 Networks / Systems Biology. 2 Systems biology  Not only understanding components! 1.System structures: the network of gene interactions and.
Network applications Sushmita Roy BMI/CS 576 Dec 9 th, 2014.
Hierarchical Organization in Complex Networks by Ravasz and Barabasi İlhan Kaya Boğaziçi University.
Response network emerging from simple perturbation Seung-Woo Son Complex System and Statistical Physics Lab., Dept. Physics, KAIST, Daejeon , Korea.
Algorithms and Computational Biology Lab, Department of Computer Science and & Information Engineering, National Taiwan University, Taiwan Network Biology.
Comparative Network Analysis BMI/CS 776 Spring 2013 Colin Dewey
Cmpe 588- Modeling of Internet Emergence of Scale-Free Network with Chaotic Units Pulin Gong, Cees van Leeuwen by Oya Ünlü Instructor: Haluk Bingöl.
Graph clustering to detect network modules
Structures of Networks
Hierarchical Agglomerative Clustering on graphs
Hiroki Sayama NECSI Summer School 2008 Week 2: Complex Systems Modeling and Networks Network Models Hiroki Sayama
Biological networks CS 5263 Bioinformatics.
Community detection in graphs
Biological Networks Analysis Degree Distribution and Network Motifs
Social Network Analysis
Finding modules on graphs
Department of Computer Science University of York
Modelling Structure and Function in Complex Networks
Network Models Michael Goodrich Some slides adapted from:
Advanced Topics in Data Mining Special focus: Social Networks
Advanced Topics in Data Mining Special focus: Social Networks
Presentation transcript:

Network analysis and applications Sushmita Roy BMI/CS Dec 2 nd, 2014

Computational problems in networks Network reconstruction – Infer the structure and parameters of networks – We examined this problem in the context of “expression- based network inference” Network evaluation/analysis – Properties of networks Network applications – Prioritization of genes – Interpretation of gene sets – Identify new biological pathways Densely connected subnetworks

Using networks as tools for discovery So far we have considered problems in network inference Topological properties of networks can be informative Biological networks can also be used for numerous applications – Prioritization of genes – Identify new biological pathways Densely connected subnetworks

Network properties Degree distribution Average shortest path length Clustering coefficient Modularity Network motifs

Why should we care about network measures? From Barabasi and Oltvai 2004: “Probably the most important discovery of network theory was the realization that despite the remarkable diversity of networks in nature, their architecture is governed by a few simple principles that are common to most networks of major scientific and technological interest”

Node degree Undirected network – Degree, k: Number of neighbors of a node Directed network – In degree, k in : Number of incoming edges – Out degree, k out : Number of outgoing edges Directed Edge A B C D E F In degree of F is 4 Out degree of E is 0

Average degree Consider an undirected network with N nodes Let k i denote the degree of node i Average degree is

Degree distribution P(k) the probability that a node has k edges Different networks can have different degree distributions A fundamental property that can be used to characterize a network

Different degree distributions Poisson distribution – The mean is a good representation of k i of all nodes – Networks that have a Poisson degree distribution are called Erdos Renyi or random networks Power law distribution – Also called scale free – There is no “typical” node that captures the degree of nodes.

Poisson distribution A discrete distribution The Poisson is parameterized by which can be easily estimated by maximum likelihood k P(X=k)

Power law distribution Used to capture the degree distribution of most real networks Typical value of is between 2 and 3. MLE exists for but is more complicated – See Power-Law Distributions in Empirical Data. Clauset, Shalizi and Newman, 2009 for details P(k)

Erdos Renyi random graphs Dates back to 1960 due to two mathematicians Paul Erdos and Alfred Renyi. Provides a probabilistic model to generate a graph Starts with N nodes and connects two nodes with probability p Node degrees follow a Poisson distribution Tail falls off exponentially, suggesting that nodes with degrees different from the mean are very rare

Scale free networks Degree distribution is captured by a power law distribution There is no “typical” node that describes the degree of all other nodes Such networks are ubiquitous in nature

Poisson versus Scale free Barabasi & Oltvai 2004, Nature Genetics Review

Yeast protein interaction network is believed to be scale free “Whereas most proteins participate in only a few interactions, a few participate in dozens” Such high degree nodes are called hubs Barabasi & Oltvai 2004, Nature Genetics Review

Degree of a node is correlated to functional importance of a node Red nodes on deletion cause the organism to die Red nodes also among the most degree central Yeast protein-protein interaction network

Origin of scale free networks Scale free networks are ubiquitous is nature How do such networks form? Such networks are the result of two processes – a growth process where new nodes join the network over an extended period of time Think about how the internet has grown – Preferential attachment: new nodes tend to connect to nodes with many neighbors Rich get richer.

Growth and preferential attachment in scale free networks A new node (red) is more likely to connect to node 1 than 2

Path lengths The shortest path length between two nodes A and B: – The smallest number of edges that need to be traversed to get from A to B Mean path length is the average of all shortest path lengths Diameter of a graph is the longest of all shortest paths in the network

Scale-free networks tend to be ultra-small Two nodes on the network are connected by a small number of edges Average path length is log(log(N)), where N is the number of nodes in the network In a random network (Erdos Renyi network) the average path length is log(N)

Modularity in networks Modularity “refers to a group of physically or functionally linked nodes that work together to achieve a distinct function” -- Barabasi & Oltvai Two questions – Given a network is it modular? Modularity can be assessed using “Clustering coefficient” Modularity can also be assessed using the difference between the number of edges within and between a given grouping of nodes. – Given a network what are the modules in the network? Graph clustering

A modular network Module 1 Module 2 Module 3

Clustering coefficient Measure of transitivity in the network that asks – If A is connected to B, and B is connected to C, how often is A connected to C? Clustering coefficient C i for each node i is k i Degree of node i n i is the number of edges among neighbors of i Average clustering coefficient gives a measure of “modularity” of the network A B C ?

Clustering coefficient example A C B G D

Finding modules in a graph Given a graph find the modules – Modules are represented by densely connected subgraphs The graph can be partitioned into modules using “Graph clustering” – Hierarchical or flat clustering using a notion of similarity between nodes – Markov clustering algorithm – Spectral clustering – Girvan-Newman algorithm

Girvan-Newman algorithm General idea: “If two communities are joined by only a few inter-community edges, then all paths through the network from vertices in one community to vertices in the other must pass along one of those few edges.” Betweenness of an edge e is defined as the number of shortest paths that include e Edges that lie between communities tend to have high betweenness M. E. J. Newman and M. Girvan. Finding and evaluating community structure

Girvan-Newman algorithm Initialize – Compute betweenness for all edges Repeat until convergence criteria 1.Remove the edge with the highest betweenness 2.Recompute betweenness of affected edges Convergence criteria can be – No more edges – Desired modularity.

Evaluating the “modularity” of the clusters Given K groups of nodes, we can compute modularity (Q) also as – difference between within group (community) connections and expected connections within a group K : number of groups e ij : Fraction of total edges that link nodes in group i to group j

Zachary’s karate club study Each node is an individual and edges represent social interactions among individuals. The shape and colors represent different groups. Node grouping based on betweenness

Network motifs Network motifs are defined as small recurring subnetworks that occur much more than a randomized network A subgraph is called a network motif of a network if its occurrence in randomized networks is significantly less than the original network. Some motifs are associated to explain specific network dynamics Milo Science 2002

Network motifs of size three in a directed network

Finding network motifs Enumerating motifs – Subgraph enumeration Calculating the number of occurrences in randomized networks Milo 2002

Network motifs found in many complex networks The occurrence of the feedforward loop in both networks suggests a fundamental similarity in the design on these networks

Structural common motifs seen in the yeast regulatory network Lee et.al. 2002, Mangan & Alon, 2003 Auto-regulationMulti-componentFeed-forward loop Single InputMulti Input Regulatory Chain Feed-forward loops involved in speeding up in response of target gene

Summary of network analysis Given a network, its topology can be characterized using different measures – Degree distribution – Average path length – Clustering coefficient Degree distribution can be – Poisson – Power law Such networks are called scale free Network modularity – Clustering coefficient – Edge betweennness Network motifs – Overrepresentation of subgraphs of specific types