Lecture7 Topic1: Graph spectral analysis/Graph spectral clustering and its application to metabolic networks Topic 2: Different centrality measures of.

Presentation transcript:

Lecture7 Topic1: Graph spectral analysis/Graph spectral clustering and its application to metabolic networks Topic 2: Different centrality measures of nodes

Graph spectral analysis/ Graph spectral clustering

PROTEIN STRUCTURE: INSIGHTS FROM GRAPH THEORY by SARASWATHI VISHVESHWARA, K. V. BRINDA and N. KANNAN, Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India

Laplacian matrix: $L = D - A$, where $D$ is the degree matrix and $A$ is the adjacency matrix.

Eigenvalues and eigenvectors
The eigenvalues of an N×N matrix A are the roots of the equation $|A - \lambda I| = 0$, where I is the identity matrix. Let λ be an eigenvalue of A and x be a nonzero N×1 vector such that
$$A x = \lambda x \qquad (1)$$
Then x is an eigenvector of A corresponding to λ.

Node 1 has 3 edges, nodes 2, 3 and 4 have 2 edges each, and node 5 has only one edge. The magnitudes of the components of the eigenvector corresponding to the largest eigenvalue of the adjacency matrix reflect this observation.

Node 1 has 3 edges, nodes 2, 3 and 4 have 2 edges each, and node 5 has only one edge. The magnitudes of the components of the eigenvector corresponding to the largest eigenvalue of the Laplacian matrix also reflect this observation.
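The observation above can be checked numerically. The sketch below is only illustrative: the figure is not part of this transcript, so the edge list is an assumption chosen to match the stated degrees (node 1 has degree 3, nodes 2 to 4 have degree 2, node 5 has degree 1), and the helper name `top_eigenvector` is hypothetical.

```python
import numpy as np

# Assumed edge list (the slide's figure is not available); it reproduces
# the stated degrees: node 1 -> 3, nodes 2, 3, 4 -> 2, node 5 -> 1.
edges = [(1, 2), (1, 3), (1, 4), (2, 3), (4, 5)]
n = 5

A = np.zeros((n, n))
for u, v in edges:
    A[u - 1, v - 1] = A[v - 1, u - 1] = 1.0

degrees = A.sum(axis=1)
D = np.diag(degrees)   # degree matrix
L = D - A              # Laplacian matrix

def top_eigenvector(M):
    """Largest eigenvalue of a symmetric matrix and the magnitudes
    of the components of its eigenvector."""
    vals, vecs = np.linalg.eigh(M)   # eigenvalues come back in ascending order
    return vals[-1], np.abs(vecs[:, -1])

lev_A, x_A = top_eigenvector(A)
lev_L, x_L = top_eigenvector(L)

print("degrees:             ", degrees)
print("adjacency lev, |x|:  ", round(lev_A, 3), np.round(x_A, 3))
print("Laplacian lev, |x|:  ", round(lev_L, 3), np.round(x_L, 3))
```

For this assumed graph, the largest component (in magnitude) of both eigenvectors falls on node 1, the node of highest degree, and the smallest falls on node 5.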

• The largest eigenvalue (lev) depends upon the highest degree in the graph.
• For any k-regular graph G (a graph in which every vertex has degree k), the eigenvalue with the largest absolute value is k.
• A corollary to this theorem is that the lev of a clique of n vertices is n − 1.
• In a general connected graph, the lev is always less than or equal to the largest degree in the graph.
• In a graph with n vertices, the absolute value of the lev decreases as the degree of the vertices decreases.
• The lev of a clique with 11 vertices is 10, and that of a linear chain with 11 vertices is $2\cos(\pi/12) \approx 1.93$.
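These statements about the lev of the adjacency matrix are easy to verify for small graphs. The following is a minimal sketch; the helper names `lev`, `clique`, `cycle` and `chain` are hypothetical and introduced only for illustration.

```python
import numpy as np

def lev(A):
    """Largest eigenvalue of a symmetric adjacency matrix."""
    return np.linalg.eigvalsh(A)[-1]   # eigvalsh returns ascending eigenvalues

def clique(n):
    """Complete graph on n vertices (every vertex has degree n - 1)."""
    return np.ones((n, n)) - np.eye(n)

def cycle(n):
    """Cycle on n vertices (a 2-regular graph)."""
    A = np.zeros((n, n))
    for i in range(n):
        A[i, (i + 1) % n] = A[(i + 1) % n, i] = 1.0
    return A

def chain(n):
    """Linear chain (path) on n vertices; the highest degree is 2."""
    A = np.zeros((n, n))
    for i in range(n - 1):
        A[i, i + 1] = A[i + 1, i] = 1.0
    return A

print("clique, 11 vertices:", lev(clique(11)))
print("cycle, 11 vertices: ", lev(cycle(11)))
print("chain, 11 vertices: ", lev(chain(11)))
```

The chain stays below its highest degree of 2, while the regular graphs (clique and cycle) attain their degree exactly.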

In graphs 5(a)-5(e), the highest degree is 6. In graphs 5(f)-5(i), the highest degree is 5, 4, 3 and 2 respectively.

The lev is generally higher if the graph contains vertices of high degree, and it decreases gradually from the graph with highest degree 6 to the one with highest degree 2. Graphs 5(a)-5(e) share one vertex of degree 6 (the highest degree), while the degrees of the other vertices differ (less than 6 in all cases) and the lev differs accordingly; i.e., the lev also depends on the degrees of the vertices adjoining the highest-degree vertex.

This paper combines graph 4(a) and graph 4(b) and constructs a Laplacian matrix with edge weights $1/d_{ij}$, where $d_{ij}$ is the distance between vertices i and j. The distances between the vertices of graph 4(a) and those of graph 4(b) are taken to be very large (say 100), so a matrix element corresponding to one vertex from graph 4(a) and one from graph 4(b) has a very small value (1/100 = 0.01). The resulting Laplacian matrix of 8 vertices is diagonalized, and its eigenvalues and the corresponding vector components are given in Table 3.

The components of the eigenvector corresponding to the second smallest eigenvalue contain the desired information about clustering: the residues (nodes) that form a cluster have identical values. In Fig. 4, nodes 1-5 form one cluster (cluster 1) and nodes 6-8 form another (cluster 2).
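The clustering procedure can be sketched as follows. The exact edges of graphs 4(a) and 4(b) are not given in this transcript, so the sketch assumes, purely for illustration, that all vertices within a cluster are at distance 1 from each other and that vertices in different clusters are at distance 100.

```python
import numpy as np

# Assumption: nodes 1-5 (indices 0-4) and nodes 6-8 (indices 5-7) form two
# groups; within-group distance 1, between-group distance 100.
cluster1 = [0, 1, 2, 3, 4]
cluster2 = [5, 6, 7]
n = 8

dist = np.full((n, n), 100.0)
for group in (cluster1, cluster2):
    for i in group:
        for j in group:
            dist[i, j] = 1.0

W = 1.0 / dist              # edge weights 1/d_ij
np.fill_diagonal(W, 0.0)    # no self-loops

D = np.diag(W.sum(axis=1))  # weighted degree matrix
L = D - W                   # weighted Laplacian

vals, vecs = np.linalg.eigh(L)      # ascending eigenvalues
second_vec = vecs[:, 1]             # eigenvector of the second smallest eigenvalue
labels = (second_vec > 0).astype(int)

print("second smallest eigenvalue:", round(vals[1], 4))
print("vector components:         ", np.round(second_vec, 3))
print("cluster labels:            ", labels)
```

Under these assumptions the components are identical within each group and have opposite signs across the two groups, so thresholding at zero recovers the two clusters. Which group receives label 0 or 1 depends on the arbitrary sign of the eigenvector.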

Metabolome Based Reaction Graphs of M. tuberculosis and M. leprae: A Comparative Network Analysis
by Ketki D. Verkhedkar (1), Karthik Raman (2), Nagasuma R. Chandra (2), Saraswathi Vishveshwara (1)*
(1) Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India; (2) Bioinformatics Centre, Supercomputer Education and Research Centre, Indian Institute of Science, Bangalore, India
PLoS ONE | September 2007 | Issue 9 | e881

Construction of network
[Figure: example reactions R1, R2, R3 and R4 and the corresponding stoichiometric matrix.]
Following this method, the networks of metabolic reactions corresponding to 3 organisms were constructed.

Analysis of network parameters

Giant component of the reaction network of E. coli

Giant components of the reaction networks of M. tuberculosis and M. leprae

Analyses of sub-clusters in the giant component
Graph spectral analysis was performed to detect sub-clusters of reactions in the giant component. To obtain the eigenvalue spectra of the graph, the adjacency matrix (A) of the graph is converted to a Laplacian matrix (L) by the equation $L = D - A$, where D, the degree matrix of the graph, is a diagonal matrix in which the i-th element on the diagonal is equal to the number of connections that the i-th node makes in the graph. It is observed that reactions belonging to fatty acid biosynthesis and the FAS-II cycle of the mycolic acid pathway in M. tuberculosis form distinct, tightly connected sub-clusters.

Identification of hubs in the reaction networks
In biological networks, the hubs are thought to be functionally important and phylogenetically oldest. The largest component of the eigenvector associated with the highest eigenvalue of the Laplacian matrix of the graph corresponds to a node with high degree as well as low eccentricity. Two parameters, degree and eccentricity, are thus involved in the identification of graph spectral (GS) hubs.

Identification of hubs in the reaction networks
Alternatively, hubs can be ranked based on their connectivity alone (degree hubs). It was observed that the top 50 degree hubs in the reaction networks of the three organisms comprised reactions involving the metabolite L-glutamate as well as reactions involving pyruvate. However, the top 50 GS hubs of M. tuberculosis and M. leprae exclusively comprised reactions involving L-glutamate, while the top GS hubs in E. coli consisted only of reactions involving pyruvate. The difference between the degree hubs and the GS hubs suggests that the most highly connected reactions are not necessarily the most central reactions in the metabolome of the organism.

Centrality measures of nodes

Centrality measures
Within graph theory and network analysis, there are various measures of the centrality of a vertex that determine its relative importance within the graph. We will discuss the following centrality measures:
• Degree centrality
• Betweenness centrality
• Closeness centrality
• Eigenvector centrality
• Subgraph centrality

Degree centrality
Degree centrality is defined as the number of links incident upon a node, i.e. the degree of the node. It is often interpreted in terms of the immediate risk of the node for catching whatever is flowing through the network (such as a virus or some information).
[Figure: the blue nodes have higher degree centrality.]

Betweenness centrality
The vertex betweenness centrality BC(v) of a vertex v is defined as follows:
$$BC(v) = \sum_{u \neq v \neq w} \frac{\sigma_{uw}(v)}{\sigma_{uw}}$$
Here $\sigma_{uw}$ is the total number of shortest paths between nodes u and w, $\sigma_{uw}(v)$ is the number of those shortest paths that pass through node v, and the sum runs over all pairs of nodes u and w other than v. Vertices that occur on many shortest paths between other vertices have higher betweenness than those that do not.

Betweenness centrality
Calculation for node c (v = c) in the example graph with nodes a, b, c, d, e, f:

(u,w)   σ_uw   σ_uw(v)   σ_uw(v)/σ_uw
(a,b)   1      0         0
(a,d)   1      1         1
(a,e)   1      1         1
(a,f)   1      1         1
(b,d)   1      1         1
(b,e)   1      1         1
(b,f)   1      1         1
(d,e)   1      0         0
(d,f)   1      0         0
(e,f)   1      0         0

Betweenness centrality of node c = 6
Betweenness centrality of node a = 0
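The calculation for node c can be reproduced with a short script. The figure is not included in this transcript, so the edge list below is an assumption: it is one graph on the nodes a to f that is consistent with the path counts listed above (a, b, c form a triangle, d, e, f form a triangle, and c is also joined to d, e and f). The function names are hypothetical.

```python
from collections import deque
from itertools import combinations

# Assumed edges (the slide's figure is not available); consistent with the
# shortest-path counts listed for node c.
edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("c", "d"), ("c", "e"), ("c", "f"),
         ("d", "e"), ("d", "f"), ("e", "f")]

adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def bfs_counts(source):
    """Breadth-first search returning, for every node, its distance from
    `source` and the number of shortest paths from `source` to it."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:            # first time w is reached
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:   # u lies on a shortest path to w
                sigma[w] += sigma[u]
    return dist, sigma

info = {s: bfs_counts(s) for s in adj}

def betweenness(v):
    """Sum of sigma_uw(v) / sigma_uw over unordered pairs (u, w) not containing v."""
    total = 0.0
    for u, w in combinations(sorted(adj), 2):
        if v in (u, w):
            continue
        dist_u, sigma_u = info[u]
        dist_v, sigma_v = info[v]
        # v is on a shortest u-w path exactly when the distances add up
        if dist_u[v] + dist_v[w] == dist_u[w]:
            total += sigma_u[v] * sigma_v[w] / sigma_u[w]
    return total

for node in sorted(adj):
    print(node, betweenness(node))
```

The pairwise approach is chosen for readability; it mirrors the table row by row rather than aiming for efficiency on large graphs.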

Betweenness centrality
Nodes of high betweenness centrality are important for transport. If they are blocked, transport becomes less efficient; on the other hand, if their capacity is improved, transport becomes more efficient. Edge betweenness is calculated using a similar concept.
[Figure: hue (from red = 0 to blue = max) shows the node betweenness. Source: ness_centrality#betweenness]

Closeness centrality
The farness of a vertex is the sum of the shortest-path distances from the vertex to every other vertex in the graph. The reciprocal of farness is the closeness centrality (CC):
$$CC(v) = \frac{1}{\sum_{t \neq v} d(v,t)}$$
Here, d(v,t) is the shortest distance between vertex v and vertex t. Closeness centrality can be viewed as the efficiency of a vertex in spreading information to all other vertices.
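As a small illustration, farness and closeness can be computed with a breadth-first search on an unweighted graph. The six-node edge list below is a hypothetical example, not taken from the slides, and the function names are introduced only for this sketch.

```python
from collections import deque

# Hypothetical example graph (an assumption, not from the slides).
edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("c", "d"), ("c", "e"), ("c", "f"),
         ("d", "e"), ("d", "f"), ("e", "f")]

adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

def distances_from(source):
    """Shortest-path distances (in edges) from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def closeness(v):
    """Reciprocal of farness; assumes the graph is connected."""
    farness = sum(d for node, d in distances_from(v).items() if node != v)
    return 1.0 / farness

for node in sorted(adj):
    print(node, closeness(node))
```

Node c is adjacent to all five other nodes, so its farness is 5 and its closeness is 0.2, the highest in this graph.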

Eigenvector centrality
Let A be the N×N adjacency matrix of a graph, λ the largest eigenvalue of A (a root of $|A - \lambda I| = 0$, where I is the identity matrix), and x the corresponding N×1 eigenvector. Then
$$A x = \lambda x \qquad (1)$$
The i-th component of the eigenvector x gives the eigenvector centrality (EC) score of the i-th node in the network. From (1),
$$x_i = \frac{1}{\lambda} \sum_{j=1}^{N} A_{ij} x_j$$
Therefore, for any node, the eigenvector centrality score is proportional to the sum of the scores of all nodes connected to it. Consequently, a node has a high EC value either if it is connected to many other nodes or if it is connected to nodes that themselves have high EC.
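One common way to obtain these scores is power iteration, which repeatedly replaces each node's score by the sum of its neighbours' scores and renormalizes. The sketch below is not necessarily how the lecture computes EC; the five-node graph is a hypothetical example, and the fixed iteration count is an assumption that suffices for this small connected, non-bipartite graph.

```python
import numpy as np

# Hypothetical example graph (0-based node labels); an assumption for illustration.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (3, 4)]
n = 5

A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = A[v, u] = 1.0

# Power iteration: x <- A x, then normalize to unit length.
x = np.ones(n)
for _ in range(500):
    x_new = A @ x
    x = x_new / np.linalg.norm(x_new)

lam = x @ A @ x   # Rayleigh quotient; approximates the largest eigenvalue

print("largest eigenvalue:    ", round(lam, 4))
print("eigenvector centrality:", np.round(x, 4))
```

Node 0, which has the most neighbours, receives the highest score here, and the converged vector satisfies equation (1) up to numerical tolerance.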

Subgraph centrality
The number of closed walks of length k starting and ending on vertex i in the network is given by the local spectral moments $\mu_k(i)$, which are simply defined as the i-th diagonal entry of the k-th power of the adjacency matrix A:
$$\mu_k(i) = (A^k)_{ii}$$
Closed walks can be trivial or nontrivial and are directly related to the subgraphs of the network.
Source: Subgraph Centrality in Complex Networks, Physical Review E 71, (2005)

Subgraph centrality
Adjacency matrix M: $M_{uv} = 1$ if there is an edge between nodes u and v, and 0 otherwise.

Subgraph centrality
Matrix $M^2$: for $u \neq v$, $(M^2)_{uv}$ represents the number of common neighbors of nodes u and v. The diagonal entries $(M^2)_{uu}$ are the local spectral moments $\mu_2(u)$.
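A quick numerical check of this interpretation, using a hypothetical four-node graph (an assumption, not the matrix shown on the slide):

```python
import numpy as np

# Hypothetical graph with edges 0-1, 0-2, 1-2, 2-3.
M = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])

M2 = M @ M   # entry (u, v) counts walks of length 2 from u to v

print(M2)
print("common neighbours of nodes 0 and 3:", M2[0, 3])
print("closed walks of length 2 per node: ", np.diag(M2))
```

Off the diagonal, each entry counts the nodes adjacent to both u and v; on the diagonal, a closed walk of length 2 goes to a neighbour and back, so the entry equals the degree of the node.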

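Subgraph centrality
The subgraph centrality of node i is given by
$$SC(i) = \sum_{k=0}^{\infty} \frac{\mu_k(i)}{k!}$$
Let λ be the main eigenvalue of the adjacency matrix A. It can be shown that
$$\mu_k(i) \le \lambda^k$$
Thus, the subgraph centrality of any vertex i is bounded above by
$$SC(i) \le \sum_{k=0}^{\infty} \frac{\lambda^k}{k!} = e^{\lambda}$$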

Table 2. Summary of results of eight real-world complex networks.