Hyun Seok Kim, Ph.D. Assistant Professor, Severance Biomedical Research Institute, Yonsei University College of Medicine Lecture 13. Network Analysis MES7594-01.

1 Hyun Seok Kim, Ph.D. Assistant Professor, Severance Biomedical Research Institute, Yonsei University College of Medicine Lecture 13. Network Analysis MES7594-01 Genome Informatics I (2015 Spring)

2 What is a network? A network is a set of nodes and edges.

3 What is a network? Node (= vertex): describes an entity or unit.

4 What is a network? Edge: describes a relationship between nodes (directed or undirected).

5 What is a network? Hub: a node connected to many other nodes.
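
The node, edge, and hub definitions above can be sketched in a few lines of Python. This is a minimal illustration, not part of the lecture: the toy edge list and the helper names `build_adjacency` and `degrees` are hypothetical, and a hub is simply taken to be the node with the highest degree.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)


def degrees(adjacency):
    """Degree = number of neighbors of each node."""
    return {node: len(neighbors) for node, neighbors in adjacency.items()}


# Hypothetical toy network: "H" is connected to every other node.
edges = [("H", "A"), ("H", "B"), ("H", "C"), ("H", "D"), ("A", "B")]
adjacency = build_adjacency(edges)
degree = degrees(adjacency)
hub = max(degree, key=degree.get)  # node with the most connections
```

In a real network there is usually no single hub; a common choice (an assumption here) is to call the top few percent of nodes by degree hubs.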

6 What is a network? Module: a group of nodes (n >= 3) that are directly interconnected. Modules are hard to find in the subway map because they were purposely avoided, but they are very common in biological networks.

7 Betweenness: the number of shortest paths that pass through a vertex or an edge.
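
As a rough sketch of the betweenness definition above, the following computes vertex betweenness for a small undirected, unweighted graph using a Brandes-style breadth-first search. It is an illustration under stated assumptions: the function name and the toy path graph are hypothetical, every node must appear as a key of the adjacency map, and when several shortest paths tie between a pair of nodes each one is counted fractionally (the usual convention, which the slide does not spell out).

```python
from collections import deque


def betweenness(adjacency):
    """Vertex betweenness for an undirected, unweighted graph."""
    score = {v: 0.0 for v in adjacency}
    for source in adjacency:
        # BFS from the source, counting shortest paths (sigma) to each node.
        dist = {source: 0}
        sigma = {v: 0 for v in adjacency}
        sigma[source] = 1
        preds = {v: [] for v in adjacency}
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path fractions.
        delta = {v: 0.0 for v in adjacency}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each unordered pair of endpoints was counted twice.
    return {v: s / 2 for v, s in score.items()}


# Hypothetical path graph A - B - C - D.
path = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
scores = betweenness(path)
```

On this path, the two inner nodes each lie on two shortest paths between other nodes, while the end nodes lie on none.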

8 Networks in the real world: Internet; electric circuits; transportation (roads, railways, airlines); social networks (friendships, relationships); biological networks; …

9 Introduction to biological networks The goal of systems biology is a systems-level understanding of biological phenomena (e.g. disease): not only individual components such as genes and proteins, but also their interactions. A biological network consists of molecules and their physical/functional interactions. In a typical biological network, a node (vertex) represents a gene or a protein, and an edge represents an interaction (PPI: protein-protein interaction; GI: genetic interaction). Biological networks are typically scale-free: the presence of large hubs gives the degree distribution a long tail, and the degree distribution follows a power law.
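
The degree distribution mentioned above is the fraction P(k) of nodes that have degree k; "follows a power law" means P(k) is roughly proportional to k^(-gamma) for some exponent gamma. The sketch below only computes the empirical distribution for a hypothetical toy network; it does not fit gamma, and the function name is an assumption.

```python
from collections import Counter


def degree_distribution(adjacency):
    """Return {k: fraction of nodes with degree k}."""
    counts = Counter(len(neighbors) for neighbors in adjacency.values())
    total = len(adjacency)
    return {k: n / total for k, n in sorted(counts.items())}


# Hypothetical star-like network: one hub ("H") and five degree-1 nodes.
adjacency = {"H": {"a", "b", "c", "d", "e"}}
for leaf in "abcde":
    adjacency[leaf] = {"H"}

distribution = degree_distribution(adjacency)
```

Even in this tiny example most nodes have low degree and one node has high degree, which is the long-tail shape the slide describes. Plotting P(k) against k on log-log axes is the usual informal check for power-law behavior.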

10 Networks Pathways capture only the “well understood” portion of biology (highly biased). Networks also cover less well understood relationships: – Genetic interactions – Physical interactions – Coexpression – GO term sharing – Adjacency in pathways

11 Representation of Biological Networks [Figure: example networks for metabolism, gene regulation, cell signaling, and PPIs. Node labels: A-F and m1-m3; legend: proteins, metabolites.]

12 Types of biological networks: protein-protein interaction (PPI) network; transcriptional regulation network; metabolic network; cell signaling network; drug-target network. In this class, we will focus on the PPI network.

13 Protein-protein interaction network Node: protein. Edge: physical interaction, detected e.g. by IP/mass spec or Y2H. Protein complex: a group of proteins that interact with each other at the same time and place, forming a single multi-molecular machine. Functional module: a group of proteins that participate in a particular cellular process while binding each other at different times and places. Caveat: PPIs are regulated spatially and temporally, and current models cannot handle this level of complexity. Experimental methods for detecting PPIs are also imperfect, so there are many false-positive and false-negative edges. Even so, the network still provides power to detect true-positive subnetworks/complexes.

14 PPI databases: Biological General Repository for Interaction Datasets (BioGRID); Human Protein Reference Database (HPRD); Munich Information Center for Protein Sequences (MIPS); Database of Interacting Proteins (DIP); Molecular Interactions Database (MINT); Comprehensive Resource of Mammalian Protein Complexes (CORUM); Biomolecular Interaction Network Database (BIND); Search Tool for the Retrieval of Interacting Genes/Proteins (STRING); Molecular Interaction Database (IntAct); Mentha (curated physical interactions from other databases): 82,020 proteins, 498,154 interactions, 41,791 publications.

15 Typical Network Analysis Workflow Interaction network -> extract mutated, overexpressed, underexpressed, amplified/deleted genes -> disease subnetwork -> apply clustering algorithms (MCODE etc.) -> disease “modules” -> disease gene prediction, sample classification, hypothesis generation.

16 MCODE algorithm A well-known method for finding highly interconnected subgraphs, interpreted as molecular complexes or clusters, in large PPI networks (Bader and Hogue, BMC Bioinformatics 2003).

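17 MCODE algorithm Step 1: Node scoring: assign higher scores to nodes whose immediate neighbors are more interconnected.

18 MCODE algorithm Step 1: Node scoring: assign higher scores to nodes whose immediate neighbors are more interconnected. Density of a subgraph with $V$ nodes and $E$ edges: $\frac{2E}{V(V-1)}$. Example with $E = 6$, $V = 4$: $\frac{2 \cdot 6}{4 \cdot 3} = \frac{12}{12} = 1$. (Figure labels: 1, 3, 3.)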
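
A minimal sketch of this node-scoring idea is given below. It assumes the scoring rule described in the MCODE paper: take the node and its immediate neighbors, find the highest k-core of that neighborhood, and multiply the density of that core (2E/V(V-1), as on the slide) by k. The helper names and the four-node toy graph are hypothetical, and the actual MCODE implementation may differ in details such as how self-loops are handled.

```python
def induced_subgraph(adjacency, nodes):
    """Restrict an undirected adjacency map to the given nodes."""
    nodes = set(nodes)
    return {v: set(adjacency[v]) & nodes for v in nodes}


def highest_k_core(graph):
    """Return (k, core) where core is the highest non-empty k-core."""
    current = {v: set(neighbors) for v, neighbors in graph.items()}
    best_k, best_core = 0, {v: set(n) for v, n in current.items()}
    k = 1
    while current:
        # Repeatedly peel off nodes with fewer than k remaining neighbors.
        low = [v for v, neighbors in current.items() if len(neighbors) < k]
        while low:
            for v in low:
                for w in current.pop(v):
                    if w in current:
                        current[w].discard(v)
            low = [v for v, neighbors in current.items() if len(neighbors) < k]
        if current:
            best_k, best_core = k, {v: set(n) for v, n in current.items()}
            k += 1
    return best_k, best_core


def density(graph):
    """Density 2E / (V(V-1)) of an undirected graph without self-loops."""
    v = len(graph)
    if v < 2:
        return 0.0
    e = sum(len(neighbors) for neighbors in graph.values()) / 2
    return 2 * e / (v * (v - 1))


def node_score(adjacency, node):
    """Assumed MCODE-style score: k * density of the highest k-core."""
    neighborhood = {node} | set(adjacency[node])
    k, core = highest_k_core(induced_subgraph(adjacency, neighborhood))
    return k * density(core)


# Hypothetical toy graph: four fully interconnected nodes (V = 4, E = 6).
clique = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "D"},
    "C": {"A", "B", "D"},
    "D": {"A", "B", "C"},
}
score_a = node_score(clique, "A")
```

For the fully connected four-node example, the density is 12/12 = 1 and the highest k-core has k = 3, so under this assumed rule the score is 3.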

19 MCODE algorithm Another node example. (Figure values: 20, 0.4, 8.)

20 MCODE algorithm Score the other nodes in the same manner. (Figure: node scores 3, 0.4, 3, 3, 3, 2, 2, 2, 2, 2.)

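21 MCODE algorithm Starting from the highest-scoring node (the seed node), neighbors whose score is above the threshold are added to the cluster. This step is repeated, and clusters that do not contain at least a k-core network are then filtered out. Example: seed node score = 3 and node score cutoff = 0.2 give threshold $= 3 \times (1 - 0.2) = 2.4$. (Figure: node scores 3, 0.4, 3, 3, 3, 2, 2, 2, 2, 2.)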

22 MCODE algorithm Next seed node: seed node score = 2 and node score cutoff = 0.2 give threshold $= 2 \times (1 - 0.2) = 1.6$. (Figure: node scores 3, 0.4, 3, 3, 3, 2, 2, 2, 2, 2.)
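
The thresholds on the two slides above (2.4 and 1.6) are consistent with threshold = seed score * (1 - node score cutoff). The sketch below grows a cluster outward from a seed under that assumption. The function name, the toy graph, and the node scores are hypothetical (the scores are assigned by hand, not computed), and the full algorithm also processes seeds in descending score order and avoids reusing nodes, which is omitted here.

```python
def grow_cluster(adjacency, scores, seed, cutoff=0.2):
    """Add neighbors, recursively, whose score exceeds the seed-based threshold."""
    threshold = scores[seed] * (1 - cutoff)
    cluster = {seed}
    stack = [seed]
    while stack:
        v = stack.pop()
        for w in adjacency[v]:
            if w not in cluster and scores[w] > threshold:
                cluster.add(w)
                stack.append(w)
    return threshold, cluster


# Hypothetical graph: a 4-clique (a-d), a weak bridge node x, a triangle (p, q, r).
adjacency = {
    "a": {"b", "c", "d"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"a", "b", "c", "x"},
    "x": {"d", "p"},
    "p": {"x", "q", "r"},
    "q": {"p", "r"},
    "r": {"p", "q"},
}
# Hand-assigned illustrative scores, echoing the values shown on the slides.
scores = {"a": 3, "b": 3, "c": 3, "d": 3, "x": 0.4, "p": 2, "q": 2, "r": 2}

threshold_a, cluster_a = grow_cluster(adjacency, scores, "a")
threshold_p, cluster_p = grow_cluster(adjacency, scores, "p")
```

With seed "a" the threshold is about 2.4, so only the score-3 nodes join; with seed "p" it is 1.6, so the score-2 nodes join while the bridge node (0.4) is excluded in both cases.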

23 MCODE algorithm Highest K < 2

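24 MCODE algorithm Step 3: Post-processing: optionally apply “haircut” (remove nodes that are only singly connected to the cluster) and “fluff” (expand the cluster with densely connected neighboring nodes).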
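
As an illustration of the haircut step only, the sketch below repeatedly drops nodes that have fewer than two connections inside the cluster, which is equivalent to keeping the cluster's 2-core. This is an assumption about how the haircut behaves, the function name and toy data are hypothetical, and the fluff step is not shown.

```python
def haircut(adjacency, cluster):
    """Iteratively remove nodes with fewer than two neighbors inside the cluster."""
    cluster = set(cluster)
    while True:
        dangling = {v for v in cluster if len(set(adjacency[v]) & cluster) < 2}
        if not dangling:
            return cluster
        cluster -= dangling


# Hypothetical cluster: a triangle (a, b, c) with a tail c - d - e.
adjacency = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}
trimmed = haircut(adjacency, {"a", "b", "c", "d", "e"})
```

The tail is removed one node at a time ("e" first, then "d"), leaving only the triangle.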

25 Subnetwork discovery algorithms and their application to cancer genomics
Caveats of the MCODE algorithm:
- Cutoff-based
- Sensitive to outliers (a single outlier node can break up a cluster)
HotNet2:
- Finds significantly mutated subnetworks using a diffusion process on a PPI network.
- Each node is assigned a score (heat) according to the frequency and significance of SNVs or CNAs in the corresponding gene.
- Heat diffuses across the edges of the network.
- Subnetworks containing nodes that both send and receive a significant amount of heat are reported.
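
The heat-diffusion idea can be sketched as an iterative random walk with restart: at each step a node keeps a fraction beta of its initial heat and passes the rest of its current heat equally to its neighbors. This is a simplified illustration, not the HotNet2 implementation: the function name, the value beta = 0.4, the iteration count, and the toy network are all assumptions, and the step that extracts subnetworks from the exchanged heat is not shown.

```python
def diffuse_heat(adjacency, heat, beta=0.4, iterations=200):
    """Iterate h <- beta * h0 + (1 - beta) * (heat received from neighbors)."""
    nodes = list(adjacency)
    current = {v: heat.get(v, 0.0) for v in nodes}
    for _ in range(iterations):
        # Restart term: each node retains a fraction of its initial heat.
        updated = {v: beta * heat.get(v, 0.0) for v in nodes}
        for v in nodes:
            degree = len(adjacency[v])
            if degree == 0:
                # Isolated nodes have nowhere to send heat; they keep it.
                updated[v] += (1 - beta) * current[v]
                continue
            share = (1 - beta) * current[v] / degree
            for w in adjacency[v]:
                updated[w] += share
        current = updated
    return current


# Hypothetical path network a - b - c, with all initial heat on "a"
# (for example, a frequently mutated gene).
adjacency = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
final_heat = diffuse_heat(adjacency, {"a": 1.0})
```

Total heat is conserved, and heat decreases with distance from the source node, so neighbors of a hot gene end up warmer than distant genes.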

26

27 Cytoscape Apps

28 ClueGO Visualizes the non-redundant biological terms for large clusters of genes in a functionally grouped network.

29 Warning! It sounds fancy (and it is), but keep in mind that it handles scale-free networks, which consume a large amount of memory and CPU, and this often makes the system freeze. Therefore, save your session after every analysis. If you have to reinstall Cytoscape, remember to remove the old ‘CytoscapeConfiguration’ directory in your home folder before installation.

30 Cytoscape lab workflow
- Install Cytoscape
- Download the newest version of the mentha PPI database
- Install Apps: Apps -> App manager -> BiNGO, MCODE, ClueGO (needs a license key: KpnD-mQ@O-CIq&-M1S4-mH7M-E9@S-q]{t-@5[6)
- Import -> Network -> File -> mentha
- Select -> Nodes -> From ID List File -> C-A.DEGs.symbol.txt (5-aza-deoxy-cytidine)
- Select network -> right click -> Create view
- File -> New -> Network -> From selected nodes, all edges
- Apps -> MCODE -> Open MCODE
- MCODE tab -> Find clusters [in whole network] -> Analyze current network
- Choose a subnetwork from the result -> Create Sub-Network -> Ctrl-A (select all nodes in the subnetwork)
- Apps -> BiNGO -> type in cluster name -> Organism: Homo Sapiens -> Start BiNGO. BiNGO reports the enriched functions in your subnetwork.
- Network -> Copy gene symbols (N=1001) -> Apps -> ClueGO -> paste genes into the marker panel -> View style: Significance -> ClueGO Settings: GO BP, KEGG, REACTOME, WikiPathway && Evidence: All -> Use GO Term Fusion -> Show only Pathways with pV Start
- SAVE YOUR SESSION!!!
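
The selection steps in the workflow above (select nodes from an ID list, then create a new network "from selected nodes, all edges") amount to taking the induced subnetwork of the PPI network. A minimal sketch outside Cytoscape follows; the function name, the toy edge list, and the gene symbols are hypothetical stand-ins for the mentha network and the DEG list, and no file formats are assumed.

```python
def induced_subnetwork(edges, selected):
    """Keep only the edges whose two endpoints are both in the selected set."""
    selected = set(selected)
    return [(a, b) for a, b in edges if a in selected and b in selected]


# Hypothetical PPI edge list (stand-in for the full interaction network).
ppi_edges = [
    ("TP53", "MDM2"),
    ("TP53", "CDKN1A"),
    ("MDM2", "CDKN1A"),
    ("EGFR", "GRB2"),
    ("TP53", "EGFR"),
]
# Hypothetical list of differentially expressed gene symbols.
deg_symbols = ["TP53", "MDM2", "CDKN1A", "BRCA1"]

subnetwork = induced_subnetwork(ppi_edges, deg_symbols)
```

Genes in the list that have no interaction partner among the other selected genes (here "BRCA1") contribute no edges, so they would appear as isolated nodes or drop out, depending on how the subnetwork is built.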

31 Module 3 bioinformatics.ca

32

33 Further Reading Cline, et al. “Integration of biological networks and gene expression data using Cytoscape”, Nature Protocols, 2, 2366-2382 (2007).

34 Network Analysis Workflow

