MATISSE - Modular Analysis for Topology of Interactions and Similarity SEts Igor Ulitsky and Ron Shamir Identification.

1 MATISSE - Modular Analysis for Topology of Interactions and Similarity SEts http://acgt.cs.tau.ac.il/matisse Igor Ulitsky and Ron Shamir Identification of Functional Modules using Network Topology and High-Throughput Data. BMC Systems Biology 1:8 (2007).

2 Microarray data analysis Input: expression levels of (all) genes in several conditions Analysis methods: Clustering (CLICK) Biclustering (SAMBA) Extraction of regulatory networks

3 Protein interaction network analysis Input: Network with nodes=proteins/genes edges=interactions Analysis methods: Global properties Motif content analysis Complex extraction Cross-species comparison

4 Integrated analysis Combined support for low quality data Joint visualization Statistics of known pathways Detection of “hot spots”

5 MATISSE Identify sets of genes (modules) that: (1) have highly correlated expression patterns; (2) induce connected subgraphs in the interaction network. [Figure legend: Interaction; High Similarity]
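
The two module criteria on this slide can be checked mechanically. The following is a minimal sketch, not the MATISSE implementation: the function names, the choice of Pearson correlation as the similarity measure, and the data layout (an edge list plus a dict of expression profiles) are all assumptions made for illustration.

```python
from collections import deque
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumes non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def is_connected(nodes, edges):
    """True if `nodes` induce a connected subgraph of the interaction network."""
    nodes = set(nodes)
    if not nodes:
        return False
    # Keep only edges with both endpoints inside the candidate module.
    adj = {v: set() for v in nodes}
    for u, v in edges:
        if u in nodes and v in nodes:
            adj[u].add(v)
            adj[v].add(u)
    start = next(iter(nodes))
    seen, queue = {start}, deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u] - seen:
            seen.add(w)
            queue.append(w)
    return seen == nodes


def mean_pairwise_correlation(nodes, expr):
    """Average Pearson correlation over all gene pairs in the candidate module."""
    pairs = list(combinations(sorted(nodes), 2))
    return sum(pearson(expr[a], expr[b]) for a, b in pairs) / len(pairs)
```

A candidate set would qualify under this sketch when `is_connected` returns True and the mean correlation is high; what counts as "high" is left open here.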

6 The Probabilistic Model Formulates module finding as a hypothesis testing problem The likelihood ratio is decomposed into pairwise weights Allows incorporation of gene-specific priors Parameters learned by an EM algorithm
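
To illustrate how a likelihood ratio can decompose into pairwise weights, here is a minimal sketch. It assumes that pairwise similarities are independent and normally distributed, with one distribution for gene pairs inside a module and another for background pairs. The parameter values below are placeholders chosen for the example; the slide states that the real parameters are learned by EM, which this sketch omits, and gene-specific priors are not modeled.

```python
from itertools import combinations
from math import log, pi


def log_normal_pdf(x, mu, sigma):
    """Log-density of a normal distribution, computed directly to avoid underflow."""
    return -0.5 * ((x - mu) / sigma) ** 2 - log(sigma) - 0.5 * log(2 * pi)


def pair_weight(s, module_params=(0.6, 0.2), background_params=(0.0, 0.3)):
    """Log-likelihood ratio for one similarity value s.

    module_params / background_params are hypothetical (mean, std) pairs.
    """
    return log_normal_pdf(s, *module_params) - log_normal_pdf(s, *background_params)


def module_score(nodes, similarity):
    """Sum of pairwise weights: the log-likelihood ratio under pairwise independence.

    `similarity` maps frozenset({gene_a, gene_b}) to a similarity value.
    """
    return sum(
        pair_weight(similarity[frozenset(pair)])
        for pair in combinations(sorted(nodes), 2)
    )
```

Under these assumptions, highly similar pairs receive positive weights and dissimilar pairs negative ones, so a module's score is simply the total weight of the pairs it contains.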

7 Computational aspects Finding a single module without connectivity constraints Reduces to finding the heaviest subgraph (+/- edge weights) An NP-Hard problem Heuristics inspired by maximum density approximation algorithms
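
The heaviest-subgraph objective can be made concrete with an exhaustive search on a toy graph. This is a sketch for intuition only: it enumerates every subset, so it is exponential in the number of nodes, which is consistent with the problem being NP-hard and explains why heuristics are needed in practice. The function names and the weight layout (a dict keyed by `frozenset` pairs) are assumptions.

```python
from itertools import combinations


def subgraph_weight(nodes, weights):
    """Total weight of all pairs inside `nodes`; missing pairs count as 0."""
    return sum(weights.get(frozenset(p), 0.0) for p in combinations(nodes, 2))


def heaviest_subgraph_bruteforce(nodes, weights):
    """Return (node set, weight) of the heaviest subgraph, ignoring connectivity.

    Weights may be positive or negative. Only feasible for very small graphs.
    """
    best, best_weight = (), 0.0
    ordered = sorted(nodes)
    for size in range(2, len(ordered) + 1):
        for subset in combinations(ordered, size):
            w = subgraph_weight(subset, weights)
            if w > best_weight:
                best, best_weight = subset, w
    return set(best), best_weight
```

For example, with positive weights among A, B and C and mostly negative weights to D, the search returns {A, B, C}: adding D would lower the total.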

8 MATISSE workflow Seed generation Greedy optimization Significance filtering

9 Finding seeds Three seeding alternatives Building small seeds around single nodes: Best neighbors All neighbors Approximating the heaviest subgraph
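
The slide names the seeding alternatives without defining them. One plausible reading of the two single-node strategies is sketched below: "all neighbors" takes a node together with every network neighbor, and "best neighbors" keeps only the k neighbors with the highest pairwise weight to that node. Both definitions, the parameter `k`, and the function names are assumptions, not the published procedure.

```python
def all_neighbors_seed(v, adj):
    """Seed made of v and all of its neighbors in the interaction network."""
    return {v} | set(adj[v])


def best_neighbors_seed(v, adj, weights, k):
    """Seed made of v and its k neighbors with the highest pairwise weight to v.

    `weights` maps frozenset({a, b}) to a similarity-derived weight;
    ties are broken by node name so the result is deterministic.
    """
    ranked = sorted(
        adj[v],
        key=lambda u: (-weights.get(frozenset((u, v)), 0.0), u),
    )
    return {v} | set(ranked[:k])
```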

10 Greedy optimization Simultaneous optimization of all the seeds The following steps are considered: Node addition Node removal Assignment change Module merge
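
A simplified sketch of the greedy step follows. It handles a single module and only two of the four moves listed above (node addition and node removal); assignment changes and module merges, and the simultaneous treatment of all seeds, are left out. The scoring by summed pairwise weights and all names are assumptions for illustration.

```python
def pair_weight(u, v, weights):
    """Weight of the pair (u, v); pairs without an entry count as 0."""
    return weights.get(frozenset((u, v)), 0.0)


def is_connected(nodes, adj):
    """True if `nodes` induce a connected subgraph; adj maps node -> set of neighbors."""
    nodes = set(nodes)
    if not nodes:
        return False
    start = min(nodes)
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u] & nodes:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == nodes


def greedy_improve(seed, adj, weights):
    """Apply the best score-improving addition or removal until none remains."""
    module = set(seed)
    while True:
        best_gain, best_move = 0.0, None
        # Additions: only network neighbors of the module keep it connected.
        frontier = set().union(*(adj[v] for v in module)) - module
        for v in sorted(frontier):
            gain = sum(pair_weight(v, u, weights) for u in module)
            if gain > best_gain:
                best_gain, best_move = gain, ("add", v)
        # Removals: allowed only if the remaining nodes stay connected.
        for v in sorted(module):
            rest = module - {v}
            if rest and is_connected(rest, adj):
                gain = -sum(pair_weight(v, u, weights) for u in rest)
                if gain > best_gain:
                    best_gain, best_move = gain, ("remove", v)
        if best_move is None:
            return module
        action, v = best_move
        if action == "add":
            module.add(v)
        else:
            module.remove(v)
```

Every accepted move strictly increases the module score, so the loop terminates; like any greedy procedure, it may stop at a local optimum.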

11 Advantages of MATISSE No need for confidence estimation on individual measurements Works even when only a fraction of the genes have expression patterns Can handle any similarity data, not only expression Produces connected modules No need to specify the number of modules

12 Osmotic shock response of S. cerevisiae Network of 6,246 genes and 65,990 protein-protein and protein-DNA interactions 133 experimental conditions – response of perturbed strains to osmotic shock (O’Rourke and Herskowitz, 2004) 2,000 genes filtered based on variation criterion
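
The slide does not say which variation criterion was used to select the 2,000 genes. As a hedged illustration, the sketch below ranks genes by the variance of their expression profile across conditions and keeps the top n; the use of variance and the function name are assumptions.

```python
from statistics import pvariance


def top_variable_genes(expr, n):
    """Return the n genes with the highest variance across conditions.

    `expr` maps gene name -> list of expression values; ties are broken by name.
    """
    ranked = sorted(expr, key=lambda gene: (-pvariance(expr[gene]), gene))
    return ranked[:n]
```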

13 GO and promoter analysis

14 Pheromone response subnetwork [Figure legend: Back, Front]

15 Proteolysis subnetwork [Figure legend: Back, Front]

16 Performance comparison [Figure: % of modules with category enrichment at p < 10^-3]

17 Performance comparison (2) [Figure: % of annotations with enrichment at p < 10^-3 in modules]
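
The slides do not state how the enrichment p-values were computed. A common choice for category enrichment is the hypergeometric tail probability, sketched below as an assumption rather than as the test used in the comparison; multiple-testing correction is not shown.

```python
from math import comb


def enrichment_pvalue(population, annotated, module_size, overlap):
    """P(X >= overlap) for a hypergeometric draw.

    population:  number of genes in the background
    annotated:   background genes carrying the annotation
    module_size: number of genes in the module
    overlap:     module genes carrying the annotation
    """
    total = comb(population, module_size)
    upper = min(annotated, module_size)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, module_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total
```

Under this sketch, a module would count as enriched for a category when the returned value falls below the 10^-3 threshold shown on the slides.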

18 Human cell cycle Constructed a network with 6,000 nodes and 25,000 edges from HPRD, BIND, Y2H studies and SPIKE; HeLa cell cycle time series (Whitfield ’02); Produced subnetworks enriched with all the phases of the cell cycle

19 M phase subnetwork

20 Extensions of MATISSE CEZANNE Utilizes confidence-based networks Extracts subnetworks that are connected with high confidence and co-expressed Applied to 11 studies of gene expression in the blood Not yet implemented in the MATISSE application

22 Extensions of MATISSE DEGAS Utilizes case-control expression data Identifies dysregulated pathways – areas in the network in which many genes are dysregulated in most of the cases Beta version implemented in the MATISSE software (Ulitsky, Karp and Shamir, RECOMB 2008)
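
The phrase "many genes are dysregulated in most of the cases" can be read as a coverage condition. The sketch below is one hypothetical formalization, not the DEGAS algorithm: a gene set qualifies if at least `k` of its genes are dysregulated in every case except for at most `max_outliers` cases. The parameter names are assumptions, and the network connectivity requirement is not checked here.

```python
def supporting_cases(module, dysregulated_by_case, k):
    """Cases in which at least k genes of `module` are dysregulated.

    `dysregulated_by_case` maps case id -> set of dysregulated genes.
    """
    module = set(module)
    return {
        case
        for case, genes in dysregulated_by_case.items()
        if len(module & genes) >= k
    }


def is_dysregulated_pathway(module, dysregulated_by_case, k, max_outliers):
    """True if all but at most `max_outliers` cases support the module."""
    supported = supporting_cases(module, dysregulated_by_case, k)
    return len(dysregulated_by_case) - len(supported) <= max_outliers
```

Note that different cases may be supported by different genes of the same module, which is what distinguishes this criterion from requiring the same genes to be dysregulated in every patient.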

23 Difficulties with prior approaches In case-control data, gene pattern correlation can be due to diverse non-disease-related factors Patients are different: Genetic background, Other diseases/confounding factors, Disease grade Current methods assume that the same genes are dysregulated in all the patients A weaker assumption: many dysregulated genes appear in the same dysregulated pathway

24 HD down-regulated The pathway down-regulated in Huntington’s disease (HD) Enriched with: HD modifiers HD relevant genes Calcium signalling Huntingtin Clear outlier

25 Extensions of MATISSE Identification of modules correlated with external parameters Numerical parameters: Age, tumor grade etc. Logical parameters: Gender, tumor type Identifies subnetworks with genes that are both Correlated with the clinical parameter Correlated with one another
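
The first of the two criteria, correlation with a clinical parameter, could be screened as sketched below. This is an illustration under stated assumptions: Pearson correlation is used, logical parameters are encoded as 0/1, both positive and negative correlations count, and `threshold` is a hypothetical cutoff. The second criterion (genes correlated with one another) is not covered here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (assumes non-zero variance)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def genes_correlated_with_parameter(expr, parameter, threshold):
    """Genes whose profile correlates with the parameter, in absolute value.

    `expr` maps gene -> expression values, one per sample;
    `parameter` holds one value per sample (e.g. age, or 0/1 for a logical trait).
    """
    return sorted(
        gene
        for gene, profile in expr.items()
        if abs(pearson(profile, parameter)) >= threshold
    )
```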

26 MATISSE tool capabilities MATISSE algorithm execution Dynamic subnetwork layout Customized node/edge highlighting Dynamic expression matrix viewer Module annotation TANGO – Gene Ontology Annotations with custom datasets Calculation of different coefficients based on network/expression

