GUI GoMiner and High-Throughput GoMiner Analysis of Alternative Splice Variants

Presentation transcript:

GUI GoMiner and High-Throughput GoMiner Analysis of Alternative Splice Variants
Barry Zeeberg, Ari Kahn, Michael Ryan, David Kane, Curtis Jamison, Hongfang Liu, Alessandro Ferrucci, William Reinhold, and John Weinstein
With a lot of help from Rich Einstein and Mike Brenner of ExonHit.

The World According to a Microarray
Genes are not genes: genes are a mixture of splice variants.

Patterns of alternative splicing

The Ostrich Effect
We tend to hide our heads in the sand and treat microarray data as if a gene did not have multiple alternative splice forms. But altered expression of one splice variant can be more important than altered expression of the “gene” as a whole. Lumping all splice forms together into one monolithic measurement therefore discards the information that may matter most.

Motivation: The Problem
In many disease states, differential expression of individual splice variants may be more relevant than differential expression of genes. Traditional microarrays are not designed to resolve individual splice variants. State-of-the-art microarrays are being developed to resolve them. A major limitation is that software tools are not available to exploit the potential information content of these state-of-the-art microarrays.

Our Solution: Three Components
(1) Develop a database (EVDB) and web application (SpliceMiner) that maps probe sequences to known splice variants. (2) Enhance GoMiner with a mechanism to process splice variants. (3) Connect these two “ends” with the appropriate integration approach.

SpliceMiner Home Page
Highlighted fields: HGNC symbol, chromosomal coordinates. Remember these: they are used later in the GoMiner “tilde” mechanism.

“Batch” is key to analysis of microarray results

GoMiner and High-Throughput GoMiner
GoMiner organizes lists of 'interesting' genes (for example, under- and overexpressed genes from a microarray experiment) for biological interpretation in the context of the Gene Ontology. High-Throughput GoMiner is an enhancement of GoMiner that efficiently performs the computationally challenging task of automated batch processing of an arbitrary number of microarray experiments.

GoMiner “Tilde” (“~”) Mechanism
GoMiner traditionally dereplicates input files so that only one instance of a gene name is processed. When multiple alternatively spliced forms are to be analyzed, however, dereplication would result in a loss of relevant information. Consequently, we have added a new feature to GoMiner that retains full information about the alternative splice variants by replicating the input of each gene according to the number of alternative exons.
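The loss of information from dereplication can be illustrated with a minimal Python sketch. It is not GoMiner's code: the function names `dereplicate` and `add_tilde_suffixes` are hypothetical, and the ordinal suffix form follows the 'BRCA1~1' / 'BRCA1~2' example given in this deck.

```python
def dereplicate(identifiers):
    """Drop exact duplicate identifiers, keeping first-seen order."""
    seen = set()
    unique = []
    for ident in identifiers:
        if ident not in seen:
            seen.add(ident)
            unique.append(ident)
    return unique


def add_tilde_suffixes(gene_symbols):
    """Tag each occurrence of a gene symbol with an ordinal '~n' suffix.

    Hypothetical helper: one input row per splice variant is assumed.
    """
    counts = {}
    tagged = []
    for symbol in gene_symbols:
        counts[symbol] = counts.get(symbol, 0) + 1
        tagged.append(f"{symbol}~{counts[symbol]}")
    return tagged


# Toy input: two probes for different BRCA1 splice variants, one for TP53.
probes = ["BRCA1", "BRCA1", "TP53"]

# Plain dereplication collapses the two BRCA1 variants into one entry.
plain = dereplicate(probes)

# With tilde suffixes, every variant survives dereplication.
tagged = dereplicate(add_tilde_suffixes(probes))

print(plain)
print(tagged)
```

In this toy input the plain list keeps two entries while the tagged list keeps three, which is the point of the mechanism: the variants stay distinct rows in the analysis.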

Example of Tilde Mechanism
As a specific example, suppose that a microarray platform contained probes that were unique for two different splice variants of BRCA1. The two splice variants would then be designated 'BRCA1~1' and 'BRCA1~2'. The '~' tells GoMiner to treat these as different entries rather than dereplicating them, but to ignore the suffix when querying the GO database. By this mechanism, all splice variants are counted when computing the Fisher exact p-value.
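A small sketch of how the suffix can be ignored for the GO lookup while each variant still counts in the test. This is an illustration under stated assumptions, not GoMiner's implementation: the category membership set and gene lists are toy data, the helper names are hypothetical, and the test shown is a one-sided Fisher exact test computed as a hypergeometric tail.

```python
from math import comb


def base_symbol(identifier):
    """Strip everything from the first '~' on, leaving the gene symbol."""
    return identifier.split("~", 1)[0]


def fisher_one_sided(k, n, K, N):
    """P(X >= k) when n entries are drawn from N, of which K are in the category."""
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


# Toy data (hypothetical): every entry on the array and the changed subset.
total = ["BRCA1~1", "BRCA1~2", "TP53~1", "GAPDH~1", "ACTB~1", "MYC~1"]
changed = ["BRCA1~1", "BRCA1~2", "MYC~1"]

# Toy stand-in for a GO category lookup, keyed by plain gene symbol.
category_genes = {"BRCA1", "TP53"}

# The suffix is ignored for the lookup, but each variant is counted separately.
K = sum(base_symbol(x) in category_genes for x in total)
k = sum(base_symbol(x) in category_genes for x in changed)

p_value = fisher_one_sided(k, len(changed), K, len(total))
print(K, k, p_value)
```

Here both BRCA1 variants contribute to the category count (K = 3, k = 2), whereas a dereplicated list would have counted BRCA1 once.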

A Publication Using the Tilde Mechanism
Study of “exon expression” regulated by Nova, a key neuronal splicing factor.
Reference: Ule et al., Nova regulates brain-specific splicing to shape the synapse, Nature Genetics 37 (2005).

GoMiner Detected Differences in Neurologically-Important GO Categories between Wild Type and Nova Knockouts

Significance of the Nova Paper
First description of a regulatory module operating at the level of information content mediated by RNA exon usage. Levels of Nova-regulated RNAs are unchanged in knockout versus wild-type brains: alternative exon usage serves as a means of modulating the quality of synaptic protein interactions. This is regulation of quality, not quantity.

Generalization of the Tilde Mechanism
A previous slide noted that two splice variants could be designated ‘BRCA1~1’ and ‘BRCA1~2’. But the suffix need not be an ordinal index: it can be an arbitrary string that carries biological information. So we can use the output of SpliceMiner (HGNC symbol, GenBank accession, chromosomal coordinates) to construct a string of the correct form, with a suffix that is highly informative. Using the output of SpliceMiner as the input to GoMiner connects the two “ends” and permits splice variant-based GO categorization.
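One way to build such an identifier, sketched in Python. The deck does not specify the suffix layout, so the `accession|chrom:start-end` format, the function names, and the placeholder values below are all assumptions made for illustration. The only property taken from the slides is that the gene symbol precedes the '~' and the suffix is free-form.

```python
def make_tilde_identifier(symbol, accession, chrom, start, end):
    """Join SpliceMiner-style fields into 'SYMBOL~suffix' (hypothetical layout)."""
    if "~" in symbol:
        raise ValueError("gene symbol must not contain '~'")
    suffix = f"{accession}|{chrom}:{start}-{end}"
    return f"{symbol}~{suffix}"


def split_tilde_identifier(identifier):
    """Return (symbol, suffix); the suffix is empty if there is no '~'."""
    symbol, _, suffix = identifier.partition("~")
    return symbol, suffix


# Placeholder values, not real accessions or coordinates.
ident = make_tilde_identifier("BRCA1", "ACC0001", "chr17", 100, 200)
symbol, suffix = split_tilde_identifier(ident)

print(ident)
print(symbol, suffix)
```

Because only the part before the first '~' is used for the GO query, the suffix can carry whatever helps the reader trace a result back to a specific transcript and genomic region.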

Conclusions
The new era of microarray research will demand analysis of differential expression of exons and transcripts, rather than genes. We are developing resources to map probe sequences to exons and transcripts. GoMiner can integrate this information with GOA, allowing the molecular biologist to leverage both knowledgebases for enhanced analysis and interpretation of microarray data.

Collaborators
GBG: Ari Kahn, Michael Ryan, David Kane, Hongfang Liu, William Reinhold, John Weinstein
GMU: Curtis Jamison
UMBC: Alessandro Ferrucci
ExonHit: Rich Einstein, Mike Brenner