Variant Prioritization in Disease Studies

1. Remove common SNPs Credit: goldenhelix.com

2. Remove common exonic SNPs (from large WES studies) Credit: goldenhelix.com

3. Select amino acid changing variants Credit: goldenhelix.com

4. Find variants that have potential functional effect Credit: goldenhelix.com
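Steps 1 to 4 above amount to a cascade of filters, each applied to the survivors of the previous one. The following is a minimal sketch of that cascade, not the Golden Helix workflow itself. The `Variant` fields, the consequence labels, and the 1% allele-frequency cutoff are all assumptions made for illustration; real thresholds depend on the disease model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Variant:
    name: str
    pop_af: Optional[float]    # allele frequency in a general-population SNP panel (None = not seen)
    exome_af: Optional[float]  # allele frequency in a large exome (WES) cohort (None = not seen)
    consequence: str           # hypothetical labels, e.g. "missense", "synonymous", "stopgain"
    predicted_damaging: bool   # output of some functional-effect predictor


# Consequence labels treated as amino acid changing (an illustrative choice).
AA_CHANGING = {"missense", "stopgain", "stoploss", "frameshift", "inframe_indel"}


def is_rare(af: Optional[float], max_af: float) -> bool:
    """A variant absent from the reference panel counts as rare."""
    return af is None or af < max_af


def prioritize(variants: List[Variant], max_af: float = 0.01) -> List[Variant]:
    step1 = [v for v in variants if is_rare(v.pop_af, max_af)]    # 1. remove common SNPs
    step2 = [v for v in step1 if is_rare(v.exome_af, max_af)]     # 2. remove common exonic SNPs
    step3 = [v for v in step2 if v.consequence in AA_CHANGING]    # 3. amino acid changing only
    step4 = [v for v in step3 if v.predicted_damaging]            # 4. potential functional effect
    return step4


variants = [
    Variant("v1", 0.25, 0.22, "missense", True),       # common: dropped at step 1
    Variant("v2", None, 0.05, "missense", True),       # common in exomes: dropped at step 2
    Variant("v3", 0.001, None, "synonymous", False),   # no amino acid change: dropped at step 3
    Variant("v4", None, None, "missense", False),      # predicted benign: dropped at step 4
    Variant("v5", 0.0005, 0.0002, "stopgain", True),   # survives all four filters
]

print([v.name for v in prioritize(variants)])  # ['v5']
```

Keeping each step as its own list makes it easy to report how many variants survive each filter.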

Predicting functional effects: conservation across species; amino acid properties; protein structure; transmembrane regions, signal peptides, etc.
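Two of these signals, conservation across species and amino acid properties, are simple enough to sketch directly. The example below is illustrative only and is not how any particular predictor works. The four-class residue grouping is one common textbook convention (an assumption here), and the alignment column is made-up data.

```python
# One common textbook grouping of the 20 standard amino acids (an assumption;
# other groupings exist and real predictors use richer substitution models).
_GROUPS = {
    "nonpolar": "AVLIMFWPG",
    "polar": "STCYNQ",
    "acidic": "DE",
    "basic": "KRH",
}
AA_CLASS = {aa: cls for cls, residues in _GROUPS.items() for aa in residues}


def changes_property_class(ref: str, alt: str) -> bool:
    """True if the substitution moves the residue to a different property class."""
    return AA_CLASS[ref] != AA_CLASS[alt]


def column_conservation(ref: str, column: str) -> float:
    """Fraction of aligned orthologous residues identical to the reference residue."""
    return sum(1 for aa in column if aa == ref) / len(column)


# Hypothetical alignment column: the same protein position in five species.
column = "RRRRK"
print(column_conservation("R", column))   # 0.8
print(changes_property_class("R", "W"))   # True  (basic -> nonpolar)
print(changes_property_class("L", "I"))   # False (both nonpolar)
```

A substitution at a highly conserved position that also changes property class is a stronger candidate than one that does neither.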

ANNOVAR: annotate variants with all of these annotations. Web version:

Web ANNOVAR - basic

Web ANNOVAR - Advanced

Most times that’s still not enough to find a causative variant. If the list is still too large and/or no obvious candidate variant stands out, then we go ‘digging’ in the context of existing knowledge about genes, the functions of their products, and their known involvement in phenotypes and diseases.

Typical questions bioinformaticists ask (or should):
Is the variant in a known disease gene?
Is it in a gene involved in a related disease?
Does the gene have a function that coincides with the pathology, biochemistry, etc.?
Is the gene product in a pathway associated with the disease?
But that’s a lot of literature and databases to sift through!
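The four questions above can be asked programmatically once the relevant knowledge is collected in one place. The following is a minimal sketch under that assumption. The dictionary layout, the `evidence_for` helper, and all gene, disease, and pathway names are hypothetical placeholders, not real annotations.

```python
def evidence_for(gene: str, disease: str, kb: dict) -> list:
    """Return the questions a candidate gene answers 'yes' to."""
    evidence = []
    disease_genes = kb["disease_genes"]

    # Is the variant in a known disease gene?
    if gene in disease_genes.get(disease, set()):
        evidence.append("known disease gene")

    # Is it in a gene involved in a related disease?
    related = kb["related_diseases"].get(disease, set())
    if any(gene in disease_genes.get(d, set()) for d in related):
        evidence.append("gene involved in a related disease")

    # Does the gene have a function that coincides with the pathology?
    if kb["gene_functions"].get(gene, set()) & kb["disease_processes"].get(disease, set()):
        evidence.append("function coincides with pathology")

    # Is the gene product in a pathway associated with the disease?
    if kb["gene_pathways"].get(gene, set()) & kb["disease_pathways"].get(disease, set()):
        evidence.append("pathway associated with disease")

    return evidence


# Toy knowledge base with placeholder names (not real biology).
kb = {
    "disease_genes": {"DISEASE_X": {"GENE_A"}, "DISEASE_Y": {"GENE_B"}},
    "related_diseases": {"DISEASE_X": {"DISEASE_Y"}},
    "gene_functions": {"GENE_C": {"ion transport"}},
    "disease_processes": {"DISEASE_X": {"ion transport"}},
    "gene_pathways": {"GENE_C": {"PATHWAY_P"}, "GENE_D": {"PATHWAY_Q"}},
    "disease_pathways": {"DISEASE_X": {"PATHWAY_P"}},
}

for gene in ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]:
    print(gene, evidence_for(gene, "DISEASE_X", kb))
```

In practice, filling such a knowledge base from the literature and many separate databases is the hard part.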

Semantic database for disease genomics (my personal project). The dream: ASSIMILATE millions of biomedical and genetic facts and their inter-relations into a database in the way a biologist thinks about them. Enable simultaneous querying across those facts from multiple knowledge domains in the way a biologist would. Report relevant results along with their meaning.

The B.O.R.G. (BioOntological Relationship Graph) “Resistance is futile, your data will be assimilated…”

Disease-specific semantic model: for guilt by association or indirect association
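Guilt by association can be pictured as a path search over a graph of facts: a candidate gene is interesting if a short chain of relationships connects it to the disease. The sketch below shows that idea with a breadth-first search over subject-relation-object triples. It is not the B.O.R.G. implementation, whose internals the slides do not describe. The triples, relation names, and the `max_hops` cutoff are hypothetical.

```python
from collections import deque

# Toy facts as (subject, relation, object) triples; all names are placeholders.
TRIPLES = [
    ("GENE_A", "associated_with", "DISEASE_X"),
    ("GENE_B", "interacts_with", "GENE_A"),
    ("GENE_A", "participates_in", "PATHWAY_P"),
    ("GENE_C", "participates_in", "PATHWAY_P"),
    ("GENE_D", "participates_in", "PATHWAY_Q"),
]


def build_graph(triples):
    """Adjacency list; relations are traversed in both directions."""
    graph = {}
    for subj, rel, obj in triples:
        graph.setdefault(subj, []).append((rel, obj))
        graph.setdefault(obj, []).append((rel, subj))
    return graph


def association_path(graph, start, target, max_hops=3):
    """Shortest chain of entities linking start to target, or None if none within max_hops."""
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == target:
            return path
        if len(path) - 1 >= max_hops:  # do not extend paths beyond the hop limit
            continue
        for _rel, neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, path + [neighbor]))
    return None


graph = build_graph(TRIPLES)
print(association_path(graph, "GENE_A", "DISEASE_X"))  # ['GENE_A', 'DISEASE_X']
print(association_path(graph, "GENE_B", "DISEASE_X"))  # ['GENE_B', 'GENE_A', 'DISEASE_X']
print(association_path(graph, "GENE_D", "DISEASE_X"))  # None
```

A one-hop path corresponds to a known disease gene; a longer path through an interaction partner or shared pathway suggests a potential novel disease gene.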

Known disease gene

Potential novel disease gene

THE BORG “Resistance is futile… You WILL be assimilated…”

THANK YOU (Please fill out the feedback form.) Assessment reports due 22 March.