Variant Prioritization in Disease Studies
1. Remove common SNPs (Credit: goldenhelix.com)
2. Remove common exonic SNPs from large WES studies (Credit: goldenhelix.com)
3. Select amino acid changing variants (Credit: goldenhelix.com)
4. Find variants that have a potential functional effect (Credit: goldenhelix.com)
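The four filtering steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Variant` fields, the 1% allele-frequency cutoff, and the set of consequence labels are hypothetical choices made for the example, not values given on the slides or taken from any particular tool.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    # All field names are hypothetical; real annotation tables use their own columns.
    gene: str
    population_af: float      # allele frequency in a general population panel
    exome_af: float           # allele frequency in a large WES cohort
    consequence: str          # e.g. "missense", "synonymous", "stopgain"
    predicted_damaging: bool  # summary of a functional-effect prediction


# Assumed set of consequence labels that change the amino acid sequence.
AMINO_ACID_CHANGING = {"missense", "stopgain", "stoploss", "frameshift", "inframe_indel"}


def prioritize(variants, max_af=0.01):
    """Apply the four filters in order and return the surviving variants.

    max_af=0.01 is an assumed cutoff for "common"; choose one that fits the study.
    """
    step1 = [v for v in variants if v.population_af < max_af]           # 1. drop common SNPs
    step2 = [v for v in step1 if v.exome_af < max_af]                   # 2. drop common exonic SNPs
    step3 = [v for v in step2 if v.consequence in AMINO_ACID_CHANGING]  # 3. amino acid changing
    step4 = [v for v in step3 if v.predicted_damaging]                  # 4. potential functional effect
    return step4
```

Keeping each step as its own list makes it easy to report how many variants survive each filter, which is usually the first thing to check when the final list is empty or too long.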
Predicting functional effects: conservation across species; amino acid properties; protein structure; transmembrane regions, signal peptides, etc.
ANNOVAR: annotate variants with all of these annotations. Web version:
Web ANNOVAR - basic
Web ANNOVAR - Advanced
Most of the time, that is still not enough to find a causative variant. If the list is still too large and/or no obvious candidate variant stands out, we go ‘digging’ in the context of existing knowledge about genes, the functions of their products, and their known involvement in phenotypes and diseases.
Typical questions bioinformaticists ask (or should):
Is the variant in a known disease gene?
Is it in a gene involved in a related disease?
Does the gene have a function that coincides with the pathology, biochemistry, etc.?
Is the gene product in a pathway associated with the disease?
But that’s a lot of literature and databases to sift through!
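The questions above amount to a series of lookups against existing knowledge, so they can be sketched as a small scoring function. The knowledge-base layout (`kb` and its keys) and the gene names in any usage are hypothetical placeholders; in practice each lookup would be backed by curated databases and literature rather than hand-written sets.

```python
def evidence_for(gene, kb):
    """Return the questions answered 'yes' for a gene.

    kb is a hypothetical knowledge base: a dict of sets and dicts of sets.
    """
    evidence = []
    if gene in kb["known_disease_genes"]:
        evidence.append("known disease gene")
    if gene in kb["related_disease_genes"]:
        evidence.append("gene in related disease")
    # Set intersection: does any annotated function match the disease pathology?
    if kb["gene_functions"].get(gene, set()) & kb["disease_processes"]:
        evidence.append("function matches pathology")
    # Same idea for pathway membership.
    if kb["gene_pathways"].get(gene, set()) & kb["disease_pathways"]:
        evidence.append("pathway associated with disease")
    return evidence


def rank_candidates(genes, kb):
    """Order candidate genes by how many questions they answer 'yes'."""
    return sorted(genes, key=lambda g: len(evidence_for(g, kb)), reverse=True)
```

Counting answers equally is a deliberately crude choice for the sketch; a real study would likely weight a known disease gene far above a shared pathway.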
Semantic database for disease genomics (my personal project)
The dream:
ASSIMILATE millions of biomedical and genetic facts and their inter-relations into a database in the way a biologist thinks about them.
Enable simultaneous querying across those facts from multiple knowledge domains in the way a biologist would.
Report relevant results along with their meaning.
The B.O.R.G. (BioOntological Relationship Graph) “Resistance is futile, your data will be assimilated…”
Disease-specific semantic model: for guilt by association or indirect association
Known disease gene
Potential novel disease gene
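Guilt by association can be illustrated as a path search over a graph of facts. The sketch below is not the B.O.R.G. implementation, which the slides do not describe; it is a minimal breadth-first search over hypothetical (subject, relation, object) triples. A one-hop path to the disease corresponds to a known disease gene, and a longer path suggests a potential novel disease gene through indirect association.

```python
from collections import deque


def build_graph(triples):
    """Build an undirected adjacency map from (subject, relation, object) triples."""
    graph = {}
    for subj, rel, obj in triples:
        graph.setdefault(subj, []).append((rel, obj))
        graph.setdefault(obj, []).append((rel, subj))
    return graph


def find_path(graph, start, goal, max_hops=3):
    """Breadth-first search for a shortest path of at most max_hops edges.

    Returns the list of nodes on the path, or None if no such path exists.
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        if len(path) - 1 >= max_hops:
            continue  # do not extend paths beyond the hop limit
        for _rel, nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None
```

The hop limit matters in practice: in a densely connected knowledge graph almost everything is reachable from everything else, so long paths carry little evidence.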
THE BORG “Resistance is futile… You WILL be assimilated…”
THANK YOU (Please fill out the feedback form) Assessment reports due 22 March