Presented by: Andrew McMurry Boston University Bioinformatics Children’s Hospital Informatics Program Harvard Medical School Center for BioMedical Informatics.


1 Presented by: Andrew McMurry
Boston University Bioinformatics; Children’s Hospital Informatics Program; Harvard Medical School Center for BioMedical Informatics
This presentation is available at: http://pixelshelf.com/~justandy/f-snp.ppt

2 Outline
- Incidental Findings and Disconnected Patient Cohorts
- Disease Association Studies Using SNPs
- How SNPs cause disease
- Computationally predict the effect of SNPs within introns, exons, and regulatory regions
- The Future Is Now: SNPs, Personalized Medicine, and Translational Research

3 Incidental Findings and Disconnected Patient Cohorts
- IF the central dogma of biology is: “From DNA -> RNA -> Protein”
- THEN where is the patient data for association studies?
  - Very little patient data spans DNA/RNA/protein/phenotype across a single cohort
  - Need to obtain “robust” sample sizes to avoid incidental findings due to multiple testing [1]
[1] Isaac Kohane, Daniel Masys, and Russ Altman. "The Incidentalome: A Threat to Genomic Medicine." JAMA 296(2): 212-215. July 12, 2006.
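The multiple-testing concern on this slide can be made concrete with a small calculation. The sketch below is not from the presentation: it assumes the tests are independent (SNPs in linkage disequilibrium are not) and uses the familiar Bonferroni correction as one simple remedy. The function names are illustrative.

```python
def family_wise_error_rate(alpha: float, num_tests: int) -> float:
    """Probability of at least one false positive across num_tests
    independent tests, each run at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(alpha: float, num_tests: int) -> float:
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / num_tests


if __name__ == "__main__":
    alpha = 0.05
    # 500,000 is the SNP-chip size quoted elsewhere in the slides.
    for num_tests in (1, 20, 500_000):
        fwer = family_wise_error_rate(alpha, num_tests)
        threshold = bonferroni_threshold(alpha, num_tests)
        print(f"{num_tests:>7} tests: FWER = {fwer:.4f}, "
              f"Bonferroni per-test threshold = {threshold:.2e}")
```

With a 500k-SNP chip and an uncorrected threshold of 0.05, at least one spurious association is practically guaranteed, which is why the slide asks for robust sample sizes.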

4 Disease Association Studies Using SNPs
- DNA sequencing technologies are still very expensive
  - Stunningly few patients
  - Minimal sequence coverage
- Could change in time with Solexa/454
  - Even with Solexa/454, piecing together the results is a massive task (the maximum sequence read is often shorter than a single repeated gene)
- Rate-limiting step: adoption rate of DNA sequencing
- Use what is available in abundance: SNP chips
  - Abundance of SNP chips in public repositories on many diseases
  - Whole-genome coverage: 500k SNPs for $250

5 Disease Association Studies Using SNPs: DNA to RNA to Protein
- Associating DNA & RNA
  - GEO alone holds well over 100k gene expression arrays
  - What if we could correlate SNPs with their effect on gene expression?
- Associating DNA & gene product (protein)
  - Countless public protein databases
  - What if we could correlate SNPs with their effect on protein coding?
- Association studies involving multiple genomic measurements
  - Which existing studies and models (HMMs/Bayes nets) could be strengthened with evidence from SNP chips?
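One simple way to picture "correlating SNPs with gene expression" is to code each patient's genotype as the number of minor alleles (0, 1, or 2) and correlate that with an expression measurement. The sketch below is an illustration only: the Pearson correlation is a stand-in for whatever model a real study would use, and the genotype and expression values are made up.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: minor-allele counts for six patients at one SNP,
# and the expression level of one gene in the same patients.
genotypes = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]

if __name__ == "__main__":
    print(f"r = {pearson_r(genotypes, expression):.3f}")
```

Note that this only works when genotype and expression come from the same patients, which is exactly the cohort gap the earlier slide describes.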

6 How SNPs cause disease
- Intron
  - Likely no effect
- Protein Coding
  - Synonymous -> same amino acid
  - Non-synonymous (missense) -> different amino acid
  - Nonsense -> premature STOP
- Splicing Regulation
  - Incorrect final mRNA transcript
- Transcriptional Regulation
  - Differential gene expression
- Post-Translational
  - Protein phosphorylation
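The protein-coding categories above follow directly from the standard genetic code. As a minimal sketch (the function name and the "stop-lost" label are additions for illustration, not from the slides), a single-base change within a codon can be classified like this:

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def classify_coding_snp(ref_codon: str, alt_codon: str) -> str:
    """Classify a single-nucleotide change between two DNA codons."""
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    if ref_codon not in CODON_TABLE or alt_codon not in CODON_TABLE:
        raise ValueError("codons must be three letters from A, C, G, T")
    if sum(a != b for a, b in zip(ref_codon, alt_codon)) != 1:
        raise ValueError("codons must differ at exactly one position")
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"   # same amino acid
    if alt_aa == "*":
        return "nonsense"     # premature STOP
    if ref_aa == "*":
        return "stop-lost"    # STOP replaced by an amino acid (not on the slide)
    return "missense"         # different amino acid


if __name__ == "__main__":
    for ref, alt in (("GAA", "GAG"), ("GAG", "GTG"), ("TGG", "TGA")):
        print(ref, "->", alt, ":", classify_coding_snp(ref, alt))
```

This covers only the coding categories; splicing, transcriptional, and post-translational effects need the specialized tools discussed later in the deck.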

7 So how do we measure all these effects of SNPs?

8 F-SNP: integrated approach
1. Classify the SNP site using dbSNP
   - Intron
   - Coding Region
   - Splice Site
   - TF Binding Site
   - Post-Translational Site
2. Evaluate using the specialized algorithms/databases
   - Coding Region (missense/nonsense mutations)
   - Splice Site (intronic/exonic sites)
   - TF Binding Site (promoter/repressor/etc.)
   - Post-Translational Site (Phospho/Tyrosine/O-glycosylation)
3. “Majority Vote” across algorithms
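Step 3 can be pictured as a plain majority vote over the calls made by the individual tools for one functional category. The sketch below is an assumption-laden illustration, not F-SNP's actual decision procedure (which is shown only as a figure on the next slide): the label strings, the tie rule, and the placeholder tool name `other_tool` are all hypothetical. PolyPhen and SNPs3D are named later in the deck.

```python
from collections import Counter
from typing import Mapping


def majority_vote(predictions: Mapping[str, str]) -> str:
    """Combine per-tool calls ('deleterious' or 'benign') for one category.

    Returns the label with more votes, or 'undecided' on a tie or when
    no tool produced a call. The tie rule is an assumption of this sketch.
    """
    counts = Counter(predictions.values())
    deleterious = counts["deleterious"]
    benign = counts["benign"]
    if deleterious > benign:
        return "deleterious"
    if benign > deleterious:
        return "benign"
    return "undecided"


if __name__ == "__main__":
    # Hypothetical calls for the protein-coding category of one SNP.
    coding_calls = {
        "PolyPhen": "benign",
        "SNPs3D": "deleterious",
        "other_tool": "benign",  # placeholder name, not a real tool
    }
    print(majority_vote(coding_calls))
```

A real pipeline would run one such vote per functional category (coding, splicing, transcriptional, post-translational) and then combine the category-level results.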

9 F-SNP decision procedure for functional SNPs

10 F-SNP: User Interfaces & Data Download
- Public web site
- Federated query = the entire database cannot be downloaded
- Currently:
  - No SOAP (web service) support
  - No RSS support
  - No source code available
- However: the paper gives explicit instructions on how to reproduce the algorithm and construct the database using dbSNP, OMIM, etc.

11 “Large N Study” using F-SNP

Functional Category          # of Assessed SNPs   # of Functional SNPs
Protein Coding                          154,140                 66,899
Splicing Regulation                      73,051                  8,075
Transcriptional Regulation              453,710                 78,296
Post Translation                         64,736                  4,477
Total                                   559,322                115,356
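The table is easier to read as proportions. The short script below only re-expresses the slide's own counts as the fraction of assessed SNPs predicted functional in each category; the variable and function names are illustrative. It also shows that the per-category counts add up to more than the "Total" row, which presumably means a SNP can be assessed in more than one category.

```python
# (assessed, functional) counts copied from the slide.
COUNTS = {
    "Protein Coding": (154_140, 66_899),
    "Splicing Regulation": (73_051, 8_075),
    "Transcriptional Regulation": (453_710, 78_296),
    "Post Translation": (64_736, 4_477),
}
TOTAL = (559_322, 115_356)


def functional_fraction(assessed: int, functional: int) -> float:
    """Fraction of assessed SNPs that were predicted functional."""
    return functional / assessed


if __name__ == "__main__":
    for category, (assessed, functional) in COUNTS.items():
        print(f"{category:<28} {functional_fraction(assessed, functional):6.1%}")
    print(f"{'Total':<28} {functional_fraction(*TOTAL):6.1%}")

    # The category rows sum to more than the Total row.
    summed_assessed = sum(assessed for assessed, _ in COUNTS.values())
    print("Sum of category rows:", summed_assessed, "vs. Total:", TOTAL[0])
```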

12

13 Evaluate Individual SNP (rs28897699)

14 SNP summary and Functional Predictions

15 SNP Primary Information (rs28897699)
- Locus
- Alleles
- Ancestral Allele
- Validation (if any)
- Region
- Link to References

16 F-SNP: Functional Predictions

17 F-SNP Prediction Detail: PolyPhen = benign effect on protein coding

18 F-SNP Prediction Detail: SNPs3D = deleterious to protein coding
NCBI Gene Information
- Product: breast cancer 1, early onset
- Other names: BRCA1, BRCAI, BRCC1, IRIS, PSCP, RNF53
- NCBI Entrez Gene Summary: This gene encodes a nuclear phosphoprotein that plays a role in maintaining genomic stability and acts as a tumor suppressor. (…) Mutations in this gene are responsible for approximately 40% of inherited breast cancers and more than 80% of inherited breast and ovarian cancers. Alternative splicing plays a role in modulating the subcellular localization and physiological function of this gene. Many alternatively spliced transcript variants have been described for this gene but only some have had their full-length natures identified. (…)

19 F-SNP functional prediction
- On Protein Coding -> 2 votes benign, 1 deleterious, 1 nonsynonymous
- On Splicing Regulation -> predicted functional impact (by majority vote)

20 Gene level view of BRCA1
- Query by gene name = “BRCA1”
- Returns list of SNPs in BRCA1
- Returns list of cancers associated with BRCA1

21 Gene level view of BRCA1
- Our SNP has functional impact
- Our SNP has neighboring functional SNPs

22 Disease Level View : Breast Cancer

23 Show all disease genes associated with breast cancer
- Indicate whether SNPs are present in those genes (5k up/downstream)
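The "5k up/downstream" rule amounts to an interval check: a SNP counts as belonging to a gene if it falls within the gene's span extended by a flank on each side. The sketch below assumes "5k" means 5,000 base pairs and that the SNP and gene are on the same chromosome. The function name and the coordinates in the example are made up for illustration.

```python
def snp_in_gene_window(snp_pos: int, gene_start: int, gene_end: int,
                       flank: int = 5000) -> bool:
    """True if snp_pos lies within [gene_start - flank, gene_end + flank].

    Positions are coordinates on the same chromosome; the default flank of
    5,000 bp is this sketch's reading of the slide's "5k".
    """
    if gene_start > gene_end:
        raise ValueError("gene_start must not exceed gene_end")
    return gene_start - flank <= snp_pos <= gene_end + flank


if __name__ == "__main__":
    # Hypothetical gene spanning positions 100,000 to 180,000.
    for pos in (94_000, 96_000, 150_000, 185_000, 185_001):
        print(pos, snp_in_gene_window(pos, 100_000, 180_000))
```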

24 Recap of Disease Level View

25 The Future Is Now: SNPs, Personalized Medicine, and Translational Research
- SNP profiling is becoming part of routine care [2]
- Increase in # of clinically annotated SNP chips -> increase in # of disease association studies using SNPs
- Increase in NIH focus on “translational research” that bridges routine care delivery with research efforts
- Genome-Wide Association Studies (GWAS) that actually get funded
[2] Kohane IS, Mandl KD, Taylor PL, Holm IA, Nigrin DJ, Kunkel LM. “Medicine. Reestablishing the researcher-patient compact.” Science. 2007 Nov 16;318(5853):1068.

26 F-SNP Summary
- Incidental Findings and Disconnected Patient Cohorts
  - Central dogma of biology is DNA -> RNA -> Protein, yet we lack a cohort that spans all measurements
  - Using a limited sample size will inevitably lead to incidental outcomes
- Disease Association Studies Using SNPs
  - Don’t wait for DNA sequencing to become widespread
  - SNPs are becoming an abundant resource and are not going to disappear
- How SNPs cause disease
  - Protein Coding
  - Splicing Regulation
  - Transcription Regulation
  - Post Translation
- Computationally predict the effect of SNPs within introns, exons, and regulatory regions
  - Multitude of existing SNP analysis tools and resources
  - F-SNP provides a single web-based resource to mine SNP disease associations
  - Query and analysis by SNP, gene, disease
- The role of SNPs in Personalized Medicine and Translational Research

