
Presentation transcript:

1 of 25 Sequence Variation in Ensembl

2 of 25 Outline
- SNPs
- SNPs in Ensembl
- Linkage disequilibrium
- SNPs in BioMart
- DAS sources

3 of 25 Single nucleotide polymorphisms (SNPs)
- Two human genomes differ by ~0.1%
- Polymorphism: a DNA variation in which each possible sequence is present in at least 1% of people
- Most polymorphisms (~90%) take the form of SNPs: variations that involve just one nucleotide
  - ~1 out of every 300 bases in the human genome
  - ~10 million in the human genome
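The two SNP figures on this slide are consistent with each other, as a quick back-of-the-envelope check shows. The sketch below assumes a haploid human genome of roughly 3 billion bases; that size is not stated on the slide, and the function names are purely illustrative.

```python
# Assumption (not from the slide): approximate haploid human genome size.
GENOME_SIZE_BP = 3_000_000_000


def expected_snp_count(genome_size_bp: int, bases_per_snp: int) -> int:
    """Number of SNPs if one occurs every `bases_per_snp` bases."""
    return genome_size_bp // bases_per_snp


def pairwise_differences(genome_size_bp: int, fraction_different: float) -> int:
    """Number of differing bases between two genomes that differ by a given fraction."""
    return round(genome_size_bp * fraction_different)


snps = expected_snp_count(GENOME_SIZE_BP, 300)          # ~1 SNP per 300 bases
diffs = pairwise_differences(GENOME_SIZE_BP, 0.001)     # two genomes differ by ~0.1%
print(f"{snps:,} SNPs in the genome; {diffs:,} differences between two genomes")
```

Under that assumption, one SNP per 300 bases gives about 10 million SNPs, and a 0.1% difference gives about 3 million differing bases between any two genomes.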

4 of 25 Functional Consequences
- SNPs in coding areas that alter the aa sequence: cause of most monogenic disorders, e.g. hemochromatosis (HFE), cystic fibrosis (CFTR), hemophilia (F8)
- SNPs in coding areas that don't alter the aa sequence: may affect splicing
- SNPs in promoter or regulatory regions: may affect the level, location or timing of gene expression
- SNPs in other regions: no direct known impact on phenotype, useful as markers

5 of 25 Practical Applications
- Disease diagnosis
- Association studies
- Pharmacogenomics
- Forensic testing
- Population genetics and evolutionary studies
- Marker-assisted selection

6 of 25 Practical Applications

7 of 25 SNPs in Ensembl
- Most SNPs imported from dbSNP (rs……):
  - Imported data: alleles, flanking sequences, frequencies, ...
  - Calculated data: position, synonymous status, peptide shift, ...
- For human also:
  - HGVbase
  - TSC
  - Affy GeneChip 100K and 500K Mapping Array
  - Affy Genome-Wide SNP array 6.0
  - Ensembl-called SNPs (from Celera reads and Jim Watson's and Craig Venter's genomes)
- For mouse, rat, dog and chicken also:
  - Sanger- and Ensembl-called SNPs (other strains / breeds)

8 of 25 dbSNP
Central repository for simple genetic polymorphisms:
- single-base nucleotide substitutions
- small-scale multi-base deletions or insertions
- retroposable element insertions and microsatellite repeat variations
For human (dbSNP build 128):
- 34,434,159 submissions (ss#'s)
- 11,883,685 RefSNP clusters (rs#'s)
- 6,262,709 validated
- 737,679 with frequency

9 of 25 SNPs in Ensembl - Types
- Non-synonymous: in coding sequence, resulting in an aa change
- Synonymous: in coding sequence, not resulting in an aa change
- Frameshift: in coding sequence, resulting in a frameshift
- Stop lost: in coding sequence, resulting in the loss of a stop codon
- Stop gained: in coding sequence, resulting in the gain of a stop codon
- Essential splice site: in the first 2 or the last 2 basepairs of an intron
- Splice site: 1-3 bps into an exon or 3-8 bps into an intron
- Upstream: within 5 kb upstream of the 5'-end of a transcript
- Regulatory region: in regulatory region annotated by Ensembl
- 5' UTR: in 5' UTR
- Intronic: in intron
- 3' UTR: in 3' UTR
- Downstream: within 5 kb downstream of the 3'-end of a transcript
- Intergenic: more than 5 kb away from a transcript
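The position-based categories in this table can be illustrated with a small classifier. This is only a sketch under simplifying assumptions, not Ensembl's implementation: it handles a single transcript on the + strand, takes exons as 1-based inclusive `(start, end)` pairs, and does not separate coding sequence from UTRs (so the coding consequences and UTR types collapse into "exonic"). The function name `classify_position` and the example coordinates are hypothetical.

```python
from typing import List, Tuple


def classify_position(pos: int, exons: List[Tuple[int, int]], flank: int = 5000) -> str:
    """Classify a SNP position relative to one + strand transcript (illustrative only)."""
    exons = sorted(exons)
    tx_start, tx_end = exons[0][0], exons[-1][1]

    # Outside the transcript: upstream/downstream within 5 kb, otherwise intergenic.
    if pos < tx_start:
        return "upstream" if tx_start - pos <= flank else "intergenic"
    if pos > tx_end:
        return "downstream" if pos - tx_end <= flank else "intergenic"

    # Inside an exon: within 1-3 bp of an exon/intron junction counts as splice site.
    for i, (start, end) in enumerate(exons):
        if start <= pos <= end:
            near_acceptor = i > 0 and pos - start < 3
            near_donor = i < len(exons) - 1 and end - pos < 3
            return "splice site" if near_acceptor or near_donor else "exonic"

    # Inside an intron: distance 1 means the first or last base of the intron.
    for (_, prev_end), (next_start, _) in zip(exons, exons[1:]):
        if prev_end < pos < next_start:
            dist = min(pos - prev_end, next_start - pos)
            if dist <= 2:
                return "essential splice site"
            if dist <= 8:
                return "splice site"
            return "intronic"
    raise ValueError("position could not be classified")


# Hypothetical two-exon transcript.
example_exons = [(10000, 10100), (12000, 12100)]
for p in (4000, 9000, 10050, 10099, 10101, 10105, 11000, 13000):
    print(p, classify_position(p, example_exons))
```

A reverse-strand transcript would need upstream and downstream swapped, and distinguishing synonymous from non-synonymous changes requires the coding sequence and reading frame, which this sketch leaves out.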

10 of 25 SNPs in Ensembl - Species
Human, Chimp, Mouse, Rat, Dog, Cow, Platypus, Chicken, Zebrafish, Tetraodon, Mosquito

11 of 25 Caveat
For human, mouse and rat, Ensembl defines all SNP alleles relative to the + strand of the genome assembly! (to be able to merge dbSNP data with Sanger resequencing data)
Exceptions: those cases where SNPs are shown as part of a sequence
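In practice this caveat means that alleles reported on the reverse strand (for example, relative to a transcript on the - strand) must be complemented before they can be compared with + strand alleles. A minimal sketch follows; the helper name `to_plus_strand` and the `allele1/allele2` string format are assumptions made for illustration, not an Ensembl API.

```python
# Complement table for DNA bases; other characters (e.g. "-" for a deletion) pass through.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def to_plus_strand(alleles: str, strand: int) -> str:
    """Express an allele string such as 'C/T' relative to the + strand.

    `strand` is 1 if the alleles are already given on the + strand and -1 if
    they are given on the reverse strand.
    """
    if strand == 1:
        return alleles
    if strand == -1:
        # Reverse-complement each allele; for single bases this is just the complement.
        return "/".join(a.translate(_COMPLEMENT)[::-1] for a in alleles.split("/"))
    raise ValueError("strand must be 1 or -1")


print(to_plus_strand("C/T", -1))   # alleles described on the reverse strand
print(to_plus_strand("C/T", 1))    # already on the + strand, unchanged
```

A substitution described as C/T on the reverse strand therefore appears as G/A on the + strand, which is why allele letters in a publication can differ from those shown in the browser.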

12 of 25 5 MINUTE EXERCISE
A missense SNP, C1858T, in PTPN22 (Tyrosine-protein phosphatase non-receptor type 22) has been identified as a genetic risk factor for rheumatoid arthritis. This SNP is also referred to as R620W.
1. Find the SNPView page for this SNP.
2. Why are the alleles on this page given as A/G?
3. What is the minor allele of this SNP in Caucasians?

13 of 25 SNPs in Ensembl GeneSNPView (1)
[Screenshot; labelled regions: SNP alleles, transcript, InterPro domains]

14 of 25 SNPs in Ensembl GeneSNPView (2)

15 of 25 SNPs in Ensembl TranscriptSNPView (1)
Shows SNP alleles in different:
- Individuals (human): Celera HuAA, HuCC, HuDD and HuFF, Craig Venter, Jim Watson
- Strains (mouse, rat)
- Breeds (chicken, dog)

16 of 25 SNPs in Ensembl TranscriptSNPView (2)
[Screenshot; labelled regions: resequencing coverage, alleles in different individuals, SNP alleles, different individuals]

17 of 25 SNPs in Ensembl TranscriptSNPView (3)

18 of 25 5 MINUTE EXERCISE
1. Find the TranscriptSNPView page for human PTPN22.
2. Do all individuals (HuAA, HuCC, HuDD, HuFF, Venter and Watson) have resequencing coverage at the position of the C1858T (R620W) SNP?
3. Does any of the individuals have a higher risk of developing rheumatoid arthritis based on their genotype at this position?
4. Is there an individual that is heterozygous at this position?

19 of 25 Haplotypes and Linkage Disequilibrium
- A haplotype is a set of SNPs on a single chromatid that are statistically associated
- Linkage disequilibrium describes a situation in which some combinations of SNP alleles occur more or less frequently in a population than would be expected from a random formation of haplotypes from alleles based on their frequencies

20 of 25 Measures of LD
D = P(AB) - P(A)P(B)
- D ranges from -0.25 to 0.25
- D = 0 indicates linkage equilibrium
- dependent on allele frequencies, therefore of little use
D' = D / maximum possible value
- D' = 1 indicates perfect LD
- estimates of D' strongly inflated in small samples
r^2 = D^2 / (P(A)P(B)P(a)P(b))
- r^2 = 1 indicates perfect LD
- measure of choice
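The three measures can be computed directly from one haplotype frequency and two allele frequencies. The sketch below follows the formulas on this slide; the slide does not spell out the "maximum possible value" for D', so the code assumes the conventional normalisation (the largest |D| attainable given the allele frequencies and the sign of D) and reports D' as an absolute value. The function name `ld_measures` is illustrative.

```python
from typing import Tuple


def ld_measures(p_ab: float, p_a: float, p_b: float) -> Tuple[float, float, float]:
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b

    # Assumed normalisation: largest |D| possible for these allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2 = D^2 / (P(A) P(a) P(B) P(b)), with P(a) = 1 - P(A) and P(b) = 1 - P(B).
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


print(ld_measures(0.50, 0.5, 0.5))  # perfect LD with equal allele frequencies
print(ld_measures(0.25, 0.5, 0.5))  # linkage equilibrium
print(ld_measures(0.10, 0.5, 0.1))  # B only ever occurs with A, but frequencies differ
```

The third call shows why D' and r^2 are not interchangeable: when allele B is only ever found together with A, D' reaches its maximum, while r^2 stays well below 1 because the allele frequencies at the two loci differ.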

21 of 25 Linkage Disequilibrium LDView
It is also possible to export SNP information for upload into the HaploView software tool

22 of 25 Linkage Disequilibrium LDTableView

23 of 25 5 MINUTE EXERCISE
Retrieve all non-synonymous SNPs for the human CFTR gene using BioMart and export their id, genomic position, alleles and peptide shift (hint: which dataset should you start with?).

24 of 25 DAS Sources
For human, data from the following DAS sources can be visualised on ContigView:
- DGV and DGV loci: structural variations from the Database of Genomic Variants (CNVs, InDels, inversions etc.)
- RedonCNV regions and RedonCNV loci: copy number variations from the Redon et al. paper
- SegDup Washu: segmental duplications, University of Washington

25 of 25 Q & A
Questions and answers