
1 1 of 32 Sequence Variation in Ensembl

2 Outline
- SNPs
- SNPs in Ensembl
- Haplotypes & Linkage Disequilibrium
- SNPs in BioMart
- HapMap project
- Strain-specific SNPs

3 Single nucleotide polymorphisms (SNPs)
- Two human genomes differ by ~0.1%
- Polymorphism: a DNA variation in which each possible sequence is present in at least 1% of people
- Most polymorphisms (~90%) take the form of SNPs: variations that involve just one nucleotide
  - ~1 out of every 300 bases in the human genome
  - ~10 million in the human genome
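The two figures on this slide are consistent with each other, which a quick back-of-the-envelope check shows. The sketch below assumes a round haploid genome size of 3 billion bases; that figure and the function names are illustrative assumptions, not values taken from the slides.

```python
# Assumed round figure for the haploid human genome size (not from the slides).
GENOME_SIZE_BP = 3_000_000_000


def expected_snp_count(genome_size_bp, bases_per_snp):
    """Number of SNPs if one occurs, on average, every `bases_per_snp` bases."""
    return genome_size_bp // bases_per_snp


def pairwise_differences(genome_size_bp, fraction_different):
    """Number of bases that differ between two genomes."""
    return round(genome_size_bp * fraction_different)


# ~1 SNP per 300 bases across the population
snps = expected_snp_count(GENOME_SIZE_BP, 300)
# ~0.1% difference between any two individual genomes
diffs = pairwise_differences(GENOME_SIZE_BP, 0.001)

print(f"SNPs in the population: ~{snps:,}")
print(f"Differences between two genomes: ~{diffs:,}")
```

Under that assumption, one SNP per 300 bases gives roughly 10 million SNPs in the population, while any two genomes differ at roughly 3 million positions.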

4 Functional Consequences
Type: Consequence
- SNPs in coding areas that alter the aa sequence: cause of most monogenic disorders, e.g. hemochromatosis (HFE), cystic fibrosis (CFTR), hemophilia (F8)
- SNPs in coding areas that don’t alter the aa sequence: may affect splicing
- SNPs in promoter or regulatory regions: may affect the level, location or timing of gene expression
- SNPs in other regions: no direct known impact on phenotype; useful as markers

5 Practical Applications
- Disease diagnosis
- Association studies
- Pharmacogenomics
- Forensic testing
- Population genetics and evolutionary studies
- Marker-assisted selection

6 Practical Applications

7 SNPs in Ensembl
- Most SNPs imported from dbSNP (rs……):
  - Imported data: alleles, flanking sequences, frequencies, ….
  - Calculated data: position, synonymous status, peptide shift, ….
- For human also:
  - HGVbase
  - TSC
  - Affy GeneChip 100K and 500K Mapping Array
  - Ensembl-called SNPs (from Celera reads)
- For mouse and rat also:
  - Sanger- and Ensembl-called SNPs (other strains)

8 dbSNP
- Central repository for both SNPs and short deletion and insertion polymorphisms
- http://www.ncbi.nlm.nih.gov/SNP/index.html
- For human (dbSNP build 127):
  - 31,035,607 submissions (ss#’s)
  - 11,811,594 RefSNP clusters (rs#’s)
  - 5,689,286 validated
  - 5,559,898 with genotype
  - 710,090 with frequency

9 SNPs in Ensembl - Types
- Non-synonymous: in coding sequence, resulting in an aa change
- Synonymous: in coding sequence, not resulting in an aa change
- Frameshift: in coding sequence, resulting in a frameshift
- Stop lost: in coding sequence, resulting in the loss of a stop codon
- Stop gained: in coding sequence, resulting in the gain of a stop codon
- Essential splice site: in the first 2 or the last 2 basepairs of an intron
- Splice site: 1-3 bps into an exon or 3-8 bps into an intron
- Upstream: within 5 kb upstream of the 5'-end of a transcript
- Regulatory region: in regulatory region annotated by Ensembl
- 5' UTR: in 5' UTR
- Intronic: in intron
- 3' UTR: in 3' UTR
- Downstream: within 5 kb downstream of the 3'-end of a transcript
- Intergenic: more than 5 kb away from a transcript
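The position-based rules in this list (the 5 kb flanks and the 2 bp, 3 bp and 8 bp splice windows) can be sketched as a small classifier. This is a simplified illustration, not Ensembl's implementation: the function name, the 1-based inclusive exon coordinates and the single "exonic" label are assumptions, and telling coding from UTR or synonymous from non-synonymous would additionally need CDS coordinates and the sequence.

```python
def classify_snp(pos, exons, strand=1, flank=5000):
    """Classify a SNP by position relative to one transcript.

    pos    -- 1-based genomic coordinate of the SNP
    exons  -- list of (start, end) tuples, 1-based inclusive, non-overlapping
    strand -- 1 for the + strand, -1 for the - strand
    flank  -- size of the upstream/downstream window (5 kb on the slide)
    """
    exons = sorted(exons)
    tx_start, tx_end = exons[0][0], exons[-1][1]

    # Outside the transcript: upstream, downstream or intergenic.
    if pos < tx_start or pos > tx_end:
        distance = tx_start - pos if pos < tx_start else pos - tx_end
        if distance > flank:
            return "intergenic"
        on_low_side = pos < tx_start
        # Lower coordinates are 5' on the + strand and 3' on the - strand.
        return "upstream" if on_low_side == (strand == 1) else "downstream"

    # Inside an exon: 1-3 bp from an exon/intron boundary is a splice site.
    for i, (start, end) in enumerate(exons):
        if start <= pos <= end:
            depths = []
            if i > 0:                      # left edge borders an intron
                depths.append(pos - start + 1)
            if i < len(exons) - 1:         # right edge borders an intron
                depths.append(end - pos + 1)
            if depths and min(depths) <= 3:
                return "splice site"
            return "exonic"

    # Inside an intron: depth is counted from the nearest exon boundary.
    for (_, left_end), (right_start, _) in zip(exons, exons[1:]):
        if left_end < pos < right_start:
            depth = min(pos - left_end, right_start - pos)
            if depth <= 2:
                return "essential splice site"
            if depth <= 8:
                return "splice site"
            return "intronic"

    raise ValueError("exons must be non-overlapping (start, end) intervals")


# Hypothetical two-exon transcript used only for illustration.
example_exons = [(1000, 1200), (2000, 2300)]
for position in (500, 1201, 1205, 1500, 2150, 8000):
    print(position, classify_snp(position, example_exons))
```

Passing `strand=-1` swaps the upstream and downstream labels, since the 5' end of a reverse-strand transcript sits at the higher coordinate.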

10 SNPs in Ensembl - Species
- Human
- Chimp
- Mouse
- Rat
- Dog
- Cow
- Platypus
- Chicken
- Zebrafish
- Tetraodon
- Mosquito

11 SNPs in Ensembl
- MapView: SNP density on chromosome

12 SNPs in Ensembl
- ContigView: SNPs in genomic context

13 SNPs in Ensembl
- GeneSeqView: SNPs in genomic sequence

14 SNPs in Ensembl
- TransView & ProtView: SNPs in transcript/protein

15 SNPs in Ensembl
- What SNPs does my gene contain? > GeneSNPView

16 SNPs in Ensembl
- Info about one specific SNP? > SNPView:
  - SNP Report
  - Genotype and allele frequencies per population
  - Located in transcripts
  - SNP Context
  - Individual genotypes

17 Caveat
- For human, mouse and rat, Ensembl defines all SNP alleles relative to the + strand of the genome assembly (to be able to merge dbSNP data with Sanger resequencing data)
- Exceptions: TransView, ProtView and GeneSeqView show alleles as they appear in the transcript, the protein, or the strand from which the transcript is transcribed, respectively
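In practice this caveat means that alleles reported on the reverse strand must be complemented before they can be compared with Ensembl's + strand alleles. A minimal sketch, assuming alleles arrive as a slash-separated string such as "A/G" and that strand is encoded as 1 or -1 (both conventions, and the function names, are assumptions made for illustration):

```python
# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def alleles_on_plus_strand(allele_string, strand):
    """Express a slash-separated allele string relative to the + strand.

    Alleles already on the + strand (strand == 1) are returned unchanged.
    Alleles on the - strand (strand == -1) are reverse-complemented one by
    one; the deletion symbol "-" is left as it is.
    """
    if strand == 1:
        return allele_string
    converted = [
        allele if allele == "-" else reverse_complement(allele)
        for allele in allele_string.split("/")
    ]
    return "/".join(converted)


print(alleles_on_plus_strand("A/G", -1))
print(alleles_on_plus_strand("A/G", 1))
```

Single-base alleles are simply complemented; multi-base alleles (short insertions or deletions) also need reversing, which is why the helper reverse-complements rather than only complementing.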

18 Haplotypes and Linkage Disequilibrium
- A haplotype is a set of SNPs on a single chromatid that are statistically associated
- Linkage disequilibrium describes a situation in which some combinations of SNP alleles occur more or less frequently in a population than would be expected from a random formation of haplotypes from alleles based on their frequencies

19 Measures of LD
- D = P(AB) - P(A)P(B)
  - D ranges from -0.25 to +0.25
  - D = 0 indicates linkage equilibrium
  - dependent on allele frequencies, therefore of little use
- D' = D / maximum possible value
  - D' = 1 indicates perfect LD
  - estimates of D' strongly inflated in small samples
- r^2 = D^2 / (P(A)P(B)P(a)P(b))
  - r^2 = 1 indicates perfect LD
  - measure of choice
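The three measures can be computed directly from the four haplotype frequencies of a pair of biallelic SNPs. The sketch below is a minimal illustration; the function name and argument layout are assumptions, and the "maximum possible value" of D is taken to be the usual bound given the allele frequencies, which depends on the sign of D.

```python
def ld_measures(p_AB, p_Ab, p_aB, p_ab):
    """Return (D, D', r^2) for two biallelic SNPs.

    The arguments are the population frequencies of the four haplotypes
    AB, Ab, aB and ab, and must sum to 1.
    """
    total = p_AB + p_Ab + p_aB + p_ab
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")

    # Allele frequencies follow from the haplotype frequencies.
    p_A = p_AB + p_Ab
    p_B = p_AB + p_aB
    p_a = 1.0 - p_A
    p_b = 1.0 - p_B

    # D = P(AB) - P(A)P(B)
    d = p_AB - p_A * p_B

    # Largest |D| compatible with these allele frequencies.
    if d > 0:
        d_max = min(p_A * p_b, p_a * p_B)
    else:
        d_max = min(p_A * p_B, p_a * p_b)
    d_prime = d / d_max if d_max > 0 else 0.0

    # r^2 = D^2 / (P(A)P(a)P(B)P(b))
    denominator = p_A * p_a * p_B * p_b
    r_squared = d * d / denominator if denominator > 0 else 0.0

    return d, d_prime, r_squared


# Only two haplotypes present: perfect LD by both D' and r^2.
print(ld_measures(0.5, 0.0, 0.0, 0.5))
# All four haplotypes equally frequent: linkage equilibrium.
print(ld_measures(0.25, 0.25, 0.25, 0.25))
# Three haplotypes present: D' reaches 1 but r^2 does not.
print(ld_measures(0.6, 0.0, 0.2, 0.2))
```

The last call illustrates why the two measures are not interchangeable: with one haplotype missing, D' comes out at 1 while r^2 is about 0.375, because r^2 only reaches 1 when the two SNPs also have matching allele frequencies. Note that this sketch keeps the sign of D', whereas many tools report its absolute value.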

20 Linkage Disequilibrium
- LDView
- It is also possible to export SNP information for upload into the HaploView software tool

21 Linkage Disequilibrium
- LDTableView

22 SNPs in BioMart
- SNP datasets

23 SNPs in BioMart
- FILTER
- OUTPUT
- Ensembl gene datasets

24 SNPs in BioMart
- Start with a Genes dataset to retrieve SNPs associated with a particular gene
- Start with a SNPs dataset to retrieve SNPs located in a certain region
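The region-based route can also be scripted, since BioMart accepts queries written as a small XML document. The sketch below only builds such a document and does not contact any server. The dataset name (hsapiens_snp), the filter names (chr_name, start, end), the attribute names (refsnp_id, chrom_start, allele) and the example region are assumptions; they vary between Ensembl releases and should be checked against the BioMart web interface before use.

```python
import xml.etree.ElementTree as ET


def build_biomart_query(dataset, filters, attributes):
    """Build a BioMart XML query string.

    dataset    -- name of the BioMart dataset to start from
    filters    -- dict mapping filter names to values (the FILTER step)
    attributes -- list of attribute names to return (the OUTPUT step)
    """
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "0",
        "uniqueRows": "1",
        "datasetConfigVersion": "0.6",
    })
    dataset_node = ET.SubElement(
        query, "Dataset", {"name": dataset, "interface": "default"}
    )
    for name, value in filters.items():
        ET.SubElement(dataset_node, "Filter", {"name": name, "value": str(value)})
    for name in attributes:
        ET.SubElement(dataset_node, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


# Start with a SNPs dataset to retrieve SNPs located in a certain region.
# All names and coordinates below are illustrative assumptions.
region_query = build_biomart_query(
    dataset="hsapiens_snp",
    filters={"chr_name": "6", "start": 26000000, "end": 26100000},
    attributes=["refsnp_id", "chrom_start", "allele"],
)
print(region_query)
```

The gene-based route has the same shape: the dataset would be a gene dataset, the filter a gene identifier, and the attributes the SNP fields that the gene dataset exposes.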

25 HapMap
- A multi-country effort to identify and catalog genetic similarities and differences in human beings
- Collaboration among scientists and funding agencies from Japan, the United Kingdom, Canada, China, Nigeria, and the United States
- All of the information generated by the project is released into the public domain
- http://www.hapmap.org/

26 HapMap
- Samples from populations with African, Asian and European ancestry
- 270 DNA samples from 4 populations:
  - 30 trios (two parents and an adult child) from the Yoruba people of Ibadan, Nigeria
  - 45 unrelated Japanese from the Tokyo area
  - 45 unrelated Han Chinese from Beijing
  - 30 trios from Utah with Northern and Western European ancestry (CEPH)

27 HapMap

28 HapMart

29 Strain-specific SNPs
- Mice and rats for experimental research are selected from inbred strains in order to allow reproducibility
- C57BL/6J and BN/SsNHsd/MCW (BN) are the strains selected for the mouse and rat sequencing projects, respectively

30 Strain-specific SNPs
- TranscriptSNPView
- Now also available for dog breeds and human individuals (Celera)

31 Strain-specific SNPs

32 Q & A
- Questions
- Answers

