SNP Detection Congtam Pham 2/24/04 Dr. Marth’s Class.


1 SNP Detection Congtam Pham 2/24/04 Dr. Marth’s Class

2 Reduced Representation Sequencing
[Figure: size markers at 600, 500, and 400; fragments are cloned and analyzed]

3 Random Shotgun Reads Aligned to Whole Genome
[Figure: reads aligned to the entire genome]

4 Overlapping BAC Clones

5 ESTs may contain multiple exons
[Figure: ESTs aligned to the whole genome]
ESTs may contain multiple exons
may be alternative splice variants of a single gene

6 Results
1.42 million SNPs throughout the human genome
average density = 1 SNP / 1.3 kb
Validations:
Random samples of SNPs evaluated in independent population samples to estimate allele frequency
    95% - polymorphic
    4% - non-polymorphic (false positives)
    1% - uniformly heterozygous
Random samples of SNPs studied in different ethnic groups
    82% - polymorphic in at least one ethnic group (>10% allele frequency)
    77% - polymorphic in at least one ethnic group (>20% allele frequency)

7 Heterozygosity (π) of Chromosomes
[Table: π (×10⁻⁴) by chromosome, with rows for Y, X, and remaining; avg = 7.65]
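To make the quantity on this slide concrete: nucleotide diversity (π) is commonly defined as the average number of differences per site between pairs of sequences in a sample. The following is a minimal Python sketch under that definition, assuming gap-free aligned sequences of equal length. The function name `nucleotide_diversity` and the toy sequences are hypothetical and do not come from the presentation.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site across aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")

    # Compare every unordered pair of sequences site by site.
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


# Toy data (made up): four 10-bp sequences, one carrying a single variant.
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC", "ACGTACGTAC"]
pi = nucleotide_diversity(toy)
print(f"pi = {pi:.4f}")
```

On this toy sample, 3 of the 6 sequence pairs differ at one of 10 sites. Real estimates would come from far longer alignments and would need to handle gaps and missing calls, which this sketch ignores.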

8 Nucleotide Diversity
[Figure: GC content and heterozygosity]
HLA locus on chromosome 6 is highly heterozygous

9 SNPs in the public domain: how useful are they?
Allele frequencies of SNPs found in dbSNP

                                         TSC SNPs    Overlap SNPs
# STSs that failed PCR and sequencing    5.3%        14.8%
Total characterized
SNPs not detected (a)                    17.3%       16.8%
Uncommon SNPs (b)                        6.0%        7.1%
Common SNPs in ≥1 population (c)         76.7%       76.1%
Common SNPs in ≥2 populations (c)        52.4%       54.3%
Common SNPs in ≥3 populations (c)        27.0%       26.9%

(a) monomorphic (only one of the 2 predicted alleles found in all 3 populations)
(b) minor allele frequency <20% in all 3 populations
(c) a SNP is “common” when minor allele frequency is ≥20%
Marth et al., 2001
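The categories in the table footnotes can be expressed as a small classification rule. This is a minimal Python sketch, assuming each SNP comes with one minor allele frequency per population sample. The function name `classify_snp`, the constant `COMMON_MAF`, and the example frequencies are hypothetical illustrations, not values from the study.

```python
COMMON_MAF = 0.20  # footnote (c): "common" means minor allele frequency >= 20%


def classify_snp(mafs, threshold=COMMON_MAF):
    """Classify a SNP from its minor allele frequencies, one per population.

    Returns (category, number of populations in which the SNP is common).
    """
    if not mafs:
        raise ValueError("need at least one population")
    if all(f == 0 for f in mafs):
        # Footnote (a): monomorphic in every population sampled.
        return "not detected", 0
    n_common = sum(f >= threshold for f in mafs)
    if n_common == 0:
        # Footnote (b): polymorphic, but below the threshold everywhere.
        return "uncommon", 0
    return "common", n_common


# Made-up frequencies for three populations.
print(classify_snp([0.00, 0.00, 0.00]))
print(classify_snp([0.05, 0.10, 0.00]))
print(classify_snp([0.25, 0.10, 0.30]))
```

A SNP counted under "common in ≥2 populations" in the table corresponds to a second return value of 2 or more here.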

10 Conclusions
For researchers interested in using the publicly available candidate SNPs:
    66-70% chance that SNPs have <20% minor allele frequency
    50% chance that SNPs have ≥20% minor allele frequency
Major concern: candidate SNPs may not be true polymorphisms but rather duplicated regions of the genome with near-identical sequences

11 Quality and Completeness of SNP Databases
[Table: # SNPs identified and % validated by resequencing, for all SNPs and for double-hit SNPs]
12% non-validation rate may be due to:
    false-positive SNPs (errors in construction of SNP databases)
    rare variants
    false-negative SNPs (SNPs missed in resequencing)
    population-specific SNPs
Different rates of non-validation reported in other surveys may reflect true SNPs that were missed or inherent differences between the different sets of genes
Double hits are SNPs for which both alleles have been seen more than once; they are therefore more likely to be common and better suited to the construction of haplotype maps
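The double-hit criterion above amounts to a simple count over observed alleles. This is a minimal Python sketch, assuming a biallelic site and a list of allele calls from independent observations. The function name `is_double_hit` and the example calls are hypothetical.

```python
from collections import Counter


def is_double_hit(observed_alleles):
    """True if exactly two alleles were observed and each was seen more than once."""
    counts = Counter(observed_alleles)
    if len(counts) != 2:
        # Monomorphic (one allele) or not biallelic: not a double hit here.
        return False
    return all(n >= 2 for n in counts.values())


# Made-up allele calls at a single site.
print(is_double_hit(["A", "A", "G", "G"]))  # both alleles seen twice
print(is_double_hit(["A", "A", "A", "G"]))  # minor allele seen only once
```

Requiring a second observation of each allele filters out variants supported by a single read, which is where sequencing errors and very rare alleles tend to concentrate.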

