General methods of SNP discovery: PolyBayes. Gabor T. Marth, Department of Biology, Boston College, Chestnut Hill, MA 02467.


1 General methods of SNP discovery: PolyBayes
Gabor T. Marth, Department of Biology, Boston College, Chestnut Hill, MA 02467

2 General methods of SNP mining – PolyBayes
Two innovative ideas:
1. Utilize the genome reference sequence as a template to organize other sequence fragments from arbitrary sources.
2. Use sequence quality information (base quality values) to distinguish true mismatches from sequencing errors.
[Figure labels: sequencing error; true polymorphism]
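The second idea relies on turning a base quality value into an error probability. A minimal sketch, assuming the quality values are Phred-scaled (the slide only says "base quality values"); the function name `phred_to_error_prob` is hypothetical:

```python
def phred_to_error_prob(q: float) -> float:
    """Probability that a base call is wrong, given its Phred-scaled quality.

    Assumption: Q = -10 * log10(P_error), the usual Phred convention.
    """
    return 10 ** (-q / 10)


if __name__ == "__main__":
    # A mismatch at a low-quality base is far more likely to be a
    # sequencing error than a mismatch at a high-quality base.
    for q in (10, 20, 30, 40):
        print(q, phred_to_error_prob(q))
```

Under this convention a quality of 20 corresponds to a 1 in 100 chance of a miscall, and a quality of 40 to 1 in 10,000.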

3 Computational SNP mining – PolyBayes
Sequence clustering: simplifies to a database search with the genome reference.
Paralog filtering: by counting mismatches weighted by quality values.
Multiple alignment: by anchoring fragments to the genome reference.
SNP detection: by differentiating true polymorphism from sequencing error using quality values.

4 Sequence clustering
Clustering simplifies to a search against a sequence database to recruit relevant sequences.
Clusters = groups of overlapping sequence fragments matching the genome reference.
[Figure labels: genome reference; fragments; cluster 1, cluster 2, cluster 3]
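Once the database search has placed each fragment on the genome reference, forming clusters amounts to grouping hits whose reference intervals overlap. A minimal sketch of that grouping step, assuming the search results are already available as (fragment id, start, end) tuples with an exclusive end coordinate; the `Hit` type and `cluster_hits` function are hypothetical names, not part of PolyBayes:

```python
from typing import List, Tuple

# (fragment id, start on reference, end on reference); end is exclusive.
Hit = Tuple[str, int, int]


def cluster_hits(hits: List[Hit]) -> List[List[str]]:
    """Group fragments whose reference intervals overlap into clusters."""
    clusters: List[List[str]] = []
    current: List[str] = []
    current_end = None
    # Sweep the hits from left to right along the reference.
    for frag_id, start, end in sorted(hits, key=lambda h: (h[1], h[2])):
        if current and start < current_end:
            # Overlaps the cluster being built: extend it.
            current.append(frag_id)
            current_end = max(current_end, end)
        else:
            # No overlap: close the previous cluster and start a new one.
            if current:
                clusters.append(current)
            current = [frag_id]
            current_end = end
    if current:
        clusters.append(current)
    return clusters


if __name__ == "__main__":
    hits = [("a", 0, 100), ("b", 50, 150), ("c", 300, 400),
            ("d", 350, 450), ("e", 1000, 1100)]
    print(cluster_hits(hits))
```

No fragment-against-fragment comparison is needed here; the reference coordinates alone determine cluster membership.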

5 (Anchored) multiple alignment
The genomic reference sequence serves as an anchor:
- fragments are pair-wise aligned to the genomic sequence
- insertions are propagated (“sequence padding”)
Advantages:
- efficient: only involves pair-wise comparisons
- accurate: correctly aligns alternatively spliced ESTs
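The padding step can be illustrated with a small sketch. It assumes each pair-wise alignment is given as a reference start position plus two equal-length gapped strings, with `-` in the reference string marking an insertion in the fragment. The pad character `*`, the `.` marker for uncovered positions, and all function names are illustrative choices rather than the PolyBayes implementation:

```python
def parse_alignment(ref_start, ref_aln, frag_aln):
    """Split one pair-wise alignment into insertions and aligned bases.

    ins[i]   = fragment bases inserted just before reference position i
    bases[i] = fragment character aligned to reference position i
               ('-' if the fragment has a deletion there)
    """
    ins, bases = {}, {}
    pos = ref_start
    for r, f in zip(ref_aln, frag_aln):
        if r == "-":
            ins[pos] = ins.get(pos, "") + f
        else:
            bases[pos] = f
            pos += 1
    return ins, bases


def padded_row(ins, bases, pad, ref_len):
    """Render one fragment in the padded coordinate system."""
    lo, hi = min(bases), max(bases)
    out = []
    for i in range(ref_len + 1):
        # Pad with '*' inside the fragment's span, '.' outside it.
        filler = "*" if lo < i <= hi else "."
        out.append(ins.get(i, "").ljust(pad[i], filler))
        if i < ref_len:
            out.append(bases.get(i, "."))
    return "".join(out)


def anchored_msa(reference, alignments):
    """Build a multiple alignment from pair-wise alignments to the reference."""
    parsed = [parse_alignment(*a) for a in alignments]
    # Widest insertion observed before each reference position.
    pad = [0] * (len(reference) + 1)
    for ins, _ in parsed:
        for pos, s in ins.items():
            pad[pos] = max(pad[pos], len(s))
    ref_row = "".join("*" * pad[i] + reference[i] for i in range(len(reference)))
    ref_row += "*" * pad[len(reference)]
    return [ref_row] + [padded_row(i, b, pad, len(reference)) for i, b in parsed]


if __name__ == "__main__":
    rows = anchored_msa("ACGTACGT", [
        (0, "ACG-TACGT", "ACGGTACGT"),  # 1-base insertion
        (2, "GTACG", "GT-CG"),          # 1-base deletion
        (1, "CG--TA", "CGTTTA"),        # 2-base insertion at the same site
    ])
    print("\n".join(rows))
```

Each fragment is compared only with the reference; the insertion widths collected across fragments are what bring all rows into a common coordinate system.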

6 Paralog filtering
The “paralog problem”:
- unrecognized paralogs give rise to spurious SNP predictions
- SNPs in duplicated regions may be useless for genotyping
Challenge: to differentiate between sequencing errors and paralogous differences.
[Figure labels: paralogous difference; sequencing errors]
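One way to picture the comparison is to ask whether the number of mismatches between a fragment and the reference is better explained by sequencing error plus ordinary polymorphism, or by sequencing error plus paralogous divergence. The sketch below is a simplified stand-in, not the PolyBayes formula: it assumes Phred-scaled qualities, a Poisson model for the mismatch count, and illustrative rates (`poly_rate`, `paralog_rate`) and prior (`prior_native`) that are all hypothetical parameters.

```python
import math


def poisson_log_pmf(k: int, lam: float) -> float:
    """Log of the Poisson probability of observing k events with mean lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def prob_native(qualities, n_mismatches,
                poly_rate=0.001, paralog_rate=0.02, prior_native=0.5):
    """Posterior probability that a fragment comes from the reference locus.

    Assumptions (illustrative only): mismatch counts are Poisson distributed;
    the expected number of sequencing errors is the sum of per-base error
    probabilities derived from Phred qualities.
    """
    expected_errors = sum(10 ** (-q / 10) for q in qualities)
    n = len(qualities)
    lam_native = expected_errors + n * poly_rate
    lam_paralog = expected_errors + n * paralog_rate
    log_native = math.log(prior_native) + poisson_log_pmf(n_mismatches, lam_native)
    log_paralog = math.log(1 - prior_native) + poisson_log_pmf(n_mismatches, lam_paralog)
    # Normalize in log space to avoid underflow.
    m = max(log_native, log_paralog)
    w_native = math.exp(log_native - m)
    w_paralog = math.exp(log_paralog - m)
    return w_native / (w_native + w_paralog)


if __name__ == "__main__":
    quals = [30] * 500
    print(prob_native(quals, 1))   # few mismatches: likely native
    print(prob_native(quals, 12))  # many mismatches: likely paralogous
```

Low-quality fragments raise the expected error count, so the same number of mismatches is less alarming for them than for a high-quality fragment.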

7 SNP detection
Goal: to discern true variation from sequencing error.
[Figure labels: sequencing error; polymorphism]

8 SNP discovery with PolyBayes
1. Fragment recruitment (database search)
2. Anchored alignment
3. Paralog identification
4. SNP detection
[Figure label: genome reference sequence]

9 Bayesian-statistical SNP detection
1. The algorithm
[Equation labels: probability of polymorphism; base call, base quality; a priori polymorphism rate; base composition; depth of coverage]
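The equation itself did not survive in the transcript, so the following is only a rough sketch of how the labeled quantities could combine in a Bayesian calculation for one alignment column. It is a deliberately simplified model, not the PolyBayes formula: it assumes Phred-scaled qualities, errors spread evenly over the three other bases, at most two alleles per site, and each read drawn from either allele with probability 1/2. The names `call_likelihood` and `snp_posterior` and the default `prior_poly` are hypothetical.

```python
from itertools import combinations

BASES = "ACGT"


def call_likelihood(call: str, qual: float, true_base: str) -> float:
    """P(observed call | true base), with errors spread evenly over 3 bases."""
    err = 10 ** (-qual / 10)
    return 1 - err if call == true_base else err / 3


def snp_posterior(calls, quals, prior_poly=0.001, composition=None):
    """Posterior probability that an alignment column is polymorphic.

    calls, quals : base calls and Phred qualities of the reads at this site
                   (their number is the depth of coverage)
    prior_poly   : a priori polymorphism rate (illustrative default)
    composition  : background base frequencies (uniform if not given)
    """
    composition = composition or {b: 0.25 for b in BASES}

    # Hypothesis 1: the site is monomorphic for some base b.
    mono = 0.0
    for b in BASES:
        like = 1.0
        for c, q in zip(calls, quals):
            like *= call_likelihood(c, q, b)
        mono += (1 - prior_poly) * composition[b] * like

    # Hypothesis 2: the site carries two alleles {a, b}.
    pairs = list(combinations(BASES, 2))
    poly = 0.0
    for a, b in pairs:
        like = 1.0
        for c, q in zip(calls, quals):
            like *= 0.5 * call_likelihood(c, q, a) + 0.5 * call_likelihood(c, q, b)
        poly += prior_poly / len(pairs) * like

    return poly / (mono + poly)


if __name__ == "__main__":
    print(snp_posterior("AAAAAA", [30] * 6))         # no mismatch
    print(snp_posterior("AAAGGG", [40] * 6))         # well-supported mismatch
    print(snp_posterior("AAAAAG", [40] * 5 + [10]))  # one low-quality mismatch
```

In this toy model a single mismatch carried by a low-quality base barely moves the posterior, while the same mismatch at high quality, or one supported by several reads, moves it substantially.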

