Association Studies, Haplotype Blocks and Tagging SNPs Prof. Sorin Istrail.

1 Association Studies, Haplotype Blocks and Tagging SNPs Prof. Sorin Istrail

2 Association studies. Marker A is associated with the phenotype. [Figure: two groups, Disease/Responder and Control/Non-responder; legend for Marker A: Allele 0, Allele 1.]

3 Association studies. Evaluate whether nucleotide polymorphisms associate with phenotype. [Figure: example polymorphism data, flattened in extraction: TA GA A CG GA A CG TA A TA TC G TG TA G TG GA G.]

4 Association studies. [Figure: same polymorphism data as slide 3.]

5 Hypothesis – Haplotype Blocks? The genome consists largely of blocks of common SNPs with relatively little recombination within the blocks (Patil et al., Science, 2001; Jeffreys et al., Nature Genetics, 2001; Daly et al., Nature Genetics, 2001).

6 Haplotype Block Structure: LD-Blocks and 4-Gamete Test Blocks. [Figure: a 200 kb region with tracks for sense genes, antisense genes, DNA, SNPs, and haplotype blocks; labels 1 2 3 4.]

7 One definition of block. Based on the four-gamete test. Intuition: when all four gametes are observed for a pair of SNPs, then (assuming each site mutated only once) a recombination point must lie somewhere between the two sites.

8 Four Gamete Block Test (Hudson and Kaplan, 1985). A segment of SNPs is a block if, for every pair of SNPs in it, at most 3 of the 4 possible gametes (00, 01, 10, 11) are observed. [Figure: one example labeled BLOCK and one labeled VIOLATES THE BLOCK DEFINITION.]
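The block definition on this slide can be sketched in a few lines of Python. This is a minimal illustration, assuming biallelic SNPs coded as the characters 0 and 1 and haplotypes stored as equal-length strings; the function names and the toy haplotypes are hypothetical and do not come from the slides.

```python
from itertools import combinations

def has_four_gametes(haplotypes, i, j):
    """True if sites i and j show all four allele combinations (00, 01, 10, 11)."""
    return len({(h[i], h[j]) for h in haplotypes}) == 4

def is_block(haplotypes, start, end):
    """True if every pair of sites in [start, end) shows at most 3 gametes."""
    return not any(
        has_four_gametes(haplotypes, i, j)
        for i, j in combinations(range(start, end), 2)
    )

# Hypothetical toy data: five haplotypes over three biallelic sites.
toy = ["000", "010", "111", "011", "001"]

first_two_sites_ok = is_block(toy, 0, 2)   # sites 0 and 1 show only 00, 01, 11
all_three_sites_ok = is_block(toy, 0, 3)   # sites 1 and 2 show all four gametes
```

Checking every pair is quadratic in the number of sites in the segment, which is fine for a sketch of the definition.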

9 Finding Recombination Hotspots: Many Possible Partitions into Blocks. [Figure: haplotype matrix, flattened in extraction: A C T A G A T A G C C T G T T C G A C A A C A T A C T C T A T G A T C G G T T A T A C G A C A T A C T C T A T A G T A T A C T A G C T G G C A T; pairs of sites where all four gametes are present are marked.]

10 [Figure: same haplotype matrix as slide 9.] A constraint is a pair of sites at which all four gametes are present. Find the left-most right endpoint of any constraint and mark the site before it as a recombination site. Eliminate any constraints crossing that site. Repeat until all constraints are gone. The final result is a minimum-size set of sites crossing all constraints.
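The greedy procedure on this slide can be sketched as follows. This is an illustrative version, assuming each constraint is a pair (i, j) of 0-based site indices with i < j, and that a recombination boundary b is read as lying between site b-1 and site b. The function name and the example constraints are hypothetical.

```python
def greedy_breakpoints(constraints):
    """Return boundaries b (between site b-1 and site b) crossing every constraint.

    A constraint (i, j) is crossed by boundary b when i < b <= j.
    """
    breakpoints = []
    # Process constraints by increasing right endpoint, as the slide describes.
    for i, j in sorted(constraints, key=lambda c: c[1]):
        # Earlier boundaries are all <= j, so only the most recent one
        # needs checking: it crosses (i, j) exactly when it lies right of i.
        if breakpoints and i < breakpoints[-1]:
            continue
        breakpoints.append(j)
    return breakpoints

# Hypothetical constraints: pairs of sites that show all four gametes.
example = [(0, 3), (1, 2), (2, 5), (4, 6)]
result = greedy_breakpoints(example)
```

Sorting by right endpoint replaces the repeated "find, mark, eliminate" loop with a single pass; skipped constraints are the ones the slide would have eliminated.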

11 Tagging SNPs. An example of a real data set and its haplotype block structure. Colors refer to the founding population, one color for each founding haplotype. Only 4 SNPs are needed to tag all the different haplotypes.
Full haplotype      Tag SNPs only
ACGATCGATCATGAT     A------A---TG--
GGTGATTGCATCGAT     G------G---CG--
ACGATCGGGCTTCCG     A------G---TC--
ACGATCGGCATCCCG     A------G---CC--
GGTGATTATCATGAT     G------A---TG--
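As a small check of the example on this slide, the sketch below projects the five haplotypes onto the four unmasked columns (0-based positions 0, 7, 11, 12) and confirms that the projections are still pairwise distinct. The function names are hypothetical, and the check only verifies that these columns separate the haplotypes; it does not test whether the set is minimal.

```python
haplotypes = [
    "ACGATCGATCATGAT",
    "GGTGATTGCATCGAT",
    "ACGATCGGGCTTCCG",
    "ACGATCGGCATCCCG",
    "GGTGATTATCATGAT",
]

# 0-based positions of the columns left unmasked on the slide.
tag_sites = [0, 7, 11, 12]

def project(haplotype, sites):
    """Keep only the alleles at the given sites."""
    return "".join(haplotype[s] for s in sites)

def tags_distinguish(haplotypes, sites):
    """True if distinct haplotypes stay distinct after projection onto `sites`."""
    projected = {project(h, sites) for h in haplotypes}
    return len(projected) == len(set(haplotypes))

projected = [project(h, tag_sites) for h in haplotypes]
ok = tags_distinguish(haplotypes, tag_sites)
```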

12 Informativeness. A measure of the "information" a SNP contains about another SNP. Useful for designing SNP arrays and for selecting tagging SNPs. [Figure: a SNP $s$ and two haplotypes $h_1$, $h_2$.]

13 Informativeness. [Figure: SNPs $s_1, s_2, s_3, s_4, s_5$.] $I(s_1, s_2) = 2/4 = 1/2$

14 Informativeness. [Figure: SNPs $s_1, s_2, s_3, s_4, s_5$.] $I(\{s_1, s_2\}, s_4) = 3/4$

15 Informativeness. [Figure: SNPs $s_1, s_2, s_3, s_4, s_5$.] $I(\{s_3, s_4\}, \{s_1, s_2, s_5\}) = 3$. $S = \{s_3, s_4\}$ is a Minimal Informative Subset.
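The transcript preserves the values of I but not its definition. The sketch below assumes one common pairwise reading: among the haplotype pairs that differ at a target SNP, I is the fraction that also differ at one or more of the source SNPs. That definition, the function names, and the toy haplotypes are all assumptions for illustration; the data are made up and are not the figure from the slides.

```python
from fractions import Fraction
from itertools import combinations

def separated_pairs(haplotypes, site):
    """Pairs of haplotype indices whose alleles differ at `site`."""
    return {
        (a, b)
        for a, b in combinations(range(len(haplotypes)), 2)
        if haplotypes[a][site] != haplotypes[b][site]
    }

def informativeness(haplotypes, sources, target):
    """Assumed definition: share of target-separated pairs that some source separates."""
    target_pairs = separated_pairs(haplotypes, target)
    if not target_pairs:
        raise ValueError("target site does not vary across the haplotypes")
    covered = set()
    for s in sources:
        covered |= separated_pairs(haplotypes, s)
    return Fraction(len(target_pairs & covered), len(target_pairs))

# Hypothetical haplotypes over three biallelic sites (columns 0, 1, 2).
toy = ["000", "100", "101", "011"]

single = informativeness(toy, [0], 2)     # one source SNP
both = informativeness(toy, [0, 1], 2)    # a set of two source SNPs
```

Under this reading, the set-to-set value on slide 15 would be the sum of the per-target values, which is again an assumption rather than something the transcript states.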

16 Informativeness: graph theory insight. Minimum Set Cover = Minimum Informative Subset. [Figure: graph on SNPs $s_1, \dots, s_5$ with edges $e_1, \dots, e_6$, redrawn as a bipartite graph linking SNPs to edges.]

17 Informativeness: graph theory insight. Minimum Set Cover $\{s_3, s_4\}$ = Minimum Informative Subset. [Figure: graph on SNPs $s_1, \dots, s_5$ with edges $e_1, \dots, e_6$, redrawn as a bipartite graph linking SNPs to edges.]
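The set-cover view can be illustrated with a brute-force search over subsets of SNPs. This is a sketch only: the mapping from SNPs to the edges they cover is invented for illustration and is not the graph from the slide, and exhaustive search is practical only for small inputs because minimum set cover is NP-hard in general.

```python
from itertools import combinations

def minimum_cover(covers):
    """Smallest set of keys whose edge sets together cover every edge.

    `covers` maps each SNP name to the set of edges it covers.
    """
    universe = set().union(*covers.values())
    names = sorted(covers)
    # Try subsets in order of increasing size; the first hit is minimum.
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            if set().union(*(covers[s] for s in subset)) == universe:
                return subset
    return ()

# Hypothetical coverage: which edges each SNP covers.
covers = {
    "s1": {"e1", "e2"},
    "s2": {"e2", "e3"},
    "s3": {"e1", "e2", "e3"},
    "s4": {"e4", "e5", "e6"},
    "s5": {"e5", "e6"},
}

best = minimum_cover(covers)
```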

