Association Studies, Haplotype Blocks and Tagging SNPs Prof. Sorin Istrail.

1 Association Studies, Haplotype Blocks and Tagging SNPs Prof. Sorin Istrail

2 Association studies. [Figure: individuals grouped by phenotype (Disease / Responder versus Control / Non-responder), each marked with Allele 0 or Allele 1 of Marker A.] Marker A is associated with the phenotype.

3 Association studies. Evaluate whether nucleotide polymorphisms associate with phenotype. [Figure: example sequences TA GA A, CG GA A, CG TA A, TA TC G, TG TA G, TG GA G.]

4 Association studies. [Figure: example sequences TA GA A, CG GA A, CG TA A, TA TC G, TG TA G, TG GA G.]

5 Hypothesis: Haplotype Blocks? The genome consists largely of blocks of common SNPs with relatively little recombination within the blocks (Patil et al., Science, 2001; Jeffreys et al., Nature Genetics, 2001; Daly et al., Nature Genetics, 2001).

6 Haplotype Block Structure: LD-Blocks and 4-Gamete Test Blocks. [Figure: a 200 kb DNA region showing sense genes, antisense genes, SNPs, and haplotype blocks labelled 1, 2, 3, 4.]

7 One definition of block: based on the Four Gamete test. Intuition: when all four gametes are observed for a pair of SNPs, there is a recombination point somewhere in between the two sites.

8 Four Gamete Block Test (Hudson and Kaplan, 1985). A segment of SNPs is a block if, for every pair of SNPs, at most 3 of the 4 gametes (00, 01, 10, 11) are observed. Example labelled BLOCK: 0 0 1 0 1 1 1 1 0 1 1 1. Example labelled VIOLATES THE BLOCK DEFINITION: 0 0 1 0 1 1 1 1 0 1 0 1.
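
The block definition above can be checked mechanically. The following is a minimal Python sketch, not taken from the slides: the function name `is_four_gamete_block` is hypothetical, and the two example matrices assume that each run of 12 digits on the slide is laid out as 4 haplotypes (rows) by 3 SNPs (columns), which is the layout consistent with the slide's BLOCK / VIOLATES labels.

```python
from itertools import combinations


def is_four_gamete_block(haplotypes):
    """Return True if no pair of SNP columns shows all four gametes."""
    n_sites = len(haplotypes[0])
    for i, j in combinations(range(n_sites), 2):
        # Collect the distinct (allele at i, allele at j) combinations.
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            return False
    return True


# Assumed layout: 4 haplotypes (rows) x 3 SNPs (columns).
block = [(0, 0, 1), (0, 1, 1), (1, 1, 0), (1, 1, 1)]
violation = [(0, 0, 1), (0, 1, 1), (1, 1, 0), (1, 0, 1)]

print(is_four_gamete_block(block))      # True
print(is_four_gamete_block(violation))  # False
```

In the second matrix, the first two columns show 00, 01, 11 and 10, so the segment is not a block under this definition.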

9 Finding Recombination Hotspots: Many Possible Partitions into Blocks. [Figure: haplotype alignment, read here as six rows of 12 sites: ACTAGATAGCCT, GTTCGACAACAT, ACTCTATGATCG, GTTATACGACAT, ACTCTATAGTAT, ACTAGCTGGCAT.] All four gametes are present: [site pairs marked in the figure].

10 [Figure: haplotype alignment, read here as six rows of 12 sites: ACTAGATAGCCT, GTTCGACAACAT, ACTCTATGATCG, GTTATACGACAT, ACTCTATAGTAT, ACTAGCTGGCAT.] Find the left-most right endpoint of any constraint and mark the site before it as a recombination site. Eliminate any constraints crossing that site. Repeat until all constraints are gone. The final result is a minimum-size set of sites crossing all constraints.
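
The greedy procedure on this slide can be sketched as follows. This is an illustrative Python version under stated assumptions: a constraint is encoded as a pair `(i, j)` of site indices with `i < j` (a pair of sites showing all four gametes), and a recombination site is encoded as an index `b` meaning "between site `b` and site `b + 1`", so it crosses `(i, j)` when `i <= b < j`. The function name and the example constraints are hypothetical, not data from the slides.

```python
def greedy_recombination_sites(constraints):
    """Greedy minimum set of recombination sites crossing every constraint.

    constraints: iterable of (i, j) with i < j.
    Returns a sorted list of breakpoints b (between site b and site b + 1).
    """
    breakpoints = []
    # Visiting constraints by increasing right endpoint is equivalent to
    # repeatedly taking the left-most right endpoint among those remaining.
    for i, j in sorted(constraints, key=lambda c: c[1]):
        if breakpoints and i <= breakpoints[-1] < j:
            continue  # already crossed by the most recent breakpoint
        breakpoints.append(j - 1)  # the site just before the right endpoint
    return breakpoints


# Hypothetical constraints over 7 sites (indices 0..6).
example = [(0, 3), (1, 2), (2, 5), (4, 6)]
print(greedy_recombination_sites(example))  # [1, 4]
```

Only the most recent breakpoint needs checking: earlier breakpoints lie further left, so if the latest one does not cross a constraint, none of the earlier ones can.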

11 Tagging SNPs. Haplotypes: ACGATCGATCATGAT, GGTGATTGCATCGAT, ACGATCGGGCTTCCG, ACGATCGGCATCCCG, GGTGATTATCATGAT. Tagged view: A------A---TG--, G------G---CG--, A------G---TC--, A------G---CC--, G------A---TG--. An example of a real data set and its haplotype block structure. Colors refer to the founding population, one color for each founding haplotype. Only 4 SNPs are needed to tag all the different haplotypes.
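
The tagging claim on this slide can be checked directly on the five haplotypes it shows. The sketch below is illustrative only: the helper names `project` and `mask` are hypothetical, and the tag positions (0-based indices 0, 7, 11, 12) are read off the masked strings on the slide.

```python
haplotypes = [
    "ACGATCGATCATGAT",
    "GGTGATTGCATCGAT",
    "ACGATCGGGCTTCCG",
    "ACGATCGGCATCCCG",
    "GGTGATTATCATGAT",
]
tag_sites = [0, 7, 11, 12]  # 0-based positions of the four tagging SNPs


def project(hap, sites):
    """Keep only the alleles at the tagging sites."""
    return "".join(hap[i] for i in sites)


def mask(hap, sites):
    """Show the haplotype with every non-tag site replaced by '-'."""
    keep = set(sites)
    return "".join(ch if i in keep else "-" for i, ch in enumerate(hap))


signatures = [project(h, tag_sites) for h in haplotypes]
# The tags work if distinct haplotypes keep distinct signatures.
tags_distinguish_all = len(set(signatures)) == len(set(haplotypes))

print(signatures)            # ['AATG', 'GGCG', 'AGTC', 'AGCC', 'GATG']
print(tags_distinguish_all)  # True
```

Because every haplotype has a unique signature at the four tag sites, typing those sites identifies the haplotype, and with it the alleles at the remaining sites.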

12 Informativeness. A measure of the "information" a SNP contains about another SNP. Useful for designing SNP arrays and for selecting tagging SNPs. [Figure: two haplotypes, labelled h1 and h2, with rows 01 00 1 and 01 10 0, and a SNP s marked.]

13 Informativeness. Haplotype matrix with columns labelled s1, s2, s3, s4, s5: 10 00 0; 01 00 1; 01 10 0; 10 11 1. I(s1, s2) = 2/4 = 1/2.

14 Informativeness. Haplotype matrix with columns labelled s1, s2, s3, s4, s5: 10 00 0; 01 00 1; 01 10 0; 10 11 1. I({s1, s2}, s4) = 3/4.

15 Informativeness. Haplotype matrix with columns labelled s1, s2, s3, s4, s5: 10 00 0; 01 00 1; 01 10 0; 10 11 1. I({s3, s4}, {s1, s2, s5}) = 3. S = {s3, s4} is a Minimal Informative Subset.
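
The slides give values of I but not a formula. The sketch below assumes one common pair-based definition: I(S, t) is the fraction of haplotype pairs distinguished by SNP t that are also distinguished by at least one SNP in S, and I(S, T) is the sum of I(S, t) over t in T. The matrix `H` is hypothetical: it is a 4-haplotype, 5-SNP matrix chosen so that this assumed definition reproduces the three values quoted on the slides, and it is not claimed to be the slides' own matrix.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical matrix: rows are haplotypes, columns are s1..s5 (indices 0..4).
H = [
    (0, 0, 1, 0, 0),
    (0, 0, 0, 1, 1),
    (0, 1, 0, 0, 1),
    (1, 1, 1, 1, 0),
]


def distinguished_pairs(haps, snp):
    """Pairs of haplotype indices that carry different alleles at `snp`."""
    return {(a, b) for a, b in combinations(range(len(haps)), 2)
            if haps[a][snp] != haps[b][snp]}


def informativeness(haps, S, t):
    """Assumed I(S, t); t must be polymorphic (it distinguishes some pair)."""
    target = distinguished_pairs(haps, t)
    covered = set().union(*(distinguished_pairs(haps, s) for s in S))
    return Fraction(len(target & covered), len(target))


def set_informativeness(haps, S, T):
    """Assumed I(S, T): the sum of I(S, t) over all t in T."""
    return sum(informativeness(haps, S, t) for t in T)


print(informativeness(H, [0], 1))                 # 1/2
print(informativeness(H, [0, 1], 3))              # 3/4
print(set_informativeness(H, [2, 3], [0, 1, 4]))  # 3
```

Under this reading, a value of 3 for three target SNPs means that {s3, s4} distinguishes every haplotype pair that s1, s2, or s5 distinguishes.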

16 Informativeness: graph theory insight. Minimum Set Cover = Minimum Informative Subset. [Figure: bipartite graph with SNP nodes s1, s2, s3, s4, s5 and edge nodes e1, e2, e3, e4, e5, e6, next to the haplotype matrix with columns labelled s1, s2, s3, s4, s5: 10 00 0; 01 00 1; 01 10 0; 10 11 1.]

17 Informativeness: graph theory insight. Minimum Set Cover {s3, s4} = Minimum Informative Subset. [Figure: bipartite graph with SNP nodes s1, s2, s3, s4, s5 and edge nodes e1, e2, e3, e4, e5, e6, next to the haplotype matrix with columns labelled s1, s2, s3, s4, s5: 10 00 0; 01 00 1; 01 10 0; 10 11 1.]
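
The set-cover view can be sketched in code: each haplotype pair that some SNP distinguishes plays the role of an edge e, each SNP covers the edges it distinguishes, and a minimum informative subset is a smallest set of SNPs covering every edge. The Python below is an illustrative exhaustive search, not the method from the slides; it is exponential in the number of SNPs and only suitable for toy inputs, since minimum set cover is NP-hard in general. The matrix `H` and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical matrix: rows are haplotypes, columns are s1..s5 (indices 0..4).
H = [
    (0, 0, 1, 0, 0),
    (0, 0, 0, 1, 1),
    (0, 1, 0, 0, 1),
    (1, 1, 1, 1, 0),
]


def distinguished_pairs(haps, snp):
    """Edges (haplotype pairs) covered by `snp`."""
    return {(a, b) for a, b in combinations(range(len(haps)), 2)
            if haps[a][snp] != haps[b][snp]}


def covers_all(haps, subset):
    """True if `subset` covers every edge that any SNP covers."""
    n_snps = len(haps[0])
    universe = set().union(*(distinguished_pairs(haps, s) for s in range(n_snps)))
    covered = set().union(*(distinguished_pairs(haps, s) for s in subset))
    return covered == universe


def minimum_informative_subset(haps):
    """Smallest SNP subset covering all edges, found by trying sizes 0, 1, 2, ..."""
    n_snps = len(haps[0])
    for size in range(n_snps + 1):
        for subset in combinations(range(n_snps), size):
            if covers_all(haps, subset):
                return subset


best = minimum_informative_subset(H)
print(len(best))               # 2
print(covers_all(H, (2, 3)))   # True: {s3, s4} covers every edge
```

On this toy matrix the minimum size is 2, and {s3, s4} is one subset of that size; the search may return a different subset of the same size, since minimum covers need not be unique.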

