Linkage Analysis. Eric Jorgenson, Epidemiology 217, 2/21/12
Worldwide Distribution of Human Earwax SNP rs17822931 Yoshiura et al., Nature Genetics 2006
Geographic Distribution of PTC Phenotype (map legend: High to Low). Wooding Genetics 2006, adapted from Cavalli-Sforza 1994
Bimodal Distribution of PTC
Your Phenotypes and Genotypes Taste SNP Ear wax SNP Sample Name taster ear wax rs10246939 rs1726866 rs17822931 BU10 Y D CT AG TT BU12 N None BU14 D? AA BU15 W CC BU17 BU19 GG BU20 BU21 mild BU22 BU23 BU24 W? BU25 BU26 BU27 BU28 BU29 Y mild BU30 From Joe Wiemels
Types of Genetic Studies: Family Studies (compare trait values across family members); Linkage Analysis (compare trait values with inheritance patterns); Association (compare trait values against genetic variants)
Family Studies: familial relationships (twins, siblings, parents/offspring); phenotype information (affected/unaffected, e.g. prostate cancer; quantitative measure, e.g. blood pressure); no genotype information required
Why do Family Studies? Is the trait genetic? What is the mode of transmission? Dominant, recessive, additive, or polygenic (multiple genes involved).
Mutation and Meiosis
Recessive trait
First PTC Family Study: Both: 15%; One: 32%; Neither: 100%. L. H. Snyder Science 1931
Linkage Analysis: narrow down position of disease gene; no biological knowledge needed; genetic markers (not disease gene); recombination
Recombination: 2 markers, A and B. Know the orientation (phase). [Figure: parent with haplotypes AB and ab crossed with an ab/ab parent; four possible offspring: Aa bb, aa bb, Aa Bb, aa Bb]
Recombination: each offspring is scored as recombinant (R) or non-recombinant (NR). [Figure: the four offspring classes, in order R, NR, NR, R]
Independent Assortment: the four offspring classes occur at 25%, 25%, 25%, 25%.
No recombination: the four offspring classes occur at 0%, 50%, 50%, 0%.
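The two extremes on these slides (independent assortment and no recombination) are the endpoints of one relationship: each recombinant class is expected at q/2 and each non-recombinant class at (1 - q)/2, where q is the recombination fraction. A minimal Python sketch, assuming the class order used on the slides (R, NR, NR, R); the function name is illustrative only:

```python
def expected_class_frequencies(q):
    """Expected frequencies of the four offspring classes.

    q is the recombination fraction between markers A and B (0 <= q <= 0.5).
    The returned order follows the slides: R, NR, NR, R.
    """
    if not 0.0 <= q <= 0.5:
        raise ValueError("q must be between 0 and 0.5")
    recombinant = q / 2            # each of the two recombinant classes
    non_recombinant = (1 - q) / 2  # each of the two non-recombinant classes
    return [recombinant, non_recombinant, non_recombinant, recombinant]


print(expected_class_frequencies(0.5))  # [0.25, 0.25, 0.25, 0.25]
print(expected_class_frequencies(0.0))  # [0.0, 0.5, 0.5, 0.0]
```

Any value of q between 0 and 0.5 gives frequencies between these two cases.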
Recombination Fraction: observed offspring counts for the four classes are 61, 420, 442, 77.
Recombination Fraction: $q = \frac{\text{Recombinants}}{\text{Total}} = \frac{61 + 77}{61 + 77 + 442 + 420} = \frac{138}{1000} = 13.8\%$ (offspring class counts: 61, 420, 442, 77)
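The same arithmetic as a small Python sketch. The counts are the ones on the slide; the function name `recombination_fraction` is only an illustrative choice:

```python
def recombination_fraction(recombinants, non_recombinants):
    """Fraction of offspring that are recombinant."""
    total = recombinants + non_recombinants
    if total == 0:
        raise ValueError("need at least one offspring")
    return recombinants / total


# Offspring counts from the slide: 61 and 77 recombinants,
# 420 and 442 non-recombinants.
q = recombination_fraction(61 + 77, 420 + 442)
print(f"q = {q:.3f}")  # q = 0.138
```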
Linkage and Linkage Analysis: recombination fraction q < 50%; two traits (PTC and KELL blood group); two genetic markers; one trait and one genetic marker
Human Linkage Analysis: RFLP Markers for Linkage (1980); Huntington’s Disease Linkage (1983); Cystic Fibrosis Linkage (1985); Cystic Fibrosis Gene (1989); Huntington’s Disease Gene (1993)
Genomewide Linkage Analysis Genetic Markers q = 10% on average Genes
Linkage Analysis: LOD score based on recombination. $\mathrm{LOD}(q) = \log_{10} \frac{q^{R}\,(1 - q)^{NR}}{(1/2)^{R + NR}}$, where R and NR are the numbers of recombinant and non-recombinant offspring.
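The LOD formula can be evaluated directly. The sketch below is one way to do it in Python, assuming phase is known and every offspring can be scored as R or NR. It works with logarithms throughout so that large counts do not underflow. The function name and the reuse of the 138/862 counts from the recombination-fraction slide are illustrative choices, not part of the lecture.

```python
import math


def lod_score(q, recombinants, non_recombinants):
    """LOD(q) = log10[ q^R (1 - q)^NR / (1/2)^(R + NR) ]."""
    if not 0.0 < q <= 0.5:
        raise ValueError("q must satisfy 0 < q <= 0.5")
    meioses = recombinants + non_recombinants
    log_linked = recombinants * math.log10(q) + non_recombinants * math.log10(1 - q)
    log_unlinked = meioses * math.log10(0.5)
    return log_linked - log_unlinked


# At q = 1/2 the numerator equals the denominator, so the LOD is 0.
print(abs(lod_score(0.5, 138, 862)) < 1e-9)  # True

# Illustration only: the 138 recombinants and 862 non-recombinants
# from the recombination-fraction slide, evaluated at q = 0.138.
print(lod_score(0.138, 138, 862))
```

A positive LOD favors linkage at that value of q over free recombination; a negative LOD favors free recombination.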
Dominant Trait. [Pedigree: parents D/d with marker genotype 1/2 and d/d with marker genotype 3/3; offspring D/d 1/3, D/d 2/3, d/d 2/3]
Linkage Analysis. [Pedigree: parental marker genotypes 1/2 and 3/3; offspring 1/3, 2/3, 2/3, scored R, NR, NR]
LOD score: $\mathrm{LOD}(q) = \log_{10} \frac{q^{1}\,(1 - q)^{2}}{(1/2)^{3}}$
q      LOD(q)
0.01   -1.11
0.05   -0.44
0.1    -0.19
0.2     0.01
0.3     0.07
0.4     0.06
0.5     0.00
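The values on this slide can be recomputed from the formula with one recombinant and two non-recombinants. A short Python sketch (the function name is illustrative):

```python
import math


def lod_score(q, recombinants=1, non_recombinants=2):
    """LOD(q) for phase-known data: log10 of the likelihood ratio vs. q = 1/2."""
    meioses = recombinants + non_recombinants
    linked = q ** recombinants * (1 - q) ** non_recombinants
    unlinked = 0.5 ** meioses
    return math.log10(linked / unlinked)


for q in [0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5]:
    print(f"{q:<5} {lod_score(q):6.2f}")
```

With these counts the likelihood peaks at q = R / (R + NR) = 1/3, which is why the LOD is largest between 0.3 and 0.4. With only three offspring the maximum LOD stays well below 1.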
IBD: identity by descent. Allele-sharing methods; often used for affected sib pairs
Identity By Descent: parents a/A and a/A; the four possible offspring (A/A, a/A, A/a, a/a) each occur 25% of the time.
Identity By Descent: Parent 1 (a/A) compared with each possible offspring (A/A, a/A, A/a, a/a). Alleles shared IBD: 1, 1, 1, 1
Identity By Descent: Parent 1 (a/A) compared with each possible offspring. Alleles shared IBD: 1, 1, 1, 1. Frequency of alleles IBD: 2 alleles, 0%; 1 allele, 100%
Identity By Descent: Sibling 1 (A/A) compared with each possible sibling (A/A, a/A, A/a, a/a). Alleles shared IBD: 2, 1, 1, 0
Identity By Descent: Sibling 1 (A/A) compared with each possible sibling. Alleles shared IBD: 2, 1, 1, 0. Frequency of alleles IBD: 2 alleles, 25%; 1 allele, 50%; 0 alleles, 25%
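The sibling IBD frequencies can be checked by enumerating every transmission. The sketch below labels the four parental allele copies (the labels `P1a`, `P1b`, `P2a`, `P2b` are hypothetical, chosen so that copies are distinguishable by descent even when they are the same allele by state) and counts how many copies two siblings have in common.

```python
from collections import Counter
from itertools import product

# Hypothetical labels for the two allele copies carried by each parent.
PARENT_1 = ("P1a", "P1b")
PARENT_2 = ("P2a", "P2b")


def possible_offspring():
    """All four equally likely (copy from parent 1, copy from parent 2) pairs."""
    return list(product(PARENT_1, PARENT_2))


def alleles_shared_ibd(sib_a, sib_b):
    """Count the parental copies that both siblings inherited."""
    return sum(x == y for x, y in zip(sib_a, sib_b))


def sibling_ibd_distribution():
    """Probability that two siblings share 2, 1, or 0 alleles IBD."""
    pairs = list(product(possible_offspring(), repeat=2))
    counts = Counter(alleles_shared_ibd(a, b) for a, b in pairs)
    return {k: counts[k] / len(pairs) for k in (2, 1, 0)}


dist = sibling_ibd_distribution()
print(dist)  # {2: 0.25, 1: 0.5, 0: 0.25}

# Expected fraction of the two alleles shared IBD by a sibling pair.
print(sum(k * p for k, p in dist.items()) / 2)  # 0.5
```

The expected fraction of 0.5 is the baseline that allele-sharing methods compare against.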
Identity By Descent: IBD can be used for linkage analysis. Expect 50% of alleles shared between siblings. Look for IBD > 50% for concordant pairs and IBD < 50% for discordant pairs.
PTC Linkage Analysis Utah Genetic Reference Project 27 large families 269 subjects
PTC Linkage Analysis
Human Chromosomes 23 pairs of chromosomes TAS2R38
Fine Mapping Linkage markers Genes Kim et al. Science 2003
Linkage Disequilibrium
Linkage Disequilibrium Time
Linkage Disequilibrium Mapping Genetic Markers Genes
PTC Linkage Disequilibrium Mapping Kim et al. Science 2003
TAS2R38 Receptor Structure Kim et al. J Dent Res 2004
3 SNPs in the TAS2R38 Gene: the eight possible haplotypes are PAV, PAI, PVV, PVI, AAV, AAI, AVV, AVI. Haplotype definition: each individual has two haplotypes, which together form a diplotype. A haplotype is analogous to an allele; a diplotype is analogous to a genotype.
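Three two-allele positions give 2 × 2 × 2 = 8 possible haplotypes. A minimal Python sketch that enumerates them, using the amino acid alternatives at each position as listed on the slide (the variable names are illustrative):

```python
from itertools import product

# Amino acid alternatives at the three variable positions, in slide order.
SNP_ALLELES = [("P", "A"), ("A", "V"), ("V", "I")]

haplotypes = ["".join(combo) for combo in product(*SNP_ALLELES)]
print(haplotypes)
# ['PAV', 'PAI', 'PVV', 'PVI', 'AAV', 'AAI', 'AVV', 'AVI']
print(len(haplotypes))  # 8
```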
TAS2R38 Diplotype and PTC Score: 2 haplotypes give 3 diplotypes. PTC score: AVI homozygote = 2; PAV homozygote = 10; heterozygote = 9. A multivariate analysis explains most of the variance with a p-value of < 10^-33. Kim et al. Science 2003
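The count of diplotypes follows from pairing haplotypes without regard to order: n haplotypes give n(n + 1)/2 diplotypes. A short Python sketch, with an illustrative function name:

```python
from itertools import combinations_with_replacement


def diplotypes(haplotypes):
    """All unordered pairs of haplotypes, including homozygotes."""
    return ["/".join(pair) for pair in combinations_with_replacement(haplotypes, 2)]


print(diplotypes(["PAV", "AVI"]))
# ['PAV/PAV', 'PAV/AVI', 'AVI/AVI']

# A third haplotype would raise the number of possible diplotypes to 6.
print(len(diplotypes(["PAV", "AVI", "AAV"])))  # 6
```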
Confirm Mode of Inheritance: Both: 15%; One: 32%; Neither: 100%. L. H. Snyder Science 1931
Explain Linkage Signal
Geographic Distribution of PTC Phenotype Wooding Genetics 2006, adapted from Cavalli-Sforza 1994
Geographic Distribution of PTC Haplotypes Kim et al. Science 2003
Diplotype and PTC Score You may notice a 4th diplotype due to a 3rd, rare haplotype. Kim et al. Science 2003
3 SNPs form 3 Haplotypes: PAV (taster), AVI (non-taster), AAV (rare). The rare 3rd haplotype is the result of recombination: A of the non-taster, AV of the taster. It allows us to compare the effect of the 1st SNP vs. the 2nd and 3rd. Rare: not seen in all combinations.
Comparing Diplotypes: Mean for taster is 9. Mean for rare haplotype is 7; the difference is P vs. A. Mean for non-taster is 2; the difference is AV vs. VI. Both have an effect, but the 2nd and 3rd have a greater effect.
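The comparison is a pair of subtractions on the means quoted on the slide. A tiny Python sketch (the dictionary keys are illustrative labels for the three groups):

```python
# Mean PTC scores quoted on the slide.
mean_score = {"taster": 9, "rare": 7, "non_taster": 2}

# Taster and rare haplotypes differ at the 1st SNP (P vs. A).
effect_first_snp = mean_score["taster"] - mean_score["rare"]

# Rare and non-taster haplotypes differ at the 2nd and 3rd SNPs (AV vs. VI).
effect_second_third_snps = mean_score["rare"] - mean_score["non_taster"]

print(effect_first_snp, effect_second_third_snps)  # 2 5
```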
Predicted Effect of the 3 SNPs: 3 amino acid substitution matrices; sequence alignment; scale; chemical change; 1st and 2nd most severe.
TAS2R38 Haplotype Function
PTC Diplotype and Taste Sandell and Breslin Current Biology 2006
Next Week Next Generation Sequencing
Appendix: Phase Unknown Linkage
Phase Unknown. [Pedigree: parents D/d with marker genotype 1/2 and d/d with marker genotype 3/3; offspring D/d 1/3, D/d 2/3, d/d 2/3; genotypes marked ? are unknown]
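Phase Unknown. [Pedigree: parental marker genotypes 1/2 and 3/3; offspring 1/3, 2/3, 2/3, with the recombinant status of each offspring unknown (?)]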
What if we don’t know phase? We calculate the LOD score for each phase, then divide by 2.
Phase Unknown: $\mathrm{LOD}(q) = \tfrac{1}{2} \log_{10} \frac{q^{1}\,(1 - q)^{2}}{(1/2)^{1 + 2}} + \tfrac{1}{2} \log_{10} \frac{q^{2}\,(1 - q)^{1}}{(1/2)^{2 + 1}}$ = -0.02 for q = 0.44
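A Python sketch of the phase-unknown calculation for the same family, with one recombinant and two non-recombinants under one phase and the reverse under the other. It evaluates the slide's expression (half the LOD under each phase) and, for comparison, a commonly used alternative that averages the two likelihoods before taking the logarithm. Function names are illustrative, and the second function is not taken from the slides.

```python
import math


def likelihood_ratio(q, recombinants, non_recombinants):
    """Likelihood at q relative to q = 1/2 for one assumed phase."""
    meioses = recombinants + non_recombinants
    return q ** recombinants * (1 - q) ** non_recombinants / 0.5 ** meioses


def lod_mean_of_logs(q, r=1, nr=2):
    """The slide's expression: half the LOD under each phase, summed."""
    return 0.5 * math.log10(likelihood_ratio(q, r, nr)) + 0.5 * math.log10(
        likelihood_ratio(q, nr, r)
    )


def lod_mean_likelihood(q, r=1, nr=2):
    """Alternative (not from the slides): log10 of the phase-averaged likelihood ratio."""
    averaged = 0.5 * likelihood_ratio(q, r, nr) + 0.5 * likelihood_ratio(q, nr, r)
    return math.log10(averaged)


q = 0.44
print(f"mean of logs:        {lod_mean_of_logs(q):.4f}")
print(f"mean of likelihoods: {lod_mean_likelihood(q):.4f}")
```

At q = 0.44 the first expression evaluates to roughly -0.009 and the second to roughly -0.006. The -0.02 quoted on the slide is closer to the plain sum of the two log terms (about -0.019), so the quoted figure may not include the ½ factors. Either way the LOD is not positive for this family, so without phase it provides no evidence for linkage.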