Published by Valerie Bryant. Modified over 9 years ago.
1
Lab 9: Linkage Disequilibrium
2
Goals
1. Estimate LD in terms of D, D' and r².
2. Determine the effect of random and non-random mating on LD.
3. Estimate LD from diploid genotype data using the EM algorithm.
3
Allele history (figure). Labels: high drift or selective sweep. Axis: time.
4
LD broken by recombination (figure showing the two-locus genotypes A1A1 B1B1, A2A2 B2B2, A1A1 B2B2 and A2A2 B1B1).
5
Closer proximity -> less recombination -> stronger LD
6
LD estimation in a two-locus (A and B), two-allele (1 and 2) model

Locus A has alleles A1 and A2 with frequencies p1 and p2; locus B has alleles B1 and B2 with frequencies q1 and q2.

Gamete   Observed gametic frequency   Expected gametic frequency under linkage equilibrium
A1B1     x11                          p1q1
A1B2     x12                          p1q2
A2B1     x21                          p2q1
A2B2     x22                          p2q2

Allele   Allele frequency
A1       p1 = x11 + x12
A2       p2 = x21 + x22
B1       q1 = x11 + x21
B2       q2 = x12 + x22
7
Different measures of LD
D = x11 - p1q1
D' = D / Dmax
r² = D² / (p1 p2 q1 q2)
If D > 0, Dmax = min(p1q2, p2q1)
If D < 0, Dmax = min(p1q1, p2q2)
8
Example

Gamete   Count
A1B1     100
A1B2     200
A2B1     300
A2B2     400
Total    1000

Allele   Allele frequency
A1       (100+200)/1000 = 0.3
A2       (300+400)/1000 = 0.7
B1       (100+300)/1000 = 0.4
B2       (200+400)/1000 = 0.6

Gamete   Observed gametic frequency   Expected gametic frequency under linkage equilibrium
A1B1     0.1                          0.3*0.4 = 0.12
A1B2     0.2                          0.3*0.6 = 0.18
A2B1     0.3                          0.7*0.4 = 0.28
A2B2     0.4                          0.7*0.6 = 0.42

D                     Dmax                     D'
0.1-0.12 = -0.02      0.12 = min(0.12, 0.42)   -0.167
0.4-0.42 = -0.02
0.18-0.2 = -0.02
0.28-0.3 = -0.02
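The worked example can be checked with a short script. This is a minimal sketch, not part of the original lab: the function name `ld_measures` and its return keys are hypothetical, r² uses the standard definition D² / (p1 p2 q1 q2), and the chi-square statistic N·r² (1 degree of freedom, N = number of gametes) is the usual test for gametic disequilibrium, assumed here rather than taken from the slides.

```python
def ld_measures(n11, n12, n21, n22):
    """Return D, Dmax, D', r^2 and chi-square from gamete counts.

    n11..n22 are counts of gametes A1B1, A1B2, A2B1, A2B2.
    Assumes both loci are polymorphic (otherwise r^2 is undefined).
    """
    n = n11 + n12 + n21 + n22
    x11 = n11 / n                # observed frequency of A1B1
    p1 = (n11 + n12) / n         # frequency of A1
    p2 = 1 - p1                  # frequency of A2
    q1 = (n11 + n21) / n         # frequency of B1
    q2 = 1 - q1                  # frequency of B2

    d = x11 - p1 * q1            # observed minus expected under equilibrium
    if d > 0:
        d_max = min(p1 * q2, p2 * q1)
    else:
        d_max = min(p1 * q1, p2 * q2)
    d_prime = d / d_max if d_max > 0 else 0.0
    r2 = d ** 2 / (p1 * p2 * q1 * q2)
    chi2 = n * r2                # compare with 3.841 (1 df, alpha = 0.05)
    return {"D": d, "Dmax": d_max, "Dprime": d_prime, "r2": r2, "chi2": chi2}


# Gamete counts from the example slide
result = ld_measures(100, 200, 300, 400)
for name, value in result.items():
    print(name, round(value, 4))
```

With the example counts this reproduces the slide values D = -0.02, Dmax = 0.12 and D' = -0.167.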
9
Decay of LD
Effective recombination rate (c_ef) for self-fertilizing organisms, with selfing rate S:
If S = 0, c_ef = c
If S -> 1, c_ef -> 0
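The decay formulas on this slide did not survive extraction, so the sketch below rests on two stated assumptions: the standard random-mating result D_t = D_0 (1 - c)^t, and a commonly used approximation for partial selfing, c_ef = c (1 - S / (2 - S)), which matches both limits given above (c_ef = c at S = 0, c_ef -> 0 as S -> 1). The function names and the starting value D_0 = 0.1 are hypothetical and chosen only for illustration.

```python
def effective_recombination(c, s):
    """Effective recombination rate under selfing rate s.

    Assumed form: c_ef = c * (1 - s / (2 - s)).
    """
    return c * (1 - s / (2 - s))


def d_after(d0, c, t, s=0.0):
    """Expected D after t generations, assuming D_t = D_0 * (1 - c_ef)^t."""
    return d0 * (1 - effective_recombination(c, s)) ** t


def generations_until_below(d0, c, threshold, s=0.0):
    """Smallest number of generations after which |D| < threshold."""
    c_ef = effective_recombination(c, s)
    if c_ef <= 0:
        raise ValueError("D does not decay when c_ef is 0")
    d = abs(d0)
    t = 0
    while d >= threshold:
        d *= 1 - c_ef
        t += 1
    return t


# Illustrative values only (not the Problem 1 data)
print(d_after(0.1, 0.5, 1))                    # one generation, unlinked loci
print(d_after(0.1, 0.5, 1, s=0.9))             # same loci, mostly selfing
print(generations_until_below(0.1, 0.5, 0.005))
```

Comparing the first two printed values shows how strongly selfing slows the decay even for unlinked loci.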
10
Problem 1. In most conifers, gamete frequencies and the linkage phase of diploid genotypes can be determined directly because seeds contain relatively large amounts of haploid nutritional tissue (called endosperm or megagametophyte), which originates from the maternal gamete. As part of a study of the linkage relationship among allozyme loci in loblolly pine (Pinus taeda), Adams and Joly (1980) sampled 456 gametes at loci phosphoglucose isomerase 2 (PGI2; for simplicity, let this be locus A) and glutamate-oxaloacetate transaminase 1 (GOT1; let this be locus B) and observed the following numbers of gametes.

Gamete   Count
A1B1     138
A1B2     88
A2B1     78
A2B2     152
Total    456

a) Calculate D, D', and r², and test the statistical significance of the gametic disequilibrium between the two loci.
b) Because the linkage phase of each mother tree was known, Adams and Joly were able to estimate that the recombination rate between the two loci is c = 0.044.
   i) What is the expected value of D in the next generation (i.e., in the offspring of the seeds that were included in the study)?
   ii) How many generations of random mating will it take for D to decay below 0.005?
   iii) What is the expected value of D in the next generation if S = 0.1? S = 0.5? S = 0.9?
c) Repeat the calculations from b) assuming c = 0.5 (i.e., assuming that the two loci are physically unlinked).
d) Discuss the relative importance of rates of recombination and self-fertilization in determining the rate of decay of LD.
11
Problem 2. Compare rates of decay of r² with physical distance in sequences from the phytochrome B2 (PHYB2) gene in European aspen (Populus tremula) and the phytochrome C (PHYC) gene in Arabidopsis thaliana.
a) Show scatter plots with trend lines illustrating the decay of r² with physical distance for each gene.
b) How do the patterns of LD differ between these two species, and why? (BIOLOGICAL EXPLANATION)
c) GRADUATE STUDENTS ONLY: Provide facts and citations supporting your biological explanation.
12
Haplotypes through EM
When we genotype, we often don't know the actual haplotypes: the haplotypes are unphased.
We can use a maximum likelihood method, Expectation Maximization (EM), to obtain haplotype frequencies.
13
1. Initialize: guess the gamete frequencies.
2. Expectation step: find expected frequencies of known-phase genotypes given the gamete frequencies.
3. Maximization step: find expected frequencies of all unphased genotypes given the gamete frequencies.
   a. Use these to make new gamete frequency estimates.
Notation: n is the number of unphased genotypes in the sample; n1, n2, ..., n5 are the number of times each unphased genotype was observed in the sample; and P1, P2, ..., P5 are the expected frequencies of the unphased genotypes in the sample.
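The steps above can be sketched for the two-locus, two-allele case. The slide's update formula did not survive extraction, so this uses the standard EM scheme for that case, which may differ in detail from the lab's version: only the double heterozygote (A1A2 B1B2) has ambiguous phase, the expectation step splits those individuals between the A1B1/A2B2 and A1B2/A2B1 phases in proportion to the current gamete frequencies, and the update recounts gametes. The function name `em_haplotype_freqs`, the 3x3 input layout and the example counts are hypothetical.

```python
def em_haplotype_freqs(geno, tol=1e-10, max_iter=1000):
    """Estimate gamete frequencies (x11, x12, x21, x22) by EM.

    geno[i][j] is the number of individuals carrying i copies of A1
    and j copies of B1 (i, j in 0..2). Random mating is assumed.
    """
    n_ind = sum(sum(row) for row in geno)

    # Gamete counts from genotypes whose phase is unambiguous
    c11 = 2 * geno[2][2] + geno[2][1] + geno[1][2]   # A1B1
    c12 = 2 * geno[2][0] + geno[2][1] + geno[1][0]   # A1B2
    c21 = 2 * geno[0][2] + geno[1][2] + geno[0][1]   # A2B1
    c22 = 2 * geno[0][0] + geno[1][0] + geno[0][1]   # A2B2
    n_dh = geno[1][1]                                # double heterozygotes

    # Initialize: guess equal gamete frequencies
    x11 = x12 = x21 = x22 = 0.25

    for _ in range(max_iter):
        # Expected share of double heterozygotes in phase A1B1/A2B2
        cis = x11 * x22
        trans = x12 * x21
        w = cis / (cis + trans) if cis + trans > 0 else 0.5

        # New gamete frequency estimates from the completed counts
        new = (
            (c11 + w * n_dh) / (2 * n_ind),
            (c12 + (1 - w) * n_dh) / (2 * n_ind),
            (c21 + (1 - w) * n_dh) / (2 * n_ind),
            (c22 + w * n_dh) / (2 * n_ind),
        )
        change = max(abs(a - b) for a, b in zip(new, (x11, x12, x21, x22)))
        x11, x12, x21, x22 = new
        if change < tol:
            break

    return x11, x12, x21, x22


# Hypothetical sample: 25 A2A2 B2B2, 50 A1A2 B1B2, 25 A1A1 B1B1
example = [[25, 0, 0], [0, 50, 0], [0, 0, 25]]
print(em_haplotype_freqs(example))
```

The resulting gamete frequencies can then be used to compute D, D' and r² in the same way as for directly observed gametes. EM can converge to a local optimum, so trying more than one starting guess is a reasonable precaution.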
14
Problem 3. File human_LD.arp contains data for humans from two populations (Han and Melanesian) genotyped for the same loci you have analyzed for departures from Hardy-Weinberg Equilibrium. The Han sample includes individuals from a broad geographic area in China, whereas the Melanesian sample only includes individuals from Bougainville Island. Use Arlequin to test for significant linkage disequilibrium among the 10 loci in each of these populations.
a) How do you interpret the difference in the number of linked loci in the two populations? (STATISTICAL AND BIOLOGICAL INTERPRETATIONS)
b) GRADUATE STUDENTS ONLY: How many pairs of loci are expected to show significant LD at α = 0.05 by chance (i.e., if there is no gametic disequilibrium among them in the population)?
c) GRADUATE STUDENTS ONLY: Provide facts and citations supporting your biological interpretation of the results.
15
Map of the sampled populations (figure; label: Han). Image source: http://en.wikipedia.org/wiki/Melanesia