Published by Shanna Morton. Modified over 9 years ago.
1
Lecture 13: Linkage Analysis VI. Date: 10/08/02. Topics: complex models, pedigrees, Elston-Stewart algorithm, Lander-Green algorithm.
2
Complex Linkage Models The simplest linkage models involve only pairwise recombination fractions $\theta_{ij}$ or adjacent map distances $m_{i,i+1}$ and map function parameters. Such models are insufficient to describe many real-life data scenarios.
3
For Example Incomplete penetrance. Differential penetrance. Genetic imprinting. No available controlled and repeated crosses.
4
Inference on Pedigrees Pedigrees are extended families sampled from a natural population. They are used when one cannot set up repeated and controlled crosses. Complications: unknown phenotypes, unknown genotypes, founders.
5
Ordered vs. Unordered Genotype An unordered genotype includes neither phase information nor the parental source of alleles. An ordered genotype includes both.
Unordered genotype: $A_1A_2\,B_1B_1$
Ordered genotype(s): $A_1B_1/A_2B_1$, $A_2B_1/A_1B_1$
6
Penetrance Parameters A penetrance parameter is introduced in the model to explain the relationship between genotype and phenotype. We code the phenotype as a random vector of discrete or continuous variables, e.g. $X = (X_1, X_2, \ldots, X_m)$. The phenotype $X_i$ of an individual $i$ is conditionally independent of all other family members given his/her genotype and other characteristics (sex, age, etc.).
7
Penetrance Parameters - Assumptions We assume individual i’s phenotype is a single number (discrete or continuous) conditionally independent of all other genotypes and loci, once we condition on the genotype at a particular locus. i.e. we assume one phenotypic variable per locus. This assumption forces us to ignore multilocus phenotypes and pleiotropic loci.
8
Conditional Likelihood of Observed Phenotypes The conditional independence implies that the likelihood of particular phenotypes observed on a pedigree, conditional on the observed genotypes, is simply a product.
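The slide's equation did not survive extraction. A form consistent with the conditional-independence statement above is sketched here; writing $n$ for the number of pedigree members and $C_i$ for individual $i$'s other characteristics is an assumption about the notation, not a quotation from the lecture.

```latex
P(X \mid G, C) = \prod_{i=1}^{n} P(X_i \mid G_i, C_i)
```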
9
Penetrance Parameters: Simple Dominant Disease
Dominant disease ($A_1 > A_2$)
Ordered genotype   $P(X_i \mid G_i, C_i) = P(X_i \mid G_i)$
$A_1A_1$           1
$A_1A_2$           1
$A_2A_1$           1
$A_2A_2$           0
10
Penetrance Parameters: Dominant Disease with C
Dominant disease ($A_1 > A_2$) but sex-dependent
Ordered genotype   $P(X_i \mid G_i, \text{male})$   $P(X_i \mid G_i, \text{female})$
$A_1A_1$           1                               0
$A_1A_2$           1                               0
$A_2A_1$           1                               0
$A_2A_2$           0                               0
11
Liability Classes Classes of individuals who differ in penetrance parameters are called liability classes. In one of the examples above males and females form two different liability classes.
12
Incomplete Penetrance with Liability Classes Suppose that a dominant disease affects carriers under 30 with probability $a$ and carriers aged 30 or over with probability $b$.
Class        AA   Aa   aa
<30 years
>=30 years
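A minimal Python sketch of how age-based liability classes could be encoded as a penetrance lookup and combined into the product likelihood. It assumes that A is the dominant disease allele, that there are no phenocopies, and that a = 0.3 and b = 0.8; these values and all function names are hypothetical, chosen only for illustration.

```python
def penetrance(genotype, age, a=0.3, b=0.8):
    """P(affected | genotype, liability class) for a dominant disease.

    Assumptions (not from the lecture): 'A' is the disease allele,
    there are no phenocopies, and a, b are illustrative values.
    """
    carrier = "A" in genotype          # AA and Aa carry the disease allele
    if not carrier:
        return 0.0                     # aa is never affected without phenocopies
    return a if age < 30 else b        # liability class is chosen by age


def phenotype_likelihood(people, a=0.3, b=0.8):
    """Product of P(X_i | G_i, C_i) over individuals with known genotypes."""
    likelihood = 1.0
    for genotype, age, affected in people:
        p = penetrance(genotype, age, a, b)
        likelihood *= p if affected else 1.0 - p
    return likelihood


# (genotype, age, affected) for three hypothetical individuals
family = [("Aa", 25, False), ("AA", 52, True), ("aa", 40, False)]
print(phenotype_likelihood(family))    # (1 - 0.3) * 0.8 * 1.0
```

Keeping the liability class as an argument of the penetrance function means a sex-dependent or phenocopy model only changes the lookup, not the likelihood loop.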
13
Penetrance Parameters: Phenocopies
Dominant disease ($A_1 > A_2$) with phenocopy rate pr
Ordered genotype   $P(X_i \mid G_i)$
$A_1A_1$           1
$A_1A_2$           1
$A_2A_1$           1
$A_2A_2$           pr
14
Dealing with Penetrance and Phenocopies Biological solution. Identify features that differentiate genetic and non-genetic forms of the phenotype. Then, the phenotype can be recoded as fully-penetrant with no phenocopies. Approximation. Estimate genotype-specific risk from segregation ratios observed in a family, then set penetrance to the estimates.
15
Example
Genotype   Expected frequency   Observed frequency
AA         0.5                  0.75
Aa         0.5                  0.25
Either 50% of Aa individuals are phenocopies of AA, or the a allele is only 50% penetrant.
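The arithmetic behind the 50% figure can be sketched in a few lines of Python. The helper name is hypothetical, and the ratio below is only the crude approximation described on the previous slide, not a full segregation analysis.

```python
def penetrance_estimate(expected_freq, observed_freq):
    """Fraction of a genotype class that shows its expected phenotype."""
    if expected_freq <= 0:
        raise ValueError("expected frequency must be positive")
    return observed_freq / expected_freq


# Aa is expected at frequency 0.5 but observed at 0.25.
print(penetrance_estimate(expected_freq=0.5, observed_freq=0.25))  # 0.5
```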
16
Penetrance Parameters – More Assumptions Unless a phenotype is affected by genomic imprinting, we usually assume that different ordered genotypes with the same alleles have the same phenotype. Genomic imprinting means that the parental origin of the allele affects its expression. For example, a gene may only express if it came from your mother.
17
Genetic Imprinting in Humans? Prader-Willi syndrome causes morbid obesity in humans. The disease loci are found on chromosome 15, and working copies must be transmitted from the father. Angelman syndrome causes developmental problems including speech impairment and balance disorder. It is caused by loss of function in a region of chromosome 15 that is normally active only on the maternal chromosome.
18
Problem: Ordered Genotypes are not Observed Pedigrees almost invariably include missing data: members who have no known genotype. In addition, there will always be many members for whom phase and parental origin cannot be determined. In essence, G is not actually observed.
19
Transmission Parameters The genotypes in a pedigree are related through genetic inheritance. Conditional on the parental genotypes, the offspring genotypes are independent of all other members in the pedigree. Transmission parameters are those parameters which determine the transmission of genes: the recombination fractions.
20
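Independence of Transmission Probabilities Let $G_k$ be the genotype of offspring $k$. Let $G_{kM}$ be the allele transmitted by the offspring's mother and $G_{kP}$ be the allele transmitted by the father. Then, conditional on the parental genotypes, the two transmissions are independent, so the probability of $G_k$ factors into a maternal term and a paternal term.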
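The slide's equation was lost in extraction. One way to write the factorization is given below, where $G_{\mathrm{mo}}$ and $G_{\mathrm{fa}}$ are hypothetical symbols for the genotypes of the mother and father of offspring $k$; the lecture's own notation for the parents is not recoverable from this transcript.

```latex
P(G_k \mid G_{\mathrm{mo}}, G_{\mathrm{fa}})
  = P(G_{kM} \mid G_{\mathrm{mo}}) \, P(G_{kP} \mid G_{\mathrm{fa}})
```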
21
Maternal Transmission: Generate Haplotype [Figure]
22
Maternal Transmission: Transmit Haplotype
23
Population Parameters What about the pedigree members that have no parents? There are no parental genotypes on which to condition. The distribution of genotypes in these individuals is determined by the so-called population parameters. In the worst case, this would require $(m_1 m_2 \cdots m_l)^2 - 1$ independent parameters, where $m_i$ is the number of alleles at locus $i$.
24
Population Parameters - Assumptions Assume Hardy-Weinberg equilibrium (random union of haplotypes) so that the genotype frequencies are determined by the haplotype frequencies. Then there are $m_1 m_2 \cdots m_l - 1$ independent parameters. Assume linkage equilibrium (random union of alleles at multiple loci into haplotypes). Then there are $m_1 + m_2 + \cdots + m_l - l$ independent allele frequencies.
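To see how quickly each assumption shrinks the model, the three parameter counts can be computed directly. This is a small sketch; the function name and the example allele counts are illustrative, not from the lecture.

```python
from math import prod


def population_parameter_counts(alleles_per_locus):
    """Independent population parameters under each set of assumptions."""
    m = list(alleles_per_locus)
    n_haplotypes = prod(m)                        # m_1 * m_2 * ... * m_l
    return {
        "unrestricted": n_haplotypes ** 2 - 1,    # ordered genotype frequencies
        "hardy_weinberg": n_haplotypes - 1,       # haplotype frequencies
        "linkage_equilibrium": sum(m) - len(m),   # allele frequencies
    }


# Three loci with 2, 3 and 4 alleles (hypothetical example).
print(population_parameter_counts([2, 3, 4]))
# {'unrestricted': 575, 'hardy_weinberg': 23, 'linkage_equilibrium': 6}
```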
25
Overall Genotype Probabilities
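The equations on this slide were lost. A standard form that matches the surrounding slides (founder probabilities, transmission probabilities, and penetrances, giving $2n$ factors per term) is sketched below. The index sets and the symbols $G_{\mathrm{mo}(k)}$, $G_{\mathrm{fa}(k)}$ for the parents of individual $k$ are assumptions about notation.

```latex
P(G) = \prod_{j \in \text{founders}} P(G_j)
       \prod_{k \in \text{non-founders}} P(G_k \mid G_{\mathrm{mo}(k)}, G_{\mathrm{fa}(k)})

P(X) = \sum_{G_1} \cdots \sum_{G_n} P(G) \prod_{i=1}^{n} P(X_i \mid G_i, C_i)
```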
26
Computation There are $(m_1 m_2 \cdots m_l)^{2n}$ terms in the summation. There are $2n$ probabilities in each product. Thus, there are $(m_1 m_2 \cdots m_l)^{2n} (2n - 1)$ multiplications and $(m_1 m_2 \cdots m_l)^{2n} - 1$ additions. The calculation grows exponentially in the number of loci $l$ and the number of individuals $n$.
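A short Python sketch makes the growth concrete by evaluating the counts above. The function name is hypothetical, and the example uses one biallelic locus with nine individuals.

```python
from math import prod


def brute_force_cost(alleles_per_locus, n_individuals):
    """Terms, multiplications and additions in the direct summation."""
    terms = prod(alleles_per_locus) ** (2 * n_individuals)
    multiplications = terms * (2 * n_individuals - 1)   # 2n factors per term
    additions = terms - 1
    return terms, multiplications, additions


print(brute_force_cost([2], 9))       # (262144, 4456448, 262143)
print(brute_force_cost([2, 2], 9))    # adding one more biallelic locus
```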
27
Elston-Stewart Algorithm The algorithm is similar to the forward-backward computation for hidden Markov models. The hidden states are the genotypes. One must classify people as falling ahead of or behind other people, i.e. we need a linear arrangement of people in the pedigree.
28
Ordering People in a Pedigree [Figure: pedigree with individual $k$ marked]
29
Forward/Backward Probabilities [Figure: hidden genotypes $G_1, G_2, \ldots, G_k, G_{k+1}, \ldots$ with observed phenotypes $X_1, X_2, \ldots, X_k, X_{k+1}, \ldots$]
30
Total Probability
31
Calculating Forward Probability
32
Calculating Backward Probabilities
33
Example [Figure: example pedigree with individuals 1-9 and genotypes AA, aa, Aa]
34
Using 5 as Proband
35
Example – Calculations Needed [Figure: example pedigree with individuals 1-9 and genotypes AA, aa, Aa]
39
Forward Probabilities: Founders
40
Backward Probabilities: Leaves
41
Example – Calculations Completed [Figure: example pedigree with individuals 1-9 and genotypes AA, aa, Aa]
42
Backward Probability 4. 1 means affected; 0 means not affected.
43
Example – Calculations Completed [Figure: example pedigree with individuals 1-9 and genotypes AA, aa, Aa]
44
Forward Probability 5
45
Example – Calculations Completed [Figure: example pedigree with individuals 1-9 and genotypes AA, aa, Aa]
46
Backward Probability 5
47
Example – Calculations Completed [Figure: example pedigree with individuals 1-9 and genotypes AA, aa, Aa]
48
Example – Final Calculation
49
Efficiency of the Elston-Stewart Algorithm In our example, each genotype was defined without ambiguity. There were no sums over genotypes. In general, this is not true, and the forward and backward probabilities must sum over the possible parental genotypes or spousal genotypes respectively. The ES algorithm calculations increase exponentially with the number of loci, because the number of possible multilocus genotypes does. Fortunately, the ES algorithm calculations only increase linearly in the number of pedigree members.
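The saving can be illustrated on a single nuclear family. The Python sketch below compares a brute-force sum over all joint genotype assignments with a peeled version that sums each child out separately. It is a simplified illustration of the peeling idea rather than the lecture's forward/backward recursion. It assumes one biallelic locus, unordered genotypes, Hardy-Weinberg founders with allele frequency q = 0.1, and a dominant disease with penetrance 0.9 and no phenocopies. All names and values are hypothetical.

```python
from itertools import product

GENOTYPES = ("AA", "Aa", "aa")


def founder_prior(g, q=0.1):
    """Hardy-Weinberg prior; q is an assumed frequency of allele A."""
    return {"AA": q * q, "Aa": 2 * q * (1 - q), "aa": (1 - q) ** 2}[g]


def transmit_A(g):
    """Probability that a parent with genotype g passes on allele A."""
    return {"AA": 1.0, "Aa": 0.5, "aa": 0.0}[g]


def transmission(child, mother, father):
    """Mendelian P(child genotype | parental genotypes)."""
    pm, pf = transmit_A(mother), transmit_A(father)
    return {"AA": pm * pf,
            "Aa": pm * (1 - pf) + (1 - pm) * pf,
            "aa": (1 - pm) * (1 - pf)}[child]


def penetrance(affected, g, f=0.9):
    """Dominant disease with assumed penetrance f and no phenocopies."""
    p = f if "A" in g else 0.0
    return p if affected else 1.0 - p


def brute_force(mother_x, father_x, children_x):
    """Sum over every joint assignment: 3 ** (2 + number of children) terms."""
    total = 0.0
    for gm, gf, *gc in product(GENOTYPES, repeat=2 + len(children_x)):
        term = founder_prior(gm) * penetrance(mother_x, gm)
        term *= founder_prior(gf) * penetrance(father_x, gf)
        for x, g in zip(children_x, gc):
            term *= transmission(g, gm, gf) * penetrance(x, g)
        total += term
    return total


def peeled(mother_x, father_x, children_x):
    """Sum each child out given the parents: work is linear in the children."""
    total = 0.0
    for gm, gf in product(GENOTYPES, repeat=2):
        term = founder_prior(gm) * penetrance(mother_x, gm)
        term *= founder_prior(gf) * penetrance(father_x, gf)
        for x in children_x:
            term *= sum(transmission(g, gm, gf) * penetrance(x, g)
                        for g in GENOTYPES)
        total += term
    return total


kids = [True, False, True, False]
print(brute_force(True, False, kids), peeled(True, False, kids))
```

Both functions should return the same likelihood up to rounding; the peeled version evaluates 9 parental configurations times a short inner sum per child instead of 3^6 joint configurations.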
50
Lander-Green Algorithm View the pedigree as a hidden Markov model on haplotypes. The pattern of inheritance at a single locus is described by an inheritance vector $v$, a vector of 0's and 1's of length $2(n - f)$ indicating whether each transmitted allele is paternal (0) or maternal (1) in origin, where $n$ is the number of pedigree members and $f$ the number of founders. There are $2^{2(n-f)}$ possible inheritance vectors.
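The size of this state space is easy to check by enumeration. The sketch below is illustrative only; the function name is hypothetical, and the choice of 9 individuals with 4 founders is an assumption read off the example pedigree, where gametes are listed for individuals 4, 5, 7, 8 and 9.

```python
from itertools import product


def inheritance_vectors(n_individuals, n_founders):
    """Return an iterator over every inheritance vector at one locus.

    Each non-founder contributes two bits (maternal and paternal meiosis),
    so every vector has length 2 * (n - f).
    """
    length = 2 * (n_individuals - n_founders)
    return product((0, 1), repeat=length)


vectors = list(inheritance_vectors(n_individuals=9, n_founders=4))
print(len(vectors))           # 2 ** 10 = 1024
prior = 1 / len(vectors)      # uniform prior before any data are seen
print(prior)
```

Because the count doubles with every additional meiosis, enumeration of this kind is only practical for small pedigrees.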
51
Inheritance Vector v [Figure: example pedigree with individuals 1-9]
Gamete   v
4M       0|1
4P       0|1
5M       0|1
5P       0|1
7M       1
7P       0|1
8M       0
8P       0|1
9M       0
9P       0|1
52
Conditional Probability Prior to viewing the data, all inheritance vectors are equally likely.
53
Multiple Loci Suppose there are $l$ loci. Then the joint probability can be factored. Conditional on $v_i$, however, $X_i$ is independent of all $X_j$ with $j < i$.
54
Multiple Loci (cont) And, conditional on the inheritance vectors of preceding loci, the inheritance vector at locus i is independent of all but the immediately preceding inheritance vector.
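These two independence statements give the hidden Markov structure, which can be sketched as a forward pass over loci. The Python below assumes that each meiosis recombines independently with the recombination fraction between adjacent loci, so the transition probability depends only on how many bits of the inheritance vector change. The emission values standing in for $P(X_i \mid v_i)$ are placeholders, since computing them from real marker data is not shown here. All names are hypothetical.

```python
from itertools import product


def transition_prob(v_prev, v_next, theta):
    """P(v_next | v_prev): each meiosis switches independently with prob. theta."""
    switches = sum(a != b for a, b in zip(v_prev, v_next))
    return theta ** switches * (1 - theta) ** (len(v_prev) - switches)


def forward_likelihood(emissions, thetas, n_bits):
    """Forward pass over loci.

    emissions[i][v] stands for P(X_i | v_i = v); thetas[i] is the
    recombination fraction between locus i and locus i + 1.
    """
    states = list(product((0, 1), repeat=n_bits))
    # Uniform prior over inheritance vectors at the first locus.
    alpha = {v: emissions[0][v] / len(states) for v in states}
    for i in range(1, len(emissions)):
        alpha = {
            v: emissions[i][v]
            * sum(alpha[u] * transition_prob(u, v, thetas[i - 1]) for u in states)
            for v in states
        }
    return sum(alpha.values())


n_bits = 2                                     # one non-founder, two meioses
states = list(product((0, 1), repeat=n_bits))
uninformative = [{v: 1.0 for v in states} for _ in range(3)]
# With uninformative data the likelihood should be close to 1.
print(forward_likelihood(uninformative, thetas=[0.1, 0.2], n_bits=n_bits))
```

The cost of this pass is linear in the number of loci but quadratic in the number of inheritance vectors, which is why the approach suits many markers on small pedigrees.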
55
Multiple Loci (cont)