

1 Lecture 13: Linkage Analysis VI Date: 10/08/02  Complex models  Pedigrees  Elston-Stewart Algorithm  Lander-Green Algorithm

2 Complex Linkage Models  The simplest linkage models involve only pairwise recombination fractions θ_ij, or adjacent map distances m_{i,i+1} and map function parameters.  Such models are insufficient to describe many real-life data scenarios.

3 For Example  Incomplete penetrance. Differential penetrance.  Genetic imprinting.  No available controlled and repeated crosses.

4 Inference on Pedigrees  Pedigrees are extended families sampled from a natural population. They are used when one cannot set up repeated and controlled crosses.  Unknown phenotypes.  Unknown genotypes.  Founders.

5 Ordered vs. Unordered Genotype  An unordered genotype includes neither phase information nor the parental source of alleles.  An ordered genotype includes both phase information and the parental source of alleles.
Unordered Genotype    Ordered Genotype(s)
A1A2B1B1              A1B1/A2B1, A2B1/A1B1

6 Penetrance Parameters  A penetrance parameter is introduced in the model to explain the relationship between genotype and phenotype.  We code the phenotype as a random vector of discrete or continuous variables, e.g. X = (X_1, X_2, ..., X_m).  The phenotype X_i of an individual i is conditionally independent of all other family members given his/her genotype and other characteristics (sex, age, etc.).

7 Penetrance Parameters - Assumptions  We assume that individual i's phenotype at a given locus is a single number (discrete or continuous) that, once we condition on the genotype at that locus, is independent of all other genotypes and loci; i.e., we assume one phenotypic variable per locus.  This assumption forces us to ignore multilocus phenotypes and pleiotropic loci.

8 Conditional Likelihood of Observed Phenotypes  The conditional independence implies that the likelihood of particular phenotypes observed on a pedigree, conditional on the observed genotypes, is simply a product.
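Written out, and assuming the pedigree has n members indexed by i (n is the number of individuals, as used later in the lecture), the product described above would take roughly this form. Covariates such as sex or age can be added to each conditioning set without changing the structure.

```latex
P(X_1, \ldots, X_n \mid G_1, \ldots, G_n) \;=\; \prod_{i=1}^{n} P(X_i \mid G_i)
```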

9 Penetrance Parameters: Simple Dominant Disease
Dominant disease (A1 > A2)
Ordered Genotype    P(X_i | G_i, C_i) = P(X_i | G_i)
A1A1                1
A1A2                1
A2A1                1
A2A2                0

10 Penetrance Parameters: Dominant Disease with C
Dominant disease (A1 > A2), but sex-dependent
Ordered Genotype    P(X_i | G_i, male)    P(X_i | G_i, female)
A1A1                1                     0
A1A2                1                     0
A2A1                1                     0
A2A2                0                     0

11 Liability Classes  Classes of individuals who differ in penetrance parameters are called liability classes.  In one of the examples above males and females form two different liability classes.

12 Incomplete Penetrance with Liability Classes  Suppose that a dominant disease affects individuals under 30 with probability a and individuals aged 30 or older with probability b.
Class         AA    Aa    aa
<30 years
>=30 years

13 Penetrance Parameters: Phenocopies
Dominant disease (A1 > A2) with phenocopy rate pr
Ordered Genotype    P(X_i | G_i)
A1A1                1
A1A2                1
A2A1                1
A2A2                pr
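The phenocopy table and the product form of the conditional likelihood can be combined in a few lines of code. The following is a minimal sketch, not the lecture's implementation. The function names, the tuple encoding of ordered genotypes, and the toy family are all hypothetical. It assumes A1 is the dominant disease allele and that the table gives the probability of being affected.

```python
from math import prod


def penetrance(ordered_genotype, affected, phenocopy_rate=0.0):
    """P(X_i | G_i) for a dominant disease in which A1 is the disease allele.

    ordered_genotype: pair of alleles, e.g. ("A1", "A2"); the order is kept
        only to mirror the ordered-genotype table (assumed encoding).
    affected: True if the individual shows the disease phenotype.
    phenocopy_rate: probability that an A2A2 individual is affected anyway.
    """
    carrier = "A1" in ordered_genotype
    p_affected = 1.0 if carrier else phenocopy_rate
    return p_affected if affected else 1.0 - p_affected


def conditional_likelihood(genotypes, phenotypes, phenocopy_rate=0.0):
    """Product of per-individual penetrances, given every ordered genotype."""
    return prod(
        penetrance(g, x, phenocopy_rate) for g, x in zip(genotypes, phenotypes)
    )


# Hypothetical four-person example: the last person is an affected A2A2,
# which is only possible when the phenocopy rate is positive.
genotypes = [("A1", "A2"), ("A2", "A2"), ("A2", "A1"), ("A2", "A2")]
phenotypes = [True, False, True, True]
likelihood = conditional_likelihood(genotypes, phenotypes, phenocopy_rate=0.05)
```

Setting `phenocopy_rate=0.0` recovers the simple dominant table, in which an affected A2A2 individual has probability zero.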

14 Dealing with Penetrance and Phenocopies  Biological solution. Identify features that differentiate genetic and non-genetic forms of the phenotype. Then, the phenotype can be recoded as fully-penetrant with no phenocopies.  Approximation. Estimate genotype-specific risk from segregation ratios observed in a family, then set penetrance to the estimates.

15 Example
Genotype    Expected Frequency    Observed Frequency
AA          0.5                   0.75
Aa          0.5                   0.25
Either 50% of Aa individuals are phenocopies of AA, or the a allele has only 50% penetrance.
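One way to arrive at the 50% figure, assuming a fraction p of Aa individuals present with the AA phenotype (p is a symbol introduced here for illustration), is to match the expected and observed frequencies:

```latex
\underbrace{0.5 + 0.5\,p}_{\text{observed AA phenotype}} = 0.75
\quad\Longrightarrow\quad p = 0.5,
\qquad
\underbrace{0.5\,(1 - p)}_{\text{observed Aa phenotype}} = 0.25 .
```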

16 Penetrance Parameters – More Assumptions  Unless a phenotype is affected by genomic imprinting, we usually assume that different ordered genotypes with the same alleles have the same phenotype.  Genomic imprinting means that the parental origin of an allele affects its expression. For example, a gene may be expressed only if it was inherited from the mother.

17 Genetic Imprinting in Humans?  Prader-Willi syndrome causes morbid obesity in humans. The disease loci are found on chromosome 15, and working copies must be transmitted from the father.  Angelman syndrome causes developmental problems, including speech impairment and balance disorder. It is caused by a piece of chromosome 15 that is normally activated only on the maternal chromosome.

18 Problem: Ordered Genotypes are not Observed  Pedigrees almost invariably include missing data: members who have no known genotype.  In addition, there will always be many members for whom phase and parental origin cannot be determined.  In essence, G is not actually observed.

19 Transmission Parameters  The genotypes in a pedigree are related through genetic inheritance.  Conditional on the parental genotypes, the offspring genotypes are independent of all other members in the pedigree.  Transmission parameters are those parameters which determine the transmission of genes: the recombination fractions.

20 Independence of Transmission Probabilities  Let G_k be the genotype of offspring k. Let G_kM be the allele transmitted by the offspring's mother and G_kP be the allele transmitted by the father. Then the transmission probability factors into a maternal term and a paternal term.
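A sketch of the factorization named in the slide title follows. The parental genotype symbols G_{M(k)} and G_{P(k)}, for the mother and father of offspring k, are notation assumed here; G_{kM} and G_{kP} are as defined above.

```latex
P\bigl(G_k \mid G_{M(k)}, G_{P(k)}\bigr)
  \;=\; P\bigl(G_{kM} \mid G_{M(k)}\bigr)\; P\bigl(G_{kP} \mid G_{P(k)}\bigr)
```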

21 Maternal Transmission: Generate Haplotype  [Figure not preserved; visible labels: M, P, Z]

22 Maternal Transmission: Transmit Haplotype

23 Population Parameters  What about the pedigree members that have no parents? There are no parental genotypes on which to condition.  The distribution of genotypes in these individuals is determined by the so-called population parameters.  In the worst case, this would require (m_1 m_2 ... m_l)^2 - 1 independent parameters, where m_i is the number of alleles at locus i.

24 Population Parameters - Assumptions  Assume Hardy-Weinberg equilibrium (random union of haplotypes) so that the genotype frequencies are determined by the haplotype frequencies. Then there are (m_1 m_2 ... m_l) - 1 independent parameters.  Assume linkage equilibrium (random union of alleles at multiple loci into haplotypes). Then there are m_1 + m_2 + ... + m_l - l independent allele frequencies.
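The three parameter counts can be compared directly for a given set of loci. This is a small illustrative sketch. The function name and the example allele counts are hypothetical, and the formulas are the ones stated on the two population-parameter slides.

```python
from math import prod


def population_parameter_counts(allele_counts):
    """Number of independent population parameters under each assumption.

    allele_counts: list [m_1, ..., m_l] giving the number of alleles per locus.
    """
    haplotypes = prod(allele_counts)  # m_1 * m_2 * ... * m_l
    return {
        # No assumptions: one frequency per ordered multilocus genotype.
        "general": haplotypes ** 2 - 1,
        # Hardy-Weinberg: genotype frequencies follow from haplotype frequencies.
        "hardy_weinberg": haplotypes - 1,
        # Linkage equilibrium as well: only per-locus allele frequencies remain.
        "linkage_equilibrium": sum(allele_counts) - len(allele_counts),
    }


# Hypothetical example: three loci with 2, 3, and 4 alleles.
counts = population_parameter_counts([2, 3, 4])
```

For these three loci the count drops from 575 to 23 to 6 as the assumptions are added.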

25 Overall Genotype Probabilities
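The formula for this slide is not present in the transcript. A reconstruction consistent with the penetrance, transmission, and population parameters introduced above, and with the 2n factors per term counted on the next slide, might read as follows. The founder set F and the parent indices M(k) and P(k) are notation assumed here.

```latex
P(X) \;=\; \sum_{G_1} \cdots \sum_{G_n}
  \Biggl[\prod_{i=1}^{n} P(X_i \mid G_i)\Biggr]
  \Biggl[\prod_{k \in F} P(G_k)\Biggr]
  \Biggl[\prod_{k \notin F} P\bigl(G_k \mid G_{M(k)}, G_{P(k)}\bigr)\Biggr]
```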

26 Computation  There are (m_1 m_2 ... m_l)^(2n) terms in the summation.  There are 2n probabilities in each product.  Thus, there are (m_1 m_2 ... m_l)^(2n) (2n - 1) multiplications and (m_1 m_2 ... m_l)^(2n) - 1 additions.  The calculation grows exponentially in the number of loci l and the number of individuals n.
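To get a feel for how quickly the brute-force sum becomes infeasible, the operation counts above can be evaluated for small cases. This is a sketch using the slide's formulas. The function name and the example sizes are hypothetical.

```python
from math import prod


def brute_force_cost(allele_counts, n_individuals):
    """Operation counts for summing over every ordered genotype assignment."""
    ordered_genotypes = prod(allele_counts) ** 2  # per individual
    terms = ordered_genotypes ** n_individuals    # (m_1 ... m_l)^(2n)
    return {
        "terms": terms,
        # Each term is a product of 2n probabilities: 2n - 1 multiplications.
        "multiplications": terms * (2 * n_individuals - 1),
        "additions": terms - 1,
    }


# Hypothetical example: two biallelic loci and a three-person pedigree.
cost = brute_force_cost([2, 2], 3)

# Adding a single extra person multiplies the number of terms by 16 here.
cost_four_people = brute_force_cost([2, 2], 4)
```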

27 Elston-Stewart Algorithm  The algorithm is similar to the forward-backward computation for hidden Markov models. The hidden states are the genotypes.  One must classify people as falling ahead of or behind other people, i.e. we need a linear arrangement of people in the pedigree.
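For readers unfamiliar with the hidden Markov model analogy, the following sketch shows the forward-backward recursion on a plain linear chain. It is not the Elston-Stewart algorithm itself, which works on the pedigree structure rather than a simple chain. It only illustrates the kind of recursion being borrowed. The function name, the two-state toy model, and all numbers are hypothetical.

```python
def forward_backward(initial, transition, emissions):
    """Forward and backward probabilities for a chain hidden Markov model.

    initial[s]: prior probability of hidden state s at position 0.
    transition[r][s]: probability of moving from state r to state s.
    emissions[t][s]: probability of the observation at position t given state s.
    Returns (forward, backward, likelihood).
    """
    n_states, n_steps = len(initial), len(emissions)
    forward = [[0.0] * n_states for _ in range(n_steps)]
    backward = [[1.0] * n_states for _ in range(n_steps)]

    # Forward pass: joint probability of the data so far and the current state.
    for s in range(n_states):
        forward[0][s] = initial[s] * emissions[0][s]
    for t in range(1, n_steps):
        for s in range(n_states):
            forward[t][s] = emissions[t][s] * sum(
                forward[t - 1][r] * transition[r][s] for r in range(n_states)
            )

    # Backward pass: probability of the remaining data given the current state.
    for t in range(n_steps - 2, -1, -1):
        for r in range(n_states):
            backward[t][r] = sum(
                transition[r][s] * emissions[t + 1][s] * backward[t + 1][s]
                for s in range(n_states)
            )

    likelihood = sum(forward[-1])
    return forward, backward, likelihood


# Hypothetical two-state chain of length three.
initial = [0.5, 0.5]
transition = [[0.9, 0.1], [0.2, 0.8]]
emissions = [[1.0, 0.1], [0.3, 0.7], [1.0, 0.1]]
forward, backward, likelihood = forward_backward(initial, transition, emissions)
```

At every position, summing the product of the forward and backward probabilities over the hidden states gives the same total likelihood, so any position can serve as the point at which the two halves are joined.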

28 Ordering People in a Pedigree  [Figure not preserved; visible label: k]

29 Forward/Backward Probabilities  [Figure: genotypes G_1, G_2, ..., G_k, G_{k+1}, ... shown with phenotypes X_1, X_2, ..., X_k, X_{k+1}, ...]

30 Total Probability

31 Calculating Forward Probability

32 Calculating Backward Probabilities

33 Example  [Pedigree figure: individuals 1-9, with genotype labels AA, aa, Aa, aa, Aa, aa]

34 Using 5 as Proband

35 Example – Calculations Needed  [Pedigree figure: individuals 1-9, with genotype labels AA, aa, Aa, aa, Aa, aa]

36 Example – Calculations Needed  [Same pedigree figure]

37 Example – Calculations Needed  [Same pedigree figure]

38 Example – Calculations Needed  [Same pedigree figure]

39 Forward Probabilities: Founders

40 Backward Probabilities: Leaves

41 Example – Calculations Completed  [Pedigree figure: individuals 1-9, with genotype labels AA, aa, Aa, aa, Aa, aa]

42 Backward Probability: Individual 4  1 means affected; 0 means not affected.

43 Example – Calculations Completed  [Same pedigree figure]

44 Forward Probability: Individual 5

45 Example – Calculations Completed  [Same pedigree figure]

46 Backward Probability: Individual 5

47 Example – Calculations Completed  [Same pedigree figure]

48 Example – Final Calculation

49 Efficiency of the Elston-Stewart Algorithm  In our example, each genotype was defined without ambiguity. There were no sums over genotypes.  In general, this is not true and the forward and backward probabilities must sum over the possible parental genotypes or spousal genotypes respectively.  The ES algorithm calculations increase exponentially with respect to the number of genotypes.  Fortunately, the ES algorithm calculations only increase linearly in the number of pedigree members.

50 Lander-Green Algorithm  View the pedigree as a hidden Markov model on haplotypes.  The pattern of inheritance at a single locus is described by v, a vector of 0's and 1's of length 2(n - f) indicating whether each allele is paternal (0) or maternal (1) in origin, where n is the number of pedigree members and f is the number of founders.  There are 2^(2(n - f)) such inheritance vectors possible.
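The size of the inheritance-vector space can be checked by enumeration. This is a minimal sketch. The function name and the nuclear-family example (two founders, two children) are hypothetical.

```python
from itertools import product


def inheritance_vectors(n_individuals, n_founders):
    """Iterate over every binary inheritance vector of length 2(n - f)."""
    n_bits = 2 * (n_individuals - n_founders)  # two gametes per non-founder
    return product((0, 1), repeat=n_bits)


# Hypothetical nuclear family: 2 founders and 2 children give 4 bits.
vectors = list(inheritance_vectors(4, 2))
n_vectors = len(vectors)  # equals 2 ** 4
```

Because the number of vectors doubles with every additional gamete, enumeration of this kind is practical only for small pedigrees.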

51 Inheritance Vector v
[Pedigree figure: individuals 1-9, with genotype labels AA, aa, aA, aa, Aa, aa]
Gamete    v
4M        0|1
4P        0|1
5M        0|1
5P        0|1
7M        1
7P        0|1
8M        0
8P        0|1
9M        0
9P        0|1

52 Conditional Probability  Prior to viewing the data, all inheritance vectors are equally likely.
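In symbols, and assuming n pedigree members of whom f are founders, the uniform prior and the resulting conditional probability of an inheritance vector given the data X can be sketched as follows. The second line is Bayes' rule with the constant prior cancelled. It is offered as a plausible reading of the slide title rather than the slide's own formula.

```latex
P(v) = \frac{1}{2^{\,2(n-f)}},
\qquad
P(v \mid X) = \frac{P(X \mid v)\,P(v)}{\sum_{w} P(X \mid w)\,P(w)}
            = \frac{P(X \mid v)}{\sum_{w} P(X \mid w)} .
```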

53 Multiple Loci  Suppose there are l loci. Then the joint probability can be factored.  But, conditional on v_i, X_i is independent of all X_j with j < i.

54 Multiple Loci (cont)  And, conditional on the inheritance vectors of preceding loci, the inheritance vector at locus i is independent of all but the immediately preceding inheritance vector.
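The Markov dependence between adjacent inheritance vectors is usually made concrete through the recombination fraction. The sketch below assumes a standard simplification that is not stated on the slide. Each of the 2(n - f) meioses recombines independently between two adjacent loci with the same probability theta (no interference, no sex difference), so the transition probability depends only on how many positions differ. The function name and example values are hypothetical.

```python
from itertools import product


def transition_probability(v_prev, v_next, theta):
    """P(v_next | v_prev) when each bit flips independently with probability theta."""
    if len(v_prev) != len(v_next):
        raise ValueError("inheritance vectors must have the same length")
    # Each position that differs corresponds to one recombination event.
    flips = sum(a != b for a, b in zip(v_prev, v_next))
    return theta ** flips * (1.0 - theta) ** (len(v_prev) - flips)


# Hypothetical example: four meioses and recombination fraction 0.1.
v_prev = (0, 1, 0, 1)
theta = 0.1
p_same = transition_probability(v_prev, v_prev, theta)
row_total = sum(
    transition_probability(v_prev, v_next, theta)
    for v_next in product((0, 1), repeat=len(v_prev))
)
```

The row total equals 1, confirming that the function defines a proper conditional distribution over the next inheritance vector.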

55 Multiple Loci (cont)
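Combining the two conditional-independence statements from the preceding slides, the joint probability of the phenotypes X_1, ..., X_l and inheritance vectors v_1, ..., v_l would take the usual hidden Markov form. This is a reconstruction under those stated assumptions, not a transcription of the slide.

```latex
P(X_1, \ldots, X_l,\; v_1, \ldots, v_l)
  \;=\; P(v_1) \prod_{i=2}^{l} P(v_i \mid v_{i-1}) \prod_{i=1}^{l} P(X_i \mid v_i)
```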

