Complex Trait Genetics in Animal Models


1 Complex Trait Genetics in Animal Models
Will Valdar Oxford University

2 Mapping Genes for Quantitative Traits in Outbred Mice
Will Valdar Oxford University

3 What’s so great about mice?
Share ~99% of genes with humans
~90% of the two genomes can be partitioned into regions of conserved synteny
Shorter lifespans
You can do invasive experiments
You can breed them as you like, so you control the genetics

4 What is an inbred strain?
F20 generation: F = 99% (BALB/c)
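The 99% figure can be reproduced with a short calculation, sketched here under the assumption that F is the inbreeding coefficient and that the strain is produced by repeated brother-sister mating (the slide itself does not state the mating scheme). The recursion used is the standard one for full-sib mating; the function name is made up for illustration.

```python
def sib_mating_inbreeding(generations):
    """Inbreeding coefficient F after each generation of full-sib mating.

    Assumes unrelated, non-inbred founders and the standard recursion
    F_t = (1 + 2*F_{t-1} + F_{t-2}) / 4.
    """
    f_prev2, f_prev1 = 0.0, 0.0  # founders and their (outbred) offspring
    history = []
    for _ in range(generations):
        f_now = (1.0 + 2.0 * f_prev1 + f_prev2) / 4.0
        history.append(f_now)
        f_prev2, f_prev1 = f_prev1, f_now
    return history

f_by_generation = sib_mating_inbreeding(20)
print(round(f_by_generation[0], 3))    # first sib-mated generation
print(round(f_by_generation[-1], 3))   # after 20 generations of sib mating
```

Under these assumptions F passes 98% by generation 20, which is consistent with the rounded value on the slide.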

5

6 one inbred strain

7 two inbred strains

8 Mouse model of anxiety
Mice are not natural predators; they are prey to larger animals, and as such they like small dark places. This is an open field arena: a brightly lit open box, which we would expect to be anxiety-producing for a mouse. We place the mouse into this arena and use a video tracking device to monitor its behaviour.

9 Mouse model of anxiety
Panel labels: anxious mouse, non-anxious mouse
On the right, I show the movements of the mouse. The red lines indicate movement and the green spots indicate freezing positions. You can see that the mouse in the top panel is thoroughly exploring its new territory, moving all about the enclosure, including the central area away from the relatively “safe” walls of the open field arena. In the bottom panel, you can see completely contrasting behaviour. The mouse entered through the door on the right-hand side and froze as soon as it entered the arena; it is not exploring its new territory at all. An additional measure of anxiety is defecation.

10 F2 cross
Generations: F0, F1, F2

11 Phenotype scale: anxious, about average, non-anxious
Suppose we have a quantitative phenotype that is influenced by a genetic variant. For example, say our normally distributed phenotype is affected by a QTL accounting for a small fraction of the variance, 10% or so. If we plotted a histogram of the phenotype, it would look something like this. However, assuming the QTL has two alleles and acts additively, we would in fact be looking at three populations.

12 Quantitative Trait Locus
Those individuals homozygous for the low allele.

13 Quantitative Trait Locus
Those who are homozygous for the high allele.

14 Quantitative Trait Locus
And those who are heterozygous, and who tend to have an intermediate phenotype.

15 Quantitative Trait Locus
Here are the three genotypes at the QTL SNP. The first step is to encode them in some way so that they can be correlated with the phenotype.

16 Linear models
Also known as: ANOVA, ANCOVA, regression, multiple regression, linear regression
To keep it general, we’ll be using linear models. ANOVA, ANCOVA, regression, multiple regression and linear regression are different names that people use, but they are all special cases of the linear model.

17 QTL SNP coded as -1, 0, +1
A simple way is to represent the SNP by x and let x equal -1, 0 and +1 for the three genotypes in turn.

18 QTL SNP coded as -1, 0, +1
Having done this, it’s natural to model the effect of the genotype on the phenotype as a linear model, where y, representing the phenotype, is equal to mu plus a times x plus epsilon: y = mu + a*x + epsilon.

19 QTL SNP coded as -1, 0, +1
mu defines the mean.

20 QTL SNP coded as -1, 0, +1
a then describes the average effect of adding or taking away one copy of the high allele, and epsilon represents the error, accounting for the fact that animals with the same genotype may still have different phenotypes.
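As a rough illustration of this model, the sketch below simulates genotypes coded -1, 0, +1 and recovers mu and a by least squares. It is written in plain Python rather than the R used in the practical; the sample size, the 1:2:1 genotype proportions and the values of mu and a are assumptions made for the example, not figures from the slides.

```python
import random

def fit_simple_linear_model(x, y):
    """Least-squares estimates of mu and a in y = mu + a*x + epsilon."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a_hat = sxy / sxx
    mu_hat = mean_y - a_hat * mean_x
    return mu_hat, a_hat

rng = random.Random(1)
# Hypothetical F2-like sample: genotypes in 1:2:1 proportions.
x = [rng.choice([-1, 0, 0, 1]) for _ in range(2000)]
true_mu, true_a = 10.0, 0.5   # illustrative values only
y = [true_mu + true_a * xi + rng.gauss(0.0, 1.0) for xi in x]

mu_hat, a_hat = fit_simple_linear_model(x, y)
print(round(mu_hat, 2), round(a_hat, 2))
```

With this coding, a is the expected change in phenotype per extra copy of the high allele, so the two homozygotes differ by 2a on average.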

21 Hypothesis testing
The way we test whether there is a significant effect of genotype is by comparing two models, one representing a null hypothesis and the other an alternative hypothesis.
H0: the phenotype is determined by the mean plus some error.
H1: the phenotype is determined by the mean plus the genotype plus error.

22 Hypothesis testing
H0: y ~ 1
H1: y ~ 1 + x
And the way we write those formulas in R is like this: “y twiddles 1”, meaning y is influenced by the mean only, and “y twiddles 1 plus x”, meaning y is influenced by the mean plus x.

23 Hypothesis testing
H0: y ~ 1
H1: y ~ 1 + x
H1 vs H0: Does x explain a significant amount of the variation?
Our model comparison amounts to asking “does x explain a significant amount of the variation?” or, equivalently, “does the larger model fit the data significantly better than the smaller model, given the number of extra parameters required for that improvement?” The comparison of the two models yields a likelihood ratio, which can be written as a LOD score if you like.

24 Hypothesis testing
H0: y ~ 1
H1: y ~ 1 + x
H1 vs H0: Does x explain a significant amount of the variation?
Likelihood ratio; LOD score

25 Hypothesis testing
H0: y ~ 1
H1: y ~ 1 + x
H1 vs H0: Does x explain a significant amount of the variation?
Likelihood ratio; LOD score; chi-square test; p-value; logP
Or, more meaningfully, the likelihood ratio can be used in a chi-square test, yielding a p-value, which we often find convenient to express on the log scale as a logP (that is, -log10 of the p-value).

26 Hypothesis testing
H0: y ~ 1
H1: y ~ 1 + x
H1 vs H0: Does x explain a significant amount of the variation?
Likelihood ratio; LOD score; chi-square test; p-value; logP
Linear models only: SS explained / SS unexplained; F-test (or t-test)
In the special case of linear models we could do a likelihood ratio test, but it turns out to be more accurate to use the explained and unexplained sums of squares, leading to an F-test, and to get our p-value and logP that way.
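The quantities named on these slides can be computed for a single marker roughly as follows. This is a minimal sketch assuming normally distributed errors and taking logP to mean -log10 of the p-value; the function name and the simulated data are made up for illustration. Only the chi-square p-value is computed, because the Python standard library has no F distribution; the F statistic is returned without a p-value.

```python
import math
import random

def compare_models(x, y):
    """Compare H0: y ~ 1 against H1: y ~ 1 + x for a single predictor."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    rss0 = sum((yi - mean_y) ** 2 for yi in y)   # unexplained SS under H0
    rss1 = rss0 - sxy ** 2 / sxx                 # unexplained SS under H1
    lrt = n * math.log(rss0 / rss1)              # 2 * log likelihood ratio
    lod = lrt / (2.0 * math.log(10.0))           # log10 likelihood ratio
    p_value = math.erfc(math.sqrt(lrt / 2.0))    # chi-square test, 1 df
    logp = -math.log10(p_value) if p_value > 0.0 else float("inf")
    f_stat = (rss0 - rss1) / (rss1 / (n - 2))    # SS explained vs unexplained
    return {"LRT": lrt, "LOD": lod, "p": p_value, "logP": logp, "F": f_stat}

rng = random.Random(2)
x = [rng.choice([-1, 0, 0, 1]) for _ in range(500)]    # hypothetical genotypes
y = [0.4 * xi + rng.gauss(0.0, 1.0) for xi in x]       # illustrative effect
result = compare_models(x, y)
print({name: round(value, 3) for name, value in result.items()})
```

In a real analysis the F statistic would be referred to an F distribution with 1 and n - 2 degrees of freedom, which is the more accurate route the slide recommends for linear models.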

27 Hypothesis testing
H0: y ~ 1 + x1
H1: y ~ 1 + x1 + x2
H1 vs H0: Does x2 explain a significant amount of the variation after accounting for x1? In other words, is x2 significant conditional on x1?
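For nested linear models, the conditional question can be written as an F statistic on the residual sums of squares. The notation below is introduced for this sketch and is not from the slides: RSS_0 and RSS_1 are the residual sums of squares under H0 and H1, p_0 and p_1 are the numbers of fitted coefficients (here 2 and 3, assuming one coefficient per predictor plus the mean), and n is the number of animals.

```latex
F = \frac{(\mathrm{RSS}_0 - \mathrm{RSS}_1)/(p_1 - p_0)}{\mathrm{RSS}_1/(n - p_1)},
\qquad
F \sim F_{\,p_1 - p_0,\; n - p_1} \quad \text{under } H_0 .
```

A large F means that x2 explains variation that x1 has not already accounted for.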

28 F2 cross
Generations: F0, F1, F2

29 QTL
This is the chromosome where the QTL is, and each of those lines represents a SNP on that chromosome. The marked SNP is the QTL.

30 QTL logP
If we perform our hypothesis test at every SNP and plot the results on a graph, we get something like this. SNPs that have no influence on the trait have low scores, and the QTL SNP has the highest score. But you’ll also notice that the SNPs around the QTL have high scores too.

31 QTL logP
Labels: highly significant SNP, non-significant SNP

32 QTL logP

33 QTL logP
Genotype every animal at SNP 1 and compare the …

34 Chromosome scan for F2
Plot for a typical chromosome: goodness of fit (logP) against position along the whole chromosome (Mb; 200 Mb shown), with the QTL and the significance threshold marked.
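The shape of such a scan, with a peak at the QTL and raised scores at nearby markers, can be reproduced in a small simulation. The sketch below is a simplified, hypothetical F2: each gamete is a mosaic of the two founder strains with a fixed crossover probability between adjacent markers, and one marker is made causal. All sizes, the crossover probability and the effect size are assumptions; logP is taken as -log10 of a chi-square p-value.

```python
import math
import random

def simulate_gamete(n_markers, recomb_prob, rng):
    """One gamete from an F1 parent: a 0/1 mosaic of the two founder strains."""
    allele = rng.randint(0, 1)
    gamete = []
    for _ in range(n_markers):
        gamete.append(allele)
        if rng.random() < recomb_prob:   # crossover before the next marker
            allele = 1 - allele
    return gamete

def logp_for_marker(x, y):
    """logP for H1: y ~ 1 + x against H0: y ~ 1 (chi-square test, 1 df)."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    rss0 = sum((v - mean_y) ** 2 for v in y)
    rss1 = rss0 - sxy ** 2 / sxx
    p_value = math.erfc(math.sqrt(n * math.log(rss0 / rss1) / 2.0))
    return -math.log10(p_value) if p_value > 0.0 else float("inf")

rng = random.Random(7)
n_animals, n_markers, qtl_index = 500, 40, 20   # illustrative sizes
animals = []
for _ in range(n_animals):
    g1 = simulate_gamete(n_markers, 0.05, rng)
    g2 = simulate_gamete(n_markers, 0.05, rng)
    animals.append([a + b - 1 for a, b in zip(g1, g2)])   # codes -1, 0, +1
y = [0.5 * animal[qtl_index] + rng.gauss(0.0, 1.0) for animal in animals]

scan = [logp_for_marker([animal[j] for animal in animals], y)
        for j in range(n_markers)]
best_marker = max(range(n_markers), key=lambda j: scan[j])
print(best_marker, round(scan[best_marker], 1))
```

Because F2 animals carry few recombinations, markers many positions away from the causal one still inherit much of its signal, which is why the peak in an F2 scan is broad.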

35 Advanced intercross lines (AILs)
Generations: F0, F1, F2, F3, F4
Darvasi & Soller (1995) Genetics

36 F12 cross
Generations: F0, F1, F2, F12

37 Chromosome scan for F12
Plot for a typical chromosome: goodness of fit (logP) against position along the whole chromosome (Mb; 200 Mb shown), with the QTL and the significance threshold marked.

38 Practical
Fitting a linear model to test a marker-phenotype association
Single marker association on an F2
Permutation test
Single marker association on an AIL (F12)
Conditional modelling of loci
Start Firefox, File->Open and go to F:\valdar\ThursdayAfternoonAnimals\practical.R
Start R

39 Practical: F2 cross
Significance thresholds (logP): Bonferroni = 2.6; permutation ~ 2.1; uncorrected = 1.3

40 Practical: F2 cross
Significance thresholds (logP): Bonferroni = 2.6; permutation ~ 2.1; uncorrected = 1.3
Generalized extreme value (GEV) distribution
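The three kinds of threshold can be sketched as follows, assuming a 5% significance level and logP = -log10(p). The marker count of 20 is an assumption, chosen because it reproduces the Bonferroni value of 2.6; the genotypes and phenotype are simulated, so the permutation threshold printed here will not match the practical's value exactly. The sketch takes the empirical 95th percentile of the permutation maxima rather than fitting a GEV distribution.

```python
import math
import random

def simulate_gamete(n_markers, recomb_prob, rng):
    """One F1 gamete: a 0/1 mosaic of the two founder strains."""
    allele, gamete = rng.randint(0, 1), []
    for _ in range(n_markers):
        gamete.append(allele)
        if rng.random() < recomb_prob:
            allele = 1 - allele
    return gamete

def centred_markers(animals, n_markers):
    """Per marker: the centred genotype vector and its sum of squares."""
    out = []
    for j in range(n_markers):
        col = [animal[j] for animal in animals]
        mean = sum(col) / len(col)
        xc = [v - mean for v in col]
        out.append((xc, sum(v * v for v in xc)))
    return out

def max_logp(markers, y):
    """Largest single-marker logP across the scan (chi-square test, 1 df)."""
    n = len(y)
    mean_y = sum(y) / n
    yc = [v - mean_y for v in y]
    syy = sum(v * v for v in yc)
    best = 0.0
    for xc, sxx in markers:
        sxy = sum(a * b for a, b in zip(xc, yc))
        lrt = n * math.log(syy / (syy - sxy ** 2 / sxx))
        p_value = math.erfc(math.sqrt(lrt / 2.0))
        best = max(best, -math.log10(p_value) if p_value > 0.0 else float("inf"))
    return best

rng = random.Random(3)
n_animals, n_markers = 200, 20            # assumed sizes
animals = [[a + b - 1 for a, b in zip(simulate_gamete(n_markers, 0.1, rng),
                                      simulate_gamete(n_markers, 0.1, rng))]
           for _ in range(n_animals)]
markers = centred_markers(animals, n_markers)
y = [rng.gauss(0.0, 1.0) for _ in range(n_animals)]   # stand-in phenotype

uncorrected = -math.log10(0.05)
bonferroni = -math.log10(0.05 / n_markers)
maxima = []
for _ in range(200):                      # 200 permutations keeps this quick
    shuffled = y[:]
    rng.shuffle(shuffled)                 # breaks any genotype-phenotype link
    maxima.append(max_logp(markers, shuffled))
maxima.sort()
permutation = maxima[int(0.95 * len(maxima))]
print(round(uncorrected, 1), round(bonferroni, 1), round(permutation, 1))
```

Bonferroni treats the markers as independent tests; because linked markers are correlated, the permutation threshold usually falls below it, as on the slide.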

41 Practical: F12 phenotype ~ MARKER

42 Practical: F12 phenotype ~ MARKER phenotype ~ m37 + MARKER

43 Practical: F12 phenotype ~ MARKER phenotype ~ m37 + MARKER

44 Practical: F12 phenotype ~ MARKER phenotype ~ m37 + MARKER

45 Practical: F12 phenotype ~ MARKER phenotype ~ m37 + MARKER
phenotype ~ m37 + m29 + MARKER

46 F2, F18: multilocus approach

47 F2, F18: multilocus approach

48 F2, F18: multilocus approach
Breeding in a small population

49 F2, F18: multilocus approach
Population structure: gross genetic differences between groups, where groups = families

50 F2: multilocus approach
Population structure: gross genetic differences between groups, where groups = families

51 Heterogeneous Stocks
The title of my talk refers to the mouse version of human association studies. In fact, the project I’m going to describe could be seen as something in between an association study and an inbred line cross. The population we’ve used is the heterogeneous stock mice pictured here, also known as the HS.

52 Heterogeneous Stock
Pseudo-random mating for 50 generations
HS is an outbred population: large number of recombinants
HS has 8 progenitors: greater haplotypic diversity, so it is more likely that QTLs will segregate
Avg. distance between recombinations: HS ~2 cM

53 Heterogeneous Stock; F2 Intercross
Heterogeneous Stock: pseudo-random mating for 50 generations
F2 Intercross: cross (x), then F1, then F2
HS is an outbred population: large number of recombinants
HS has 8 progenitors: greater haplotypic diversity, so it is more likely that QTLs will segregate
Avg. distance between recombinations: HS ~2 cM; F2 intercross ~30 cM

54 124 Phenotypes
Anxiety [24]
Asthma [13]
Biochemistry [15]
Bone Morphology [23]
Diabetes [16]
Haematology [15]
Immunology [9]
Weight/size related [8]
Wound Healing [1]

55 Intraperitoneal Glucose Tolerance Test

56

57

58 How to select peaks: a simulated example
Realistic example

59 How to select peaks: a simulated example
Simulate 7 x 5% QTLs (i.e., 35% genetic effect) + 20% shared environment effect + 45% noise = 100% variance
We’ve estimated 20% cage effects.
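A phenotype with this variance breakdown can be simulated along the following lines. This is a simplified sketch, not the simulation used in the talk: the QTLs are treated as independent biallelic markers coded -1, 0, +1 in 1:2:1 proportions, the shared environment is modelled as a random effect per cage, and the cage size of 4 is an assumption.

```python
import math
import random
import statistics

rng = random.Random(11)
n_mice, cage_size, n_qtl = 2000, 4, 7    # cage size is illustrative
# Genotype variance is 0.5 under 1:2:1 coding, so a**2 * 0.5 = 0.05 per QTL.
effect = math.sqrt(0.05 / 0.5)

genetic, cage, noise = [], [], []
cage_effect = 0.0
for i in range(n_mice):
    if i % cage_size == 0:               # start a new cage
        cage_effect = rng.gauss(0.0, math.sqrt(0.20))
    genotypes = [rng.choice([-1, 0, 0, 1]) for _ in range(n_qtl)]
    genetic.append(effect * sum(genotypes))
    cage.append(cage_effect)
    noise.append(rng.gauss(0.0, math.sqrt(0.45)))

phenotype = [g + c + e for g, c, e in zip(genetic, cage, noise)]
total = statistics.pvariance(phenotype)
shares = {
    "genetic": statistics.pvariance(genetic) / total,
    "cage": statistics.pvariance(cage) / total,
    "noise": statistics.pvariance(noise) / total,
}
print(round(total, 2), {name: round(v, 2) for name, v in shares.items()})
```

The three shares come out near 35%, 20% and 45%; they do not sum to exactly 100% in any one sample because the components are only uncorrelated in expectation.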

60 Simulated example

61 Forward selection
phenotype ~ ?

62 condition on 1 peak phenotype ~ peak 1 + ?

63 condition on 2 peaks phenotype ~ peak 1 + peak 2 + ?

64 condition on 3 peaks phenotype ~ peak 1 + peak 2 + peak 3 + ?

65 condition on 4 peaks phenotype ~ peak 1 + peak 2 + peak 3 + peak 4 + ?

66 condition on 5 peaks phenotype ~ peak 1 + peak 2 + peak 3 + peak 4 + peak 5 + ?

67 condition on 6 peaks phenotype ~ peak 1 + peak 2 + peak 3 + peak 4 + peak 5 + peak 6 + ?

68 condition on 7 peaks phenotype ~ peak 1 + peak 2 + peak 3 + peak 4 + peak 5 + peak 6 + peak 7 + ?

69 condition on 8 peaks phenotype ~ peak 1 + peak 2 + peak 3 + peak 4 + peak 5 + peak 6 + peak 7 + peak 8 + ?

70 condition on 9 peaks phenotype ~ peak 1 + peak 2 + peak 3 + peak 4 + peak 5 + peak 6 + peak 7 + peak 8 + peak 9 + ?

71 condition on 10 peaks phenotype ~ peak 1 + peak 2 + peak 3 + peak 4 + peak 5 + peak 6 + peak 7 + peak 8 + peak 9 + peak 10 + ?

72 condition on 11 peaks phenotype ~ peak 1 + peak 2 + peak 3 + peak 4 + peak 5 + peak 6 + peak 7 + peak 8 + peak 9 + peak 10 + peak 11 + ?
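The sequence of fits above is a forward selection: at each step every remaining marker is tested conditional on the peaks already chosen, and the best one is added. A minimal sketch follows. It uses a stopping threshold on logP and a cap on the number of peaks, both of which are assumptions since the slides do not state a stopping rule, and it conditions by residualising the phenotype and the remaining markers on each chosen peak, which gives the same residual sums of squares as refitting the full multiple regression. The simulated data are purely illustrative.

```python
import math
import random

def centre(v):
    mean = sum(v) / len(v)
    return [a - mean for a in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def residualise(v, z):
    """Remove from v its least-squares projection on z."""
    b = dot(v, z) / dot(z, z)
    return [a - b * c for a, c in zip(v, z)]

def forward_selection(marker_columns, y, logp_threshold, max_peaks):
    n = len(y)
    y_res = centre(y)
    x_res = {j: centre(col) for j, col in enumerate(marker_columns)}
    chosen = []
    while x_res and len(chosen) < max_peaks:
        rss0 = dot(y_res, y_res)
        best_j, best_logp = None, 0.0
        for j, x in x_res.items():
            sxx = dot(x, x)
            if sxx < 1e-9:               # marker already explained by the peaks
                continue
            rss1 = rss0 - dot(x, y_res) ** 2 / sxx
            p_value = math.erfc(math.sqrt(n * math.log(rss0 / rss1) / 2.0))
            logp = -math.log10(p_value) if p_value > 0.0 else float("inf")
            if logp > best_logp:
                best_j, best_logp = j, logp
        if best_j is None or best_logp < logp_threshold:
            break
        z = x_res.pop(best_j)
        chosen.append(best_j)
        y_res = residualise(y_res, z)    # condition the phenotype on the new peak
        x_res = {j: residualise(x, z) for j, x in x_res.items()}
    return chosen

rng = random.Random(4)
n_animals, n_markers, causal = 600, 30, [3, 12, 25]   # hypothetical setup
columns = [[rng.choice([-1, 0, 0, 1]) for _ in range(n_animals)]
           for _ in range(n_markers)]
y = [sum(0.5 * columns[j][i] for j in causal) + rng.gauss(0.0, 1.0)
     for i in range(n_animals)]
peaks = forward_selection(columns, y, logp_threshold=3.0, max_peaks=11)
print(peaks)
```

Without a stopping rule the procedure keeps adding peaks, some of which will be false positives.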

73 Peaks chosen by forward selection
We have recovered all the QTLs in this simulation, but also some false positives. In other simulations we missed some. How can we tell which peaks are genuine? One way to think about this is to suppose we had chosen a slightly different set of HS mice: the peak heights would have been different, and our forward selection would have chosen a slightly different set of peaks.

74 Bootstrap sampling
10 subjects: 1 2 3 4 5 6 7 8 9 10
To deal with this we’ve taken a bootstrapping approach. Let me explain what this is.

75 Bootstrap sampling
10 subjects: 1 2 3 4 5 6 7 8 9 10
Bootstrap sample from 10 subjects, drawn by sampling with replacement
So what I will do is create a bootstrap sample of the 2000 mice and repeat the forward selection.
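Drawing a bootstrap sample is a one-line operation: pick as many subjects as there are in the original set, with replacement. The sketch below does this for the 10 subjects on the slide; the function name is made up for illustration.

```python
import random

def bootstrap_sample(subjects, rng):
    """Draw len(subjects) subjects with replacement."""
    return [rng.choice(subjects) for _ in subjects]

subjects = list(range(1, 11))       # the 10 subjects on the slide
rng = random.Random(0)
sample = bootstrap_sample(subjects, rng)
print(sorted(sample))               # some subjects repeat, others are absent
print(len(set(sample)), "distinct subjects out of", len(subjects))
```

Because some subjects are drawn more than once and others not at all, each bootstrap sample behaves like a slightly different set of mice.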

76

77 Forward selection on a bootstrap sample

78 Forward selection on a bootstrap sample

79 Forward selection on a bootstrap sample

80 Bootstrap evidence mounts up…

81 In 1000 bootstraps… Model Inclusion Probability
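The model inclusion probability of a locus can be estimated as the fraction of bootstrap samples in which the selection procedure picks it. The sketch below shows the counting step. To stay short it uses a toy selector (any marker whose single-marker logP exceeds a threshold) as a stand-in for forward selection, and 200 bootstraps rather than 1000; the data, threshold and function names are all assumptions.

```python
import math
import random

def marginal_logp(x, y):
    """Single-marker logP for y ~ 1 + x against y ~ 1 (chi-square, 1 df)."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    if sxx == 0.0:                       # marker not variable in this sample
        return 0.0
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    rss0 = sum((v - mean_y) ** 2 for v in y)
    rss1 = rss0 - sxy ** 2 / sxx
    p_value = math.erfc(math.sqrt(n * math.log(rss0 / rss1) / 2.0))
    return -math.log10(p_value) if p_value > 0.0 else float("inf")

def select_markers(genotypes, y, threshold):
    """Toy selector standing in for forward selection."""
    n_markers = len(genotypes[0])
    return {j for j in range(n_markers)
            if marginal_logp([row[j] for row in genotypes], y) > threshold}

def inclusion_probabilities(genotypes, y, n_bootstraps, threshold, rng):
    n, n_markers = len(y), len(genotypes[0])
    counts = [0] * n_markers
    for _ in range(n_bootstraps):
        idx = [rng.randrange(n) for _ in range(n)]   # resample animals
        boot_g = [genotypes[i] for i in idx]
        boot_y = [y[i] for i in idx]
        for j in select_markers(boot_g, boot_y, threshold):
            counts[j] += 1
    return [c / n_bootstraps for c in counts]

rng = random.Random(5)
n_animals, n_markers, causal = 300, 10, 2            # hypothetical setup
genotypes = [[rng.choice([-1, 0, 0, 1]) for _ in range(n_markers)]
             for _ in range(n_animals)]
y = [0.7 * row[causal] + rng.gauss(0.0, 1.0) for row in genotypes]
probs = inclusion_probabilities(genotypes, y, 200, 3.0, rng)
print([round(p, 2) for p in probs])
```

A genuine locus is selected in nearly every bootstrap sample, whereas a false positive tends to appear in only a minority of them.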

82

83

84 854 loci in all phenotypes, 84 diabetes loci

85 854 loci in all phenotypes, 84 diabetes loci

86 Servin B, Stephens M (2007)

87 Bayesian Multiple QTL modelling
Kilpikari R, Sillanpaa MJ (2003) Bayesian analysis of multilocus association in quantitative and qualitative traits. Genet Epidemiol 25:
Yi N (2004) A unified Markov chain Monte Carlo framework for mapping multiple quantitative trait loci. Genetics 167:
Servin B, Stephens M (2007) Imputation-based analysis of association studies: candidate regions and quantitative traits. PLoS Genet 3: e114
Fridley BL (2008) Bayesian variable and model selection methods for genetic association studies. Genet Epidemiol.

88 The Collaborative Cross
Heterogeneous Stocks (HS): outbreed to give an outbred population
Collaborative Cross: recombinant inbred lines
Churchill et al 2004; Broman 2005; Valdar et al 2006

