Presentation on theme: "QTL Analysis: Concept". Presentation transcript:
Slide 1: QTL Analysis: Concept
[Figure: generation procedure. Parents A × B, then F1, F2, and F2:3. Alternatives: BC1, RIL, DHL. Field: plant height (PHT). Laboratory: marker genotypes. Office: LOD score for PHT along chromosome 1.]
Field and laboratory data (one column per individual):
PHT [cm]    210  190  203  159  206  ..  171
Marker 1    B    B    H    H    A    ..  A
Marker 2    H    A    H    A    A    ..  H
Marker 3    B    B    H    H    H    ..  A
Marker 4    H    H    B    B    B    ..  H
Marker 5    H    B    H    H    A    ..  B
Marker N    A    H    H    H    A    ..  A
QTL mapping is a multi-step procedure that involves field and laboratory work as well as an elaborate statistical analysis.
In general, two homozygous lines that differ significantly for the trait under study are crossed. The F1 hybrid is selfed to produce a segregating F2 population.
F2 individuals are genotyped using molecular markers.
F2 individuals are selfed to produce F2:3 lines, which provide enough seed for repeated field trials.
QTL mapping idea (Lynch and Walsh): By crossing two lines, linkage disequilibrium is created between loci that differ between the parental lines. This creates associations between marker loci and linked segregating QTL.
Experimental designs:
F2:3: in contrast to all other populations listed here, three marker classes can be observed; therefore, dominance can be evaluated.
AIL (advanced intercross lines): randomly mated populations; higher resolution, but decreased power of QTL detection.
RIL: homozygous genetic background; field trials can be repeated in multiple locations and years.
DHL
BC
Marker information and phenotypic data are combined, and statistical tools are used to map and characterize QTL.
Slide 2: QTL Analysis: Single Marker Analysis
[Figure: mean plant height (cm) of marker classes (XMC), y-axis 160 to 240 cm. Total: 196. umc157 (AA, Aa, aa): 195, 197; F = 0.48 ns. umc130 (AA, Aa, aa): 201, 196, 191; F = 6.47**.]
A basic method to identify markers linked to QTL is single marker analysis.
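As a minimal sketch of the F test behind single marker analysis, the Python below computes a one-way ANOVA F statistic across the three marker classes. The individual plant heights are hypothetical values chosen only so that the class means match the umc130 means on the slide (201, 196, 191 cm); the function name `anova_f` is also an assumption and not part of the original analysis, so the resulting F value will not match the slide.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups of observations."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of class means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individuals around their own class mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical plant heights (cm) per marker class; not data from the slides
heights_by_class = {
    "AA": [205, 199, 203, 197],
    "Aa": [198, 194, 196, 196],
    "aa": [189, 193, 190, 192],
}
f_value = anova_f(list(heights_by_class.values()))
print(f"F = {f_value:.2f} on 2 and 9 degrees of freedom")
```

An F value that is large relative to the critical value of the F distribution indicates that the marker class means differ, which is the evidence that the marker is linked to a QTL.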
Slide 3: QTL Analysis: Single Marker Model (F2)
[Figure: marker locus M/m and QTL Q/q separated by recombination frequency r.]
Expected QTL genotypic frequencies conditional on marker genotype:
Marker      QQ         Qq               qq         Marker mean
MM          (1-r)^2    2r(1-r)          r^2        μ(MM)
Mm          r(1-r)     (1-r)^2 + r^2    r(1-r)     μ(Mm)
mm          r^2        2r(1-r)          (1-r)^2    μ(mm)
QTL value   μ1         μ2               μ3
Additive effect: a = (μ1 - μ3) / 2
Dominance effect: d = μ2 - (μ1 + μ3) / 2
The mean of each marker genotype is equal to the sum, over the QTL genotypes, of the frequency of each QTL genotype given the marker genotype times the value of that QTL genotype.
Example: μ(MM) = μ1 (1-r)^2 + μ2 2r(1-r) + μ3 r^2
We have three equations but four parameters (μ1 to μ3, r). QTL effects and the position of the QTL are confounded. We can only solve for the QTL effects if r is fixed.
F tests on the contrasts of marker classes test the following hypotheses: a > 0, d > 0, r < 0.5
Schön, 2002
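The conditional frequencies and the marker class means can be sketched in a few lines of Python. This is an illustration only: the function names and the QTL genotype values (206, 196, 186 cm) are hypothetical, chosen so that a recombination frequency of r = 0.25 gives marker class means of 201, 196, and 191 cm.

```python
def qtl_freqs_given_marker(r):
    """Frequencies of (QQ, Qq, qq) conditional on the F2 marker genotype."""
    return {
        "MM": ((1 - r) ** 2, 2 * r * (1 - r), r ** 2),
        "Mm": (r * (1 - r), (1 - r) ** 2 + r ** 2, r * (1 - r)),
        "mm": (r ** 2, 2 * r * (1 - r), (1 - r) ** 2),
    }


def marker_class_means(mu1, mu2, mu3, r):
    """Expected mean of each marker class, given QTL values for QQ, Qq, qq."""
    return {
        marker: f_qq_upper * mu1 + f_het * mu2 + f_qq_lower * mu3
        for marker, (f_qq_upper, f_het, f_qq_lower) in qtl_freqs_given_marker(r).items()
    }


# Hypothetical QTL genotype values (cm) for QQ, Qq, qq
means = marker_class_means(206, 196, 186, r=0.25)
print(means)
```

At r = 0 the marker class means equal the QTL genotype values, and at r = 0.5 all three marker classes have the same mean, so an unlinked marker shows no effect.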
Slide 4: QTL Analysis: Single Marker Model (F2)
Example: Plant height, umc130
X(MM) = 201 cm, X(Mm) = 196 cm, X(mm) = 191 cm
[Figure: Case 1, r = 0 between marker M/m and QTL Q/q; Case 2, r = 0.2 between marker M/m and QTL Q/q.]
In single marker analysis, the only information available is the mean of each marker class. Based on this information, it is possible to determine whether a marker is linked to a QTL. However, it is not possible to determine the effect of the QTL, because QTL effect and QTL position are confounded.
[Table: PHT (cm) for values of r from 0 to 0.4, with rows for additive effect, X(QQ), X(Qq), and X(qq).]
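The confounding can be made concrete with a short Python sketch that solves the F2 single marker model for the QTL effects at several assumed values of r. It assumes the usual parameterization in which QQ, Qq, and qq have values m + a, m + d, and m - a; under that assumption the marker contrasts are (X(MM) - X(mm)) / 2 = a (1 - 2r) and X(Mm) - (X(MM) + X(mm)) / 2 = d (1 - 2r)^2. The function name is hypothetical.

```python
def implied_qtl_effects(mean_MM, mean_Mm, mean_mm, r):
    """QTL effects and genotype values implied by marker class means for a fixed r."""
    if not 0 <= r < 0.5:
        raise ValueError("r must satisfy 0 <= r < 0.5")
    shrink = 1 - 2 * r
    a = (mean_MM - mean_mm) / 2 / shrink
    d = (mean_Mm - (mean_MM + mean_mm) / 2) / shrink ** 2
    # Midpoint m between the two QTL homozygotes, recovered from the Mm class
    m = mean_Mm - d * ((1 - r) ** 2 + r ** 2)
    return {"a": a, "d": d, "QQ": m + a, "Qq": m + d, "qq": m - a}


# Marker class means for umc130 from the slide (cm)
for r in (0.0, 0.2, 0.4):
    effects = implied_qtl_effects(201, 196, 191, r)
    print(f"r = {r}: a = {effects['a']:.2f}, "
          f"QQ = {effects['QQ']:.2f}, Qq = {effects['Qq']:.2f}, qq = {effects['qq']:.2f}")
```

The same three marker class means are compatible with a small QTL effect at the marker or a much larger effect further away, which is why r has to be fixed before the effects can be estimated.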
Slide 6: Dissecting A Quantitative Trait: Time Versus Resolution
[Figure: research time in years (axis: 1 to 5) versus resolution in bp (axis: 1, 1x10^4, 1x10^7) for NILs, RI QTL mapping, positional cloning, F2 QTL mapping, and associations.]
Slide 7: Resolution Versus Allelic Range
[Figure: alleles evaluated (axis: 1 to >40) versus resolution in bp (axis: 1, 1x10^4, 1x10^7) for associations in diverse germplasm, pedigree, associations in narrow germplasm, F2 or RIL mapping, positional cloning, and NIL.]
Slide 8: Evaluate whether nucleotide polymorphisms associate with phenotype
Natural populations
Exploit extensive recombination
Association Tests
[Figure: nucleotide alleles (T, A, G, C) shown against phenotypes of 1.3 m, 1.5 m, 1.4 m, 1.8 m, and 2.0 m.]
Spend a lot of time on this slide
Slide 9: Association mapping
Mainstay of human genetics
One of a few possible approaches
Reproducibility was an issue
Cystic fibrosis: Kerem, et al. (1989). Science 245
Alzheimer's disease: Corder et al. (1994). Nature Genet. 7
Slide 10: Associations may result from at least three causes
1. The locus is the cause of the phenotype.
2. The locus is in linkage disequilibrium with the cause of the phenotype (linked and highly correlated).
Slide 11: Complete Linkage Disequilibrium
[Figure: haplotypes at Locus 1 and Locus 2; D' = 1, r^2 = 1.]
Same mutational history and no recombination.
No resolution
Adapted from Rafalski (2002) Curr Opin Plant Biol 5
Slide 12: Linkage Disequilibrium
[Figure: haplotypes at Locus 1 and Locus 2; D' = 1, r^2 = 0.33.]
Different mutational history and no recombination.
Some resolution
Slide 13: Linkage Equilibrium
[Figure: haplotypes at Locus 1 and Locus 2; D' = 0, r^2 = 0.]
Same mutational history with recombination.
Resolution
Slide 14: 3. Population structure can produce associations
[Figure: alleles G and T in samples from the Andes and the U.S.; P << 0.001 and P = 0.04.]
These non-functional associations can be accounted for by estimating the population structure using random markers.
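A small Python sketch can show how population structure alone produces an association. All data below are hypothetical: two subpopulations differ in both allele frequency and mean height, while within each subpopulation height is unrelated to the allele. Comparing within subpopulations is a deliberate simplification of estimating structure from random markers, not the method used in the studies cited.

```python
def allele_mean_difference(records):
    """Mean trait of T carriers minus mean trait of G carriers."""
    t_values = [height for allele, height in records if allele == "T"]
    g_values = [height for allele, height in records if allele == "G"]
    return sum(t_values) / len(t_values) - sum(g_values) / len(g_values)


# Hypothetical heights (m): T is common in the shorter subpopulation
andes = [("T", h) for h in (1.3, 1.5, 1.4, 1.4, 1.3, 1.5, 1.4, 1.4)]
andes += [("G", 1.3), ("G", 1.5)]
us = [("T", 1.7), ("T", 1.9)]
us += [("G", h) for h in (1.8, 1.8, 1.7, 1.9, 1.7, 1.9, 1.8, 1.8)]

pooled_difference = allele_mean_difference(andes + us)
within_differences = [allele_mean_difference(andes), allele_mean_difference(us)]
print(f"pooled: {pooled_difference:.2f} m")
print("within subpopulations:", [round(x, 2) for x in within_differences])
```

The pooled sample shows a clear height difference between alleles, yet the difference vanishes inside each subpopulation, so the pooled association reflects ancestry rather than function.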
Slide 16: QTL Analysis: Interval Mapping
Simple Interval Mapping
Composite Interval Mapping
[Figure: PlabQTL plot of the LOD profile along the chromosome (cM) with marker positions; peak at 96.]
The confounding of QTL effect and QTL position in single marker analysis is solved by using interval mapping approaches.
Slide 17: QTL Analysis: Power of QTL detection
[Figure: power of QTL detection (axis: 10 to 100) versus heritability, with curves for N = 600, N = 300, and N = 100.]
Power: Probability of finding a QTL
Heritability:
Simulation Model
Additive genetic model
F2 or F3 lines
Maize genetic model, marker interval 20 cM
16500 F2 F3 individuals that were partitioned into the different populations
Utz, H.F., and A.E. Melchinger. 1994. Comparison of different approaches to interval mapping of quantitative trait loci. In: J.W. van Ooijen and J. Jansen (ed.) Biometrics in Plant Breeding: Applications of Molecular Markers. Proceedings of the Ninth Meeting of the EUCARPIA Section Biometrics in Plant Breeding. CPRO-DLO, Wageningen, the Netherlands.
Utz and Melchinger, 1994
Slide 18: QTL Analysis: Conclusions
There are a number of QTL; in the analysis, the largest ones are the easiest to detect, BUT
this makes detection of the others difficult.
Models can adjust for this and detect the others.
Slide 19: QTL Analysis: Conclusions
QTL mapping combines qualitative linkage analysis with quantitative genetic analysis: association between marker genotypes and phenotypic trait values.
Single marker analysis is easy to perform, but QTL effect and position are confounded. This results in low power of QTL detection.
Interval mapping approaches increase power of QTL detection and allow the estimation of QTL effects and position.
Slide 20: QTL Analysis: Conclusions
Estimates of QTL effects and of the proportion of the genotypic variance explained by QTL are biased due to genotypic and environmental sampling.
Estimates of QTL position show low precision.
With large populations, a large number of QTL are found for complex traits.
When conducting a QTL study, you may wish to use a large population size.
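Slide 25: Properties of LD
The basic measure of LD is: D_{AB} = P_{AB} - p_A p_B
(D_{AB} = -D_{Ab} = -D_{aB} = D_{ab})
Haplotype frequencies:
        B                               b                               Total
A       P_{AB} = p_A p_B + D_{AB}       P_{Ab} = p_A p_b - D_{AB}       p_A
a       P_{aB} = p_a p_B - D_{AB}       P_{ab} = p_a p_b + D_{AB}       p_a
Total   p_B                             p_b                             1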
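Slide 26: Linkage Disequilibrium versus Generations Since its Creation
[Figure: disequilibrium r_{AB} (axis: 0.1 to 1) versus generation g (axis: 100 to 500), with curves for recombination rates c = 0.1, 0.02, 0.01, 0.005, and 0.001.]
Disequilibrium r_{AB} decays in proportion to (1 - c)^g, where c is the recombination rate and g is the number of generations.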
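The decay can be tabulated with a few lines of Python. This is a sketch assuming that disequilibrium starts at an initial value r0 (1.0 by default, which is an assumption here) and is multiplied by (1 - c) in each generation; the function names are hypothetical.

```python
import math


def ld_after(generations, c, r0=1.0):
    """Disequilibrium remaining after a number of generations at recombination rate c."""
    return r0 * (1.0 - c) ** generations


def generations_to_reach(fraction, c):
    """Generations needed for disequilibrium to fall to a fraction of its initial value."""
    return math.log(fraction) / math.log(1.0 - c)


for c in (0.1, 0.02, 0.01, 0.005, 0.001):
    half_life = generations_to_reach(0.5, c)
    print(f"c = {c}: LD after 100 generations = {ld_after(100, c):.3f}, "
          f"half-life = {half_life:.0f} generations")
```

Tightly linked loci (small c) keep most of their disequilibrium for hundreds of generations, while loosely linked loci lose it within a few dozen.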
Slide 27: Other Measures of LD
Can divide D_{AB} by the maximum value it can obtain:
D'_{AB} = D_{AB} / max(-p_A p_B, -p_a p_b) if D_{AB} < 0
D'_{AB} = D_{AB} / min(p_A p_b, p_a p_B) if D_{AB} > 0
The sampling properties of D'_{AB} are not well understood.
r^2_{AB} = D_{AB}^2 / (p_A p_a p_B p_b)
E(r^2) = 1 / (1 + 4Nc)
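The following Python sketch computes D, D', and r^2 from counts of the four haplotypes, following the definitions on these slides. The function name and the example counts are hypothetical, and the code assumes both loci are polymorphic (otherwise r^2 is undefined).

```python
def ld_measures(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D', r^2) for two biallelic loci from haplotype counts."""
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total
    p_B = (n_AB + n_aB) / total
    p_a, p_b = 1 - p_A, 1 - p_B
    d = p_AB - p_A * p_B
    # Scale D by the most extreme value it could take given the allele frequencies
    if d < 0:
        d_prime = d / max(-p_A * p_B, -p_a * p_b)
    elif d > 0:
        d_prime = d / min(p_A * p_b, p_a * p_B)
    else:
        d_prime = 0.0
    r2 = d ** 2 / (p_A * p_a * p_B * p_b)
    return d, d_prime, r2


# Hypothetical haplotype counts (AB, Ab, aB, ab)
examples = {
    "two haplotypes": (6, 0, 0, 6),
    "three haplotypes": (3, 0, 3, 6),
    "four haplotypes": (3, 3, 3, 3),
}
for label, counts in examples.items():
    d, d_prime, r2 = ld_measures(*counts)
    print(f"{label}: D = {d:.3f}, D' = {d_prime:.2f}, r^2 = {r2:.2f}")
```

With only three of the four haplotypes present, D' stays at 1 while r^2 drops below 1, which is why the two measures are reported side by side.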
Slide 28: LD generally decays rapidly with distance
[Figure: r^2 versus distance.]
Remington, D. L., et al. PNAS-USA 98 & unpublished
Slide 29: Population Effect on Linkage Disequilibrium in Maize
Investigator   Population Studied   Extent of LD
Gaut           Landraces            <1000 bp
Buckler        Diverse Inbreds      2000 bp
Rafalski       Elite Lines          100 kb? (6 kb euchromatin?)
Reviewed in Flint-Garcia, S. A. et al. Annual Review of Plant Biology 54
Slide 32: Population Stratification: American Indian and Diabetes
Knowler 1988 Am J Hum Genet 43
Slide 33: Use SSR Markers to Estimate Population Structure
Method: Pritchard, J. K., M. Stephens, and P. Donnelly. Inference of population structure using multilocus genotype data. Genetics 155
Example: Remington, D. L., et al. Proc Natl Acad Sci U S A 98
8 Stiff Stalk
38 Non-Stiff Stalk
30 Sub-Tropical
Slide 34: Logistic Regression Ratio Test For Association
Adapted from Pritchard case-control approach
Where:
C = candidate polymorphism distribution
T = trait value
Q = matrix of population membership
Evaluated by logistic regression
Significance evaluated by permutation based on haplotype distribution in populations
Pritchard, J. K., M. Stephens, N. A. Rosenberg, and P. Donnelly. Am J Hum Genet 67
Slide 35: Population Structure Estimates Greatly Reduce Estimated Type I Error Rates
[Figure: estimated Type I error rates for fields, flowering time, and height.]
Slide 36: Su1
Sugary1 is an isoamylase, a starch debranching enzyme
Sequenced fully from 32 diverse lines
Sampled 2 small parts of gene from 102 lines
Whitt, S. R., et al. PNAS-USA 99
[Figure: su1 gene diagram with a scale bar in bp.]
Slide 37: su1 Promoter & 1st Exon
Two distinct alleles
Sweet phenotype not associated
Slide 38: su1 Coding Region
Two distinct alleles
Sweet phenotype associated with W578R
Slide 40: Dwarf8 functional variation
[Figure: Dwarf8 gene diagram showing the 2 amino acid deletion, MITE, indel, and SH2 domain; days to silking relative to B73.]
When controlling for population structure, Dwarf8 variation associates with flowering time and plant height across 12 environments.
Thornsberry et al. 2001 Nat. Genet.
Slide 42: Statistics - Hypothesis Test
                                 Null Hypothesis True   Null Hypothesis False
Reject Null Hypothesis           Type I Error (α)       Correct
Fail to Reject Null Hypothesis   Correct                Type II Error (β)
Significance level (P-value threshold) = α
Power = 1 - β
Slide 43: Experimentwise P value
Each statistical test has a Type I error rate.
Test 20 independent SNPs, and on average one will be significant at P < 0.05 by chance alone.
Bonferroni correction essentially divides the P-value threshold by the number of tests.
It is often too conservative (no power), as markers are correlated.
Churchill and Doerge permutation tests help estimate the experimentwise P value.
They permute the entire genotype relative to the phenotypes.
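Both corrections can be sketched in Python. The permutation routine below follows the idea of shuffling phenotypes against intact multi-marker genotypes and recording the largest statistic per shuffle; the test statistic (absolute difference in class means between two genotype classes), the function names, and the simulated data are illustrative assumptions rather than the published procedure.

```python
import random


def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the experimentwise error rate at or below alpha."""
    return alpha / n_tests


def familywise_error(alpha, n_tests):
    """Chance of at least one false positive among independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests


def mean_difference(genotypes, phenotypes):
    """Absolute difference in mean phenotype between genotype classes 0 and 1."""
    class0 = [p for g, p in zip(genotypes, phenotypes) if g == 0]
    class1 = [p for g, p in zip(genotypes, phenotypes) if g == 1]
    if not class0 or not class1:
        return 0.0
    return abs(sum(class1) / len(class1) - sum(class0) / len(class0))


def permutation_threshold(markers, phenotypes, n_perm=1000, alpha=0.05, seed=1):
    """Experimentwise threshold: the (1 - alpha) quantile of the maximum statistic."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    max_stats = []
    for _ in range(n_perm):
        # Shuffling phenotypes keeps each individual's full genotype intact,
        # so correlations among markers are preserved
        rng.shuffle(shuffled)
        max_stats.append(max(mean_difference(m, shuffled) for m in markers))
    max_stats.sort()
    return max_stats[min(n_perm - 1, int((1.0 - alpha) * n_perm))]


# Simulated data: 20 markers scored 0/1 on 40 individuals, no true QTL
data_rng = random.Random(0)
markers = [[data_rng.randint(0, 1) for _ in range(40)] for _ in range(20)]
phenotypes = [data_rng.gauss(196, 10) for _ in range(40)]

print("Bonferroni per-test threshold:", bonferroni_threshold(0.05, 20))
print("Familywise error without correction:", round(familywise_error(0.05, 20), 3))
threshold = permutation_threshold(markers, phenotypes)
print("Permutation threshold for the largest mean difference:", round(threshold, 2))
```

Because the permutation threshold is derived from the observed marker data, it adapts to the correlation among markers instead of assuming that all tests are independent.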
Slide 44: Power of approaches
Sample size: 100 to 1000 are typical
Heritability of trait: H^2 = 10% - 90%
Depends on ability to measure trait
Interactions with environment
Depends on statistical properties of test
Slide 45: Association Approaches Complement QTL Linkage Mapping
                               Linkage (RILs)   Association
Resolution                     10,000,000 bp    2000 bp
Genome Scan                    High Power       Little Power
Allelic Range                  Low (1 or 2)     High (10s)
Statistical Power per Allele   High             Low