Presentation on theme: "QTL Analysis: Concept Parents F1 F2 F2:3 × A B Generation Procedure"— Presentation transcript:
1 QTL Analysis: Concept
[Figure: generation procedure. Parents A × B, then F1, F2, F2:3. Alternatives: BC1, RIL, DHL. Panels: Field (phenotyping), Laboratory (genotyping), Office (LOD score profile for PHT along chromosome 1).]
Example data (columns are individuals, rows are markers):
PHT [cm]    210  190  203  159  206  ..  171
Marker 1    B    B    H    H    A    ..  A
Marker 2    H    A    H    A    A    ..  H
Marker 3    B    B    H    H    H    ..  A
Marker 4    H    H    B    B    B    ..  H
Marker 5    H    B    H    H    A    ..  B
Marker N    A    H    H    H    A    ..  A
QTL mapping is a multi-step procedure that involves field and lab work as well as an elaborate statistical analysis.
In general, two homozygous lines that differ significantly for the trait under study are crossed. The F1 hybrid is selfed to produce a segregating F2 population.
F2 individuals are genotyped using molecular markers.
The F2 plants are selfed to produce F2:3 lines, which provide enough seed for repeated field trials.
QTL mapping idea (Lynch and Walsh): By crossing two lines, linkage disequilibrium is created between loci that differ between the parental lines. This creates associations between marker loci and linked segregating QTL.
Experimental designs:
F2:3: in contrast to all other populations listed here, three marker classes can be observed, so dominance can be evaluated.
AIL: advanced intercross lines; randomly mated populations with higher resolution but decreased power of QTL detection.
RIL: homozygous genetic background; field trials can be repeated in multiple locations and years.
DHL
BC
Marker information and phenotypic data are combined, and statistical tools are used to map and characterize QTL.
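2 QTL Analysis: Single Marker Analysis
A basic method to identify markers linked to QTL is single marker analysis.
[Figure: plant height (cm) by marker class, axis from 160 to 240 cm; marker-class means X(MC) in cm; overall mean 196 cm.]
umc157 (classes AA, Aa, aa): class means recovered as 195 and 197 cm, F = 0.48 ns
umc130 (classes AA, Aa, aa): class means 201, 196, 191 cm, F = 6.47**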
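The F values reported for each marker come from comparing the phenotypic means of the marker classes. A minimal sketch of that comparison as a one-way ANOVA follows. The individual plant heights are hypothetical, chosen only so that the class means equal the umc130 means (201, 196, 191 cm), and the function name `one_way_f` is an illustration, not something taken from the slides.

```python
from statistics import mean


def one_way_f(groups):
    """Return the one-way ANOVA F statistic for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of marker classes
    n = len(all_values)    # total number of individuals

    # Variation of the class means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individuals around their own class mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


# Hypothetical plant heights (cm) per marker genotype; not data from the slides.
heights_by_genotype = {
    "AA": [203, 199, 201],
    "Aa": [196, 198, 194],
    "aa": [191, 189, 193],
}

f_stat = one_way_f(list(heights_by_genotype.values()))
print(f"F = {f_stat:.2f} on (2, 6) degrees of freedom")
```

A large F indicates that the marker classes differ in mean plant height, which is the signal single marker analysis uses to declare a marker linked to a QTL.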
3 QTL Analysis: Single Marker Model (F2)
Expected QTL genotypic frequencies conditional on marker genotype:
Marker genotype    QQ          Qq               qq          Marker mean
MM                 (1-r)^2     2r(1-r)          r^2         μ(MM)
Mm                 r(1-r)      (1-r)^2 + r^2    r(1-r)      μ(Mm)
mm                 r^2         2r(1-r)          (1-r)^2     μ(mm)
QTL value          μ1          μ2               μ3
[Diagram: marker locus M/m and QTL Q/q separated by recombination frequency r. The slide also lists "Additive effect" and "Dominance effect"; their formulas are not preserved in this transcript.]
The mean of each marker genotype equals the sum, over QTL genotypes, of the frequency of each QTL genotype given the marker genotype times the value of that QTL genotype.
Example: $\mu_{MM} = \mu_1 (1-r)^2 + \mu_2 \, 2r(1-r) + \mu_3 r^2$
We have three equations but four parameters ($\mu_1$ to $\mu_3$, and $r$). QTL effects and the position of the QTL are confounded. We can only solve for the QTL effects if r is fixed.
F tests on the contrasts of marker classes test the following hypotheses: a > 0, d > 0, r < 0.5.
Schön, 2002
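The confounding of effect and position can be made explicit with a short derivation. This is a sketch under the usual F2 parameterization, which is an assumption here because the slide's own formulas are not preserved: the additive effect is taken as $a = (\mu_1 - \mu_3)/2$ and the dominance effect as $d = \mu_2 - (\mu_1 + \mu_3)/2$, where $\mu_1, \mu_2, \mu_3$ are the values of QQ, Qq and qq.

```latex
\begin{align*}
\mu_{MM} &= \mu_1 (1-r)^2 + \mu_2\, 2r(1-r) + \mu_3 r^2 \\
\mu_{Mm} &= \mu_1 r(1-r) + \mu_2 \left[(1-r)^2 + r^2\right] + \mu_3 r(1-r) \\
\mu_{mm} &= \mu_1 r^2 + \mu_2\, 2r(1-r) + \mu_3 (1-r)^2 \\[4pt]
\frac{\mu_{MM} - \mu_{mm}}{2}
  &= \frac{(\mu_1 - \mu_3)\left[(1-r)^2 - r^2\right]}{2} = a\,(1 - 2r) \\
\mu_{Mm} - \frac{\mu_{MM} + \mu_{mm}}{2}
  &= d\left[(1-r)^2 + r^2 - 2r(1-r)\right] = d\,(1 - 2r)^2
\end{align*}
```

Both marker contrasts shrink toward zero as r approaches 0.5, so a small contrast may reflect either a small QTL effect at the marker or a large QTL effect farther away.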
4 QTL Analysis: Single Marker Model (F2)
Example: plant height, marker umc130
X(MM) = 201 cm, X(Mm) = 196 cm, X(mm) = 191 cm
[Diagram: Case 1, r = 0 between marker M/m and QTL Q/q; Case 2, r = 0.2.]
In single marker analysis, the only information available is the mean of each marker class. From this information it is possible to determine whether a marker is linked to a QTL. However, it is not possible to determine the effect of the QTL, because QTL effect and QTL position are confounded.
[Table: PHT (cm) for X(QQ), X(Qq), X(qq) and the additive effect under r = 0, a second value of r, and r = 0.4; the second r value and the cell values are not preserved.]
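The confounding can be illustrated numerically. The sketch below assumes the standard F2 single-marker model, in which the marker contrasts equal a(1 - 2r) and d(1 - 2r)^2, and back-calculates the QTL genotype means that would reproduce the umc130 marker means for a chosen r. The function name and the set of r values are illustrative assumptions; the computed numbers are model output, not values recovered from the slide.

```python
def implied_qtl_means(mean_MM, mean_Mm, mean_mm, r):
    """Back-calculate QTL genotype means from F2 marker-class means.

    Assumes an F2 population and a single QTL at recombination
    frequency r (0 <= r < 0.5) from the marker.
    """
    if not 0 <= r < 0.5:
        raise ValueError("r must satisfy 0 <= r < 0.5")
    shrink = 1 - 2 * r
    # Additive contrast: (mean_MM - mean_mm) / 2 = a * (1 - 2r)
    a = (mean_MM - mean_mm) / (2 * shrink)
    # Dominance contrast: mean_Mm - (mean_MM + mean_mm) / 2 = d * (1 - 2r)^2
    d = (mean_Mm - (mean_MM + mean_mm) / 2) / shrink ** 2
    # Midpoint between the two QTL homozygotes.
    mid = mean_Mm - d * ((1 - r) ** 2 + r ** 2)
    return {"QQ": mid + a, "Qq": mid + d, "qq": mid - a, "a": a, "d": d}


# Marker-class means for umc130 from the slide; r values chosen for illustration.
for r in (0.0, 0.2, 0.4):
    result = implied_qtl_means(201, 196, 191, r)
    print(f"r = {r:.1f}: a = {result['a']:.2f}, "
          f"QQ = {result['QQ']:.2f}, Qq = {result['Qq']:.2f}, qq = {result['qq']:.2f}")
```

The same three marker means are compatible with an additive effect of 5 cm at r = 0 and a much larger effect at looser linkage, which is why the effect cannot be estimated unless r is fixed.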
6 Dissecting A Quantitative Trait: Time Versus Resolution
[Figure: research time in years (1 to 5) versus resolution in bp (1 to 1x10^7). Approaches plotted: F2 QTL mapping, RI QTL mapping, NILs, positional cloning, associations.]
7 Resolution Versus Allelic Range
[Figure: alleles evaluated (1 to >40) versus resolution in bp (1 to 1x10^7). Approaches plotted: associations in diverse germplasm, associations in narrow germplasm, pedigree, F2 or RIL mapping, NIL, positional cloning.]
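8 Evaluate whether nucleotide polymorphisms associate with phenotype
Natural populations
Exploit extensive recombination
Association tests
[Figure: sequence variants (T, A, G, C) shown against plant heights of 1.3 m, 1.5 m, 1.4 m, 1.8 m and 2.0 m.]
Speaker note: spend a lot of time on this slide.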
9 Association mapping
Mainstay of human genetics
One of a few possible approaches
Reproducibility was an issue
Cystic fibrosis: Kerem, et al. (1989). Science 245,
Alzheimer's disease: Corder et al. (1994). Nature Genet. 7,
10 Associations may result from at least three causes
1. The locus is the cause of the phenotype.
2. The locus is in linkage disequilibrium with the cause of the phenotype (linked and highly correlated).
11 Complete Linkage Disequilibrium
D' = 1, r2 = 1
Same mutational history and no recombination. No resolution.
[Figure: haplotype diagram for Locus 1 and Locus 2.]
Adapted from Rafalski (2002) Curr Opin Plant Biol 5:
12 Linkage Disequilibrium
D' = 1, r2 = 0.33
Different mutational history and no recombination. Some resolution.
[Figure: haplotype diagram for Locus 1 and Locus 2.]
13 Linkage Equilibrium
D' = 0, r2 = 0
Same mutational history with recombination. Resolution.
[Figure: haplotype diagram for Locus 1 and Locus 2.]
14 3. Population structure can produce associations
[Figure labels: alleles G and T; populations Andes and U.S.; P << 0.001; P = 0.04.]
These non-functional associations can be accounted for by estimating the population structure using random markers.
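How population structure alone can create an association is easiest to see with a toy data set. The sketch below uses entirely hypothetical counts and trait values: two populations differ both in allele frequency and in mean trait value, while the allele has no effect within either population.

```python
from statistics import mean

# (population, allele, trait value) records; every value here is hypothetical.
samples = (
    [("Andes", "G", 2.0)] * 8 + [("Andes", "T", 2.0)] * 2
    + [("US", "G", 1.0)] * 2 + [("US", "T", 1.0)] * 8
)


def allele_difference(records):
    """Mean trait of G carriers minus mean trait of T carriers."""
    g_values = [trait for _, allele, trait in records if allele == "G"]
    t_values = [trait for _, allele, trait in records if allele == "T"]
    return mean(g_values) - mean(t_values)


# Pooled analysis ignores population membership.
pooled = allele_difference(samples)

# Stratified analysis compares alleles within each population.
within = {
    pop: allele_difference([rec for rec in samples if rec[0] == pop])
    for pop in ("Andes", "US")
}

print(f"pooled difference: {pooled:.2f}")
for pop, diff in within.items():
    print(f"within {pop}: {diff:.2f}")
```

The pooled comparison shows a difference between alleles even though the difference is zero inside each population, which is the kind of non-functional association that a population structure estimate is meant to absorb.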
16 QTL Analysis: Interval Mapping
Simple Interval Mapping
Composite Interval Mapping
This problem is solved by using interval mapping approaches.
[Figure: PlabQTL text plot of the LOD profile along the chromosome, with marker positions (M) on a cM scale starting at 0.0, the peak at 96, and the label (0.47).]
17 QTL Analysis: Power of QTL detection
[Figure: power (10 to 100) versus heritability (from 0.4 upward), with curves for N = 600, N = 300 and N = 100.]
Power: probability of finding a QTL
Simulation model:
Additive genetic model
F2 or F3 lines
Maize genetic model, marker interval 20 cM
16500 F2 F3 individuals that were partitioned into the different populations
Utz, H.F., and A.E. Melchinger. 1994. Comparison of different approaches to interval mapping of quantitative trait loci. pp. In J.W. van Ooijen and J. Jansen (ed.) Biometrics in Plant Breeding: Applications of Molecular Markers. Proceedings of the Ninth Meeting of the EUCARPIA Section Biometrics in Plant Breeding, Wageningen. CPRO-DLO, Wageningen, the Netherlands.
18 QTL Analysis: Conclusions
A trait is usually affected by a number of QTL; in an analysis, the largest ones are the easiest to detect.
However, the large QTL make detection of the others difficult.
Models can adjust for the large QTL so that the others can be detected.
19 QTL Analysis: Conclusions
QTL mapping combines qualitative linkage analysis with quantitative genetic analysis: it tests for association between marker genotypes and phenotypic trait values.
Single marker analysis is easy to perform, but QTL effect and position are confounded. This results in low power of QTL detection.
Interval mapping approaches increase the power of QTL detection and allow the estimation of QTL effects and positions.
20 QTL Analysis: Conclusions
Estimates of QTL effects and of the proportion of the genotypic variance explained by QTL are biased due to genotypic and environmental sampling.
Estimates of QTL position show low precision.
With large populations, a large number of QTL are found for complex traits.
When conducting a QTL study, you may wish to use a large population size.
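25 Properties of LD
The basic measure of LD is $D_{AB} = P_{AB} - p_A p_B$, with $D_{AB} = -D_{Ab} = -D_{aB} = D_{ab}$.
Haplotype frequencies, with marginal allele frequencies:
         B                                  b                                  Total
A        $P_{AB} = p_A p_B + D_{AB}$        $P_{Ab} = p_A p_b - D_{AB}$        $p_A$
a        $P_{aB} = p_a p_B - D_{AB}$        $P_{ab} = p_a p_b + D_{AB}$        $p_a$
Total    $p_B$                              $p_b$                              1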
26 Linkage Disequilibrium versus Generations Since its Creation
[Figure: disequilibrium $r_{AB}$ (0 to 1) versus generation g (0 to 500), with one curve per recombination rate c = 0.1, 0.02, 0.01, 0.005 and 0.001.]
Decay with generations: $r_{AB} \propto (1-c)^g$
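The decay curves can be reproduced with a few lines of arithmetic. This sketch assumes the deterministic model in which disequilibrium is multiplied by (1 - c) every generation, with no drift, selection or migration. The function names are illustrative, and the recombination rates are the ones labelled on the slide.

```python
import math


def ld_after(r0, c, generations):
    """Disequilibrium remaining after a number of generations.

    r0 is the initial disequilibrium and c the recombination rate;
    assumes deterministic decay by a factor (1 - c) per generation.
    """
    return r0 * (1 - c) ** generations


def generations_to_fraction(c, fraction=0.5):
    """Generations needed for LD to fall to `fraction` of its initial value."""
    return math.log(fraction) / math.log(1 - c)


for c in (0.1, 0.02, 0.01, 0.005, 0.001):
    half_life = generations_to_fraction(c)
    remaining = ld_after(1.0, c, 100)
    print(f"c = {c}: half-life = {half_life:.1f} generations, "
          f"LD left after 100 generations = {remaining:.3f}")
```

Tightly linked loci (small c) keep their disequilibrium for hundreds of generations, while loosely linked loci lose it within a few, which is what gives association mapping its resolution.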
27 Other Measures of LD
$D_{AB}$ can be divided by the maximum value it can obtain:
$D'_{AB} = D_{AB} / \max(-p_A p_B, -p_a p_b)$ if $D_{AB} < 0$
$D'_{AB} = D_{AB} / \min(p_A p_b, p_a p_B)$ if $D_{AB} > 0$
The sampling properties of $D'_{AB}$ are not well understood.
$r^2_{AB} = D_{AB}^2 / (p_A p_B p_a p_b)$
$E(r^2) = 1 / (1 + 4Nc)$
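The LD measures can be computed directly from haplotype counts. The sketch below is a minimal illustration: the function names and the haplotype counts are hypothetical, the expected-value function assumes the commonly cited approximation E(r2) = 1 / (1 + 4Nc) with N the effective population size, and monomorphic loci are rejected because the measures are undefined for them.

```python
def ld_measures(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D', r^2) from counts of the four two-locus haplotypes."""
    total = n_AB + n_Ab + n_aB + n_ab
    p_A = (n_AB + n_Ab) / total
    p_B = (n_AB + n_aB) / total
    p_a, p_b = 1 - p_A, 1 - p_B
    if min(p_A, p_a, p_B, p_b) == 0:
        raise ValueError("both loci must be polymorphic")

    d = n_AB / total - p_A * p_B          # D_AB = P_AB - p_A * p_B
    if d < 0:
        d_prime = d / max(-p_A * p_B, -p_a * p_b)
    elif d > 0:
        d_prime = d / min(p_A * p_b, p_a * p_B)
    else:
        d_prime = 0.0
    r2 = d * d / (p_A * p_B * p_a * p_b)
    return d, d_prime, r2


def expected_r2(n_e, c):
    """Approximate expected r^2 for effective size n_e and recombination rate c."""
    return 1 / (1 + 4 * n_e * c)


# Hypothetical haplotype counts (AB, Ab, aB, ab) for three situations.
examples = {
    "only AB and ab present": (6, 0, 0, 6),
    "three haplotypes present": (3, 0, 3, 6),
    "all four equally common": (3, 3, 3, 3),
}
for label, counts in examples.items():
    d, d_prime, r2 = ld_measures(*counts)
    print(f"{label}: D = {d:.3f}, D' = {d_prime:.2f}, r2 = {r2:.2f}")
```

With these counts the second case gives D' = 1 but r2 = 0.33: D' stays at its maximum whenever one of the four haplotypes is absent, whereas r2 reaches 1 only when the two loci carry the same information.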
28 LD generally decays rapidly with distance
[Figure: r2 plotted against distance.]
Remington, D. L., et al PNAS-USA 98: & unpublished
29 Population Effect on Linkage Disequilibrium in Maize
Investigator    Population Studied    Extent of LD
Gaut            Landraces             <1000 bp
Buckler         Diverse Inbreds       2000 bp
Rafalski        Elite Lines           100 kb? (6 kb euchromatin?)
Reviewed in Flint-Garcia, S. A. et al Annual Review of Plant Biology 54:
32 Population Stratification: American Indian and Diabetes Knowler 1988 Am J Hum Genet 43,
33 Use SSR Markers to Estimate Population Structure
Method: Pritchard, J. K., M. Stephens, and P. Donnelly. Inference of population structure using multilocus genotype data. Genetics 155:
Example: Remington, D. L., et al Proc Natl Acad Sci U S A 98:
Groups: 8 Stiff Stalk, 38 Non-Stiff Stalk, 30 Sub-Tropical
34 Logistic Regression Ratio Test For Association
Adapted from the Pritchard case-control approach.
[Equation: the test statistic is not preserved in this transcript.]
Where:
C = candidate polymorphism distribution
T = trait value
Q = matrix of population membership
Evaluated by logistic regression.
Significance evaluated by permutation based on haplotype distribution in populations.
Pritchard, J. K., M. Stephens, N. A. Rosenberg, and P. Donnelly Am J Hum Genet 67:
35 Population Structure Estimates Greatly Reduce Estimated Type I Error Rates
[Figure labels: Fields, Flowering Time, Height.]
36 Su1
Sugary1 is an isoamylase, a starch debranching enzyme.
Sequenced fully from 32 diverse lines.
Sampled 2 small parts of the gene from 102 lines.
Whitt, S. R., et al PNAS-USA 99:
[Figure: gene diagram with a scale label extracted as "11100bp".]
37 su1 Promoter & 1st Exon
Two distinct alleles
Sweet phenotype not associated
38 su1 Coding Region
Two distinct alleles
Sweet phenotype associated with W578R
40 Dwarf8 functional variation
[Figure: Dwarf8 gene diagram marking a 2 amino acid deletion, a MITE, an indel and the SH2 domain; days to silking relative to B73.]
When controlling for population structure, Dwarf8 variation associates with flowering time and plant height across 12 environments.
Thornsberry et al. 2001 Nat. Genet.
42 Statistics - Hypothesis Test
                                  Null hypothesis true    Null hypothesis false
Reject null hypothesis            Type I error (α)        Correct
Fail to reject null hypothesis    Correct                 Type II error (β)
Significance level (P-value threshold) = α
Power = 1 - β
43 Experimentwise P value
Each statistical test has a Type I error rate.
Test 20 independent SNPs and, on average, one will be significant at P < 0.05 by chance alone.
The Bonferroni correction essentially divides the significance threshold by the number of tests.
It is often too conservative (little power), as markers are correlated.
Churchill and Doerge permutation tests help estimate the experimentwise P value.
The procedure permutes the entire genotype relative to the phenotypes.
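The difference between a per-test and an experimentwise threshold can be sketched in code. This is a minimal illustration under stated assumptions, not the published Churchill and Doerge implementation: the test statistic is the absolute Pearson correlation between marker score and phenotype, the marker scores (0, 1, 2) and phenotypes are synthetic, and all function names are hypothetical.

```python
import random
from statistics import mean


def familywise_error(alpha, n_tests):
    """Chance of at least one false positive among independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the experimentwise rate near alpha."""
    return alpha / n_tests


def abs_corr(x, y):
    """Absolute Pearson correlation, used here as the per-marker statistic."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return abs(sxy) / (sxx * syy) ** 0.5


def permutation_threshold(markers, phenotypes, n_perm=200, alpha=0.05, seed=1):
    """Experimentwise threshold from the permutation distribution of the maximum.

    Shuffling the phenotypes keeps each individual's multi-marker genotype
    intact, so correlations among markers are preserved.
    """
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    max_stats = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        max_stats.append(max(abs_corr(m, shuffled) for m in markers))
    max_stats.sort()
    k = min(n_perm - 1, round((1 - alpha) * n_perm))
    return max_stats[k]


# Synthetic data: 60 individuals, 4 markers, marker 0 has a large effect.
n = 60
markers = [[(i // 3 ** j) % 3 for i in range(n)] for j in range(4)]
noise = random.Random(0)
phenotypes = [10 * markers[0][i] + noise.gauss(0, 1) for i in range(n)]

observed = [abs_corr(m, phenotypes) for m in markers]
threshold = permutation_threshold(markers, phenotypes)

print(f"P(at least one false positive in 20 tests) = {familywise_error(0.05, 20):.3f}")
print(f"Bonferroni per-test threshold for 20 tests = {bonferroni_threshold(0.05, 20):.4f}")
print(f"marker 0 exceeds the permutation threshold: {observed[0] > threshold}")
```

Because the threshold is taken from the distribution of the maximum statistic across all markers, it accounts for correlation among markers without the conservatism of dividing by the raw number of tests.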
44 Power of approaches
Sample size: 100 to 1000 are typical
Heritability of trait: H2 = 10% - 90%
  Depends on ability to measure trait
  Interactions with environment
Depends on statistical properties of test
45 Association Approaches Complement QTL Linkage Mapping
                                Linkage (RILs)    Association
Resolution                      10,000,000 bp     2000 bp
Genome scan                     High power        Little power
Allelic range                   Low (1 or 2)      High (10s)
Statistical power per allele    High              Low