

Population structure at QTL
[Figure: two parental chromosomes carrying alleles A B C D E Q F G H and a b c d e q f g h, with the effect labelled d.]
The population content at a quantitative trait locus (backcross, RIL, DH) can be deduced by observing the marker groups. In the figure, the observation is for a marker coinciding with the QTL.

The simplest genetic model: single marker analysis
[Figure: chromosome with markers $m_1, m_2, \dots, m_3, \dots, m_i, \dots, m_j, \dots, m_k$; trait densities $f(x)$ of the qq and QQ genotypes, separated by $d$.]
Dihaploid mapping population, two homozygotes at each locus.
For marker $m_i$ coinciding with Q/q:
$\bar{X}_{mm} = \mu - \tfrac{1}{2} d$, $\bar{X}_{MM} = \mu + \tfrac{1}{2} d$, $\bar{X}_{MM} - \bar{X}_{mm} = d$.
For marker $m_j$ apart from Q/q (recombination rate $r$):
$f_{mm}(x) = (1-r)\, f_{qq}(x) + r\, f_{QQ}(x)$
$f_{MM}(x) = r\, f_{qq}(x) + (1-r)\, f_{QQ}(x)$
$\bar{X}_{mm} = (1-r)(\mu - \tfrac{1}{2} d) + r(\mu + \tfrac{1}{2} d)$
$\bar{X}_{MM} = r(\mu - \tfrac{1}{2} d) + (1-r)(\mu + \tfrac{1}{2} d)$
$\bar{X}_{MM} - \bar{X}_{mm} = (1-2r)\, d$
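The shrinkage of the marker contrast from d to (1-2r)d can be checked numerically. The sketch below is only an illustration of the formulas on this slide; the function name and the values mu = 10, d = 2 are assumptions chosen for the example, not taken from the presentation.

```python
def marker_group_means(mu, d, r):
    """Expected trait means of the mm and MM marker groups in a DH population.

    mu is the midpoint between the qq and QQ means, d is the full QQ - qq
    difference (as on this slide), r is the marker-QTL recombination rate.
    """
    mean_qq = mu - 0.5 * d
    mean_QQ = mu + 0.5 * d
    x_mm = (1.0 - r) * mean_qq + r * mean_QQ
    x_MM = r * mean_qq + (1.0 - r) * mean_QQ
    return x_mm, x_MM


if __name__ == "__main__":
    for r in (0.0, 0.1, 0.25, 0.5):
        x_mm, x_MM = marker_group_means(mu=10.0, d=2.0, r=r)
        # The observed contrast equals (1 - 2r) * d and vanishes at r = 0.5.
        print(f"r={r:.2f}  X_MM - X_mm = {x_MM - x_mm:.3f}")
```

Note that a single marker only reveals the product (1-2r)d, so a small effect at a close QTL and a large effect at a distant QTL cannot be told apart from the group means alone.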

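QTL Interval Mapping – DH ($F_1 = M_1 Q M_2 / m_1 q m_2$ → meiosis → DH)
Expected distributions of the trait in the 4 marker groups look like:
$f_{M_1 M_2} = [(1-r_1)(1-r_2)\, f_{QQ} + r_1 r_2\, f_{qq}] / (1-r)$
$f_{M_1 m_2} = [(1-r_1)\, r_2\, f_{QQ} + r_1 (1-r_2)\, f_{qq}] / r$
$f_{m_1 M_2} = [r_1 (1-r_2)\, f_{QQ} + r_2 (1-r_1)\, f_{qq}] / r$
$f_{m_1 m_2} = [r_1 r_2\, f_{QQ} + (1-r_1)(1-r_2)\, f_{qq}] / (1-r)$
Here $r_1$ and $r_2$ are the recombination rates between $M_1$ and Q and between Q and $M_2$, and $r$ is the recombination rate between the flanking markers; the weights in each group sum to 1 when $r = r_1 + r_2 - 2 r_1 r_2$.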
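The mixture weights in these four densities are the conditional probabilities of the QQ genotype given the flanking-marker group. A minimal sketch follows, assuming no crossover interference, so that r = r1 + r2 - 2 r1 r2 (the relation that makes the weights on the slide sum to one); the function name is hypothetical.

```python
def qq_given_markers(r1, r2):
    """P(QTL genotype is QQ | flanking-marker group) in a DH population.

    Assumes no interference: r = r1 + r2 - 2*r1*r2, with 0 < r < 1.
    The probability of qq in each group is one minus the returned value.
    """
    r = r1 + r2 - 2.0 * r1 * r2
    return {
        "M1M2": (1.0 - r1) * (1.0 - r2) / (1.0 - r),
        "M1m2": (1.0 - r1) * r2 / r,
        "m1M2": r1 * (1.0 - r2) / r,
        "m1m2": r1 * r2 / (1.0 - r),
    }


if __name__ == "__main__":
    for group, p in qq_given_markers(r1=0.1, r2=0.2).items():
        print(f"{group}: P(QQ) = {p:.4f}")
```

The two recombinant groups (M1m2 and m1M2) are the informative ones for locating Q within the interval, because their weights depend on the ratio of r1 to r2.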

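ML-estimation in QTL interval analysis
$L(r, m, d, \sigma) = \prod_{i=1}^{4} \prod_{j=1}^{N_i} f_i(r, m, d, \sigma;\, x_{ij}) = L(\theta \mid \text{data}) \to \max$
where $i = 1, \dots, 4$ indexes the marker groups $M_1 M_2$, $M_1 m_2$, $m_1 M_2$, $m_1 m_2$.
ML-estimates of QTL parameters: $\theta^* = \{r^*, m^*, d^*, \sigma^*\}$.
The model of QTL effect. For additive QTL effect: $x = m + d\, g_q + \varepsilon$, where $g_q = -1$ for qq and $+1$ for QQ; $E\varepsilon = 0$.
The QTL effect is $d = (\mu_{QQ} - \mu_{qq}) / 2$, $\mu_{qq} = m - d$, $\mu_{QQ} = m + d$.
[Figure: trait densities $f(x)$ of the qq and QQ genotypes, with $d$ and $\sigma$ marked.]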
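To make the double product concrete, the sketch below evaluates the log-likelihood for one position inside the interval. It assumes normal trait densities with a common sigma and no interference, neither of which is stated explicitly on the slide; the function names and the toy data are invented for illustration only.

```python
import math


def normal_pdf(x, mean, sigma):
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def qq_weight(group, r1, r2):
    """P(QQ | flanking-marker group), DH population, no interference."""
    r = r1 + r2 - 2.0 * r1 * r2
    weights = {
        "M1M2": (1.0 - r1) * (1.0 - r2) / (1.0 - r),
        "M1m2": (1.0 - r1) * r2 / r,
        "m1M2": r1 * (1.0 - r2) / r,
        "m1m2": r1 * r2 / (1.0 - r),
    }
    return weights[group]


def log_likelihood(data, r1, r2, m, d, sigma):
    """ln L for the additive model x = m + d*g + e, g = +1 (QQ) or -1 (qq)."""
    total = 0.0
    for group, values in data.items():
        w = qq_weight(group, r1, r2)
        for x in values:
            density = (w * normal_pdf(x, m + d, sigma)
                       + (1.0 - w) * normal_pdf(x, m - d, sigma))
            total += math.log(density)
    return total


# Hypothetical trait values grouped by flanking-marker class.
toy_data = {
    "M1M2": [11.8, 12.3, 12.1, 11.5],
    "M1m2": [12.2, 8.1],
    "m1M2": [7.9, 11.9],
    "m1m2": [8.2, 7.7, 8.4, 7.9],
}

if __name__ == "__main__":
    print(log_likelihood(toy_data, 0.1, 0.1, m=10.0, d=2.0, sigma=0.5))
    print(log_likelihood(toy_data, 0.1, 0.1, m=10.0, d=0.0, sigma=0.5))
```

In practice the maximum over (r, m, d, sigma) would be found by a numerical optimizer or an EM scheme; the sketch only evaluates the function being maximized.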

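ML analysis
$L(m, \sigma) = \prod_{j=1}^{N} f(m, \sigma;\, x_j) \to \max$, or equivalently $\ln L(m, \sigma) = \sum_{j=1}^{N} \ln f(m, \sigma;\, x_j) \to \max$.
$m^* = \bar{x} = \sum_j x_j / N$
$\sigma^{*2} = \frac{1}{N} \sum_j (x_j - \bar{x})^2$; replacing $N$ by $N - 1$ corrects the bias.
[Figure: likelihood as a function of the parameters, with an initial point $\theta_0$ and the maximum $\theta^*$.] Numerical maximization has a limited radius of convergence, i.e. we need a good initial point.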
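For a single normal sample the ML estimates have a closed form, which the short sketch below computes next to the bias-corrected variance. The function name and the sample values are illustrative assumptions.

```python
def ml_normal_estimates(sample):
    """Return (mean, ML variance with divisor N, unbiased variance with N - 1)."""
    n = len(sample)
    mean = sum(sample) / n
    ss = sum((x - mean) ** 2 for x in sample)
    return mean, ss / n, ss / (n - 1)


if __name__ == "__main__":
    m_star, var_ml, var_unbiased = ml_normal_estimates([2, 4, 4, 4, 5, 5, 7, 9])
    print(m_star, var_ml, var_unbiased)
```

The two variance estimates differ by the factor N/(N-1), so the correction matters mainly for small marker groups.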

What does one expect from the analytical tools?
To extract maximum mapping information from the experimental data.
The main questions in QTL analysis:
- High QTL detection power (detect a QTL when it exists)
- Minimum "false positives" (high significance)
- Mapping resolution (e.g., two vs. one QTL in a region)
- Accuracy of parameter estimates (e.g., $d \pm m_d$)
- Discrimination between alternative models of the trait "genetic architecture" (e.g., additive vs. heterotic)

Lod Scores for marker mapping
LOD = Logarithm of the Odds:
$Z = \mathrm{LOD}(\theta) = \log_{10} \frac{L(\theta < 0.5)}{L(\theta = 0.5)} = \log_{10} \frac{L(\theta < 0.5 \mid \text{data})}{L(\theta = 0.5 \mid \text{data})}$
$2 \ln [L(\theta) / L(0.5)] = 4.6\, Z \sim \chi^2_1$
Here the LOD was used to compare 2 alternative hypotheses about linkage: $H_1$ ($\theta < 0.5$) versus $H_0$ ($\theta = 0.5$).
Now we use the LOD score to compare another pair of alternatives: whether or not the interval of interest carries a QTL affecting our target trait, i.e. $H_1$ ($d \neq 0$) versus $H_0$ ($d = 0$), where $d$ (the effect) is one of the parameters of our vector $\theta = (r, m, d, \sigma)$. In other words, we are about to check whether there is a connection between the trait values and the marked interval (or chromosome), i.e., whether the effect of the chromosome is significant.
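The factor 4.6 is 2 ln 10, which converts a LOD score into a likelihood-ratio statistic. The sketch below applies that conversion and the chi-square approximation with 1 degree of freedom quoted on the slide; it is a pointwise, asymptotic p-value only, and the function names are assumptions made for the example.

```python
import math


def lod_to_lrt(lod):
    """Likelihood-ratio statistic 2*ln(L1/L0) from a base-10 LOD score."""
    return 2.0 * math.log(10.0) * lod


def chi2_1_tail(x):
    """P(chi-square with 1 d.f. > x), via the normal tail: erfc(sqrt(x/2))."""
    return math.erfc(math.sqrt(x / 2.0))


def lod_to_pvalue(lod):
    return chi2_1_tail(lod_to_lrt(lod))


if __name__ == "__main__":
    for lod in (1.0, 2.0, 3.0):
        print(f"LOD={lod:.1f}  LRT={lod_to_lrt(lod):.2f}  p={lod_to_pvalue(lod):.2e}")
```

Because many intervals are scanned along a chromosome, such a pointwise p-value understates the false-positive risk, which is one motivation for the permutation approach described on the following slides.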

Lod Score and Testing Significance
We are about to check whether there is a connection between the trait values and the marked interval (or chromosome).
[Figure: chromosome with markers $m_1, m_2, \dots, m_3, \dots, m_i, \dots, m_j, \dots, m_k$.]
Let us take the data and calculate the LOD for each interval of the chromosome. If there is such a connection (i.e., $H_1$ is true), we expect larger LOD values than in situations of no connection (i.e., when $H_0$ is true).
How do we decide whether the obtained LOD level is indicative of $H_1$ or of $H_0$? The answer can be reached by artificially building the situation of $H_0$, i.e. one with no connection between the trait values and the markers. How? By a permutation test: reshuffling the trait values relative to the markers.

Reshuffling trait values relative to markers
[Figure: table of marker genotypes (rows M …) and trait values (rows tr …) per individual, shown with the chromosome A B C D E Q F G H / a b c d e q f g h and the effect d.]

Testing Significance
The algorithm may look like this:
1. Calculate the maximum LOD value (LOD = LOD*) for the chromosome.
2. Reshuffle the trait values relative to the markers, i.e. build a sample that fits $H_0$.
3. For each reshuffled sample $i$, calculate LOD = LOD$_i$.
4. Repeat the last two steps N times (N runs).
If $H_1$ is correct, then for the predominant majority of reshuffled samples LOD$_i$ << LOD*. The proportion of runs, $\alpha$, with LOD$_i \ge$ LOD* is called the significance. It estimates the probability of declaring a QTL that does not exist (indeed, reshuffling destroys any connection between the chromosome and the trait, so cases with LOD$_i \ge$ LOD* are "false positives"). The higher the LOD*, the lower the chance that LOD$_i \ge$ LOD*.
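The four steps above can be sketched as follows. To stay short, the sketch replaces interval analysis with a single-marker LOD under a normal model with common variance, LOD = (N/2) log10(RSS0/RSS1), and takes the maximum over markers; that substitution, the function names and the toy data are all assumptions made for illustration.

```python
import math
import random


def single_marker_lod(genotypes, traits):
    """LOD for H1 (group means differ) vs H0 (one mean), normal model."""
    groups = {0: [], 1: []}
    for g, x in zip(genotypes, traits):
        groups[g].append(x)
    if not groups[0] or not groups[1]:
        return 0.0
    grand_mean = sum(traits) / len(traits)
    rss0 = sum((x - grand_mean) ** 2 for x in traits)
    if rss0 <= 0.0:
        return 0.0
    rss1 = 0.0
    for values in groups.values():
        group_mean = sum(values) / len(values)
        rss1 += sum((x - group_mean) ** 2 for x in values)
    if rss1 <= 0.0:
        return float("inf")
    return 0.5 * len(traits) * math.log10(rss0 / rss1)


def max_lod(marker_matrix, traits):
    """Maximum LOD over all markers of the chromosome."""
    return max(single_marker_lod(g, traits) for g in marker_matrix)


def permutation_significance(marker_matrix, traits, n_runs=200, seed=0):
    """Return (LOD*, proportion of reshuffled runs with LOD_i >= LOD*)."""
    rng = random.Random(seed)
    lod_star = max_lod(marker_matrix, traits)
    shuffled = list(traits)
    count = 0
    for _ in range(n_runs):
        rng.shuffle(shuffled)  # breaks any marker-trait connection (H0)
        if max_lod(marker_matrix, shuffled) >= lod_star:
            count += 1
    return lod_star, count / n_runs


# Hypothetical data: marker 0 is linked to the trait, marker 1 is not.
toy_markers = [[0] * 10 + [1] * 10, [0, 1] * 10]
toy_traits = ([8.0 + 0.1 * i for i in range(10)]
              + [12.0 + 0.1 * i for i in range(10)])

if __name__ == "__main__":
    lod_star, alpha = permutation_significance(toy_markers, toy_traits)
    print(f"LOD* = {lod_star:.2f}, alpha = {alpha:.3f}")
```

With N runs the smallest nonzero proportion that can be observed is 1/N, so the number of runs limits how small a significance level can be resolved.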

Calculating the QTL detection power
For the declared QTL, the significance $\alpha$ is a measure of the "false positive" risk (i.e. declaring an effect that does not exist). We would also like to know the risk of a "false negative", i.e. the probability $\beta$ of not detecting an effect that does exist. Of course, $\beta$ will depend on the chosen level of $\alpha$. The score $1 - \beta$ (the probability of detecting a QTL that does exist) is called the "detection power".
How to calculate it? By bootstrap analysis. It can be conducted only after calculating the threshold values of LOD under $H_0$.

Calculating the threshold values of LOD under $H_0$
The proportion of randomized runs, $\alpha$, with LOD$_i \ge$ LOD* is called the significance. It estimates the probability of declaring a QTL that does not exist (cases with LOD$_i \ge$ LOD* are "false positives"). We also need to know the LOD value that will be exceeded in just 5% (or 1%) of the runs.
[Figure: distribution of LOD values under $H_0$ (in runs), with the 95% and 99% thresholds marked.]
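Given the LOD values collected from the reshuffled runs, the 95% and 99% thresholds are simply empirical quantiles. A minimal sketch follows; the function name and the artificial list of null LOD values are assumptions, and other quantile conventions would give slightly different thresholds.

```python
def lod_threshold(null_lods, percent=95):
    """LOD value that at most (100 - percent)% of the H0 runs exceed."""
    ordered = sorted(null_lods)
    n = len(ordered)
    rank = -(-percent * n // 100)  # integer ceil(percent * n / 100)
    return ordered[max(rank, 1) - 1]


if __name__ == "__main__":
    # Stand-in for the LOD_i values produced by 100 reshuffled runs.
    null_lods = [i / 10 for i in range(1, 101)]
    print(lod_threshold(null_lods, 95), lod_threshold(null_lods, 99))
```

An observed LOD* above the 95% threshold then corresponds to a significance of at most 0.05 under this empirical null distribution.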

Calculating the QTL detection power
The score $1 - \beta$ (the probability of detecting a QTL that does exist) is called the "detection power". How to calculate it? By bootstrap analysis, after getting the thresholds of LOD under $H_0$.
The algorithm looks like this:
1. Build a bootstrap sample by resampling the individuals with replacement.
2. Conduct interval analysis on the sample.
3. Repeat this procedure for many (e.g., N = 1000) such samples.
4. Check the proportion of samples where the maximum LOD exceeds the threshold (= QTL detection power = $1 - \beta$).
5. Calculate the confidence intervals of the parameters.
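The bootstrap steps above can be sketched as follows. As a simplification, the sketch analyses a single marker with a normal-model LOD instead of running interval analysis, and reports a percentile interval for the difference of marker-group means; the function names, the threshold value 3.0 and the toy data are assumptions made for illustration.

```python
import math
import random


def single_marker_lod(genotypes, traits):
    """LOD = (N/2) * log10(RSS0 / RSS1) for a two-group normal model."""
    groups = {0: [], 1: []}
    for g, x in zip(genotypes, traits):
        groups[g].append(x)
    if not groups[0] or not groups[1]:
        return 0.0
    grand_mean = sum(traits) / len(traits)
    rss0 = sum((x - grand_mean) ** 2 for x in traits)
    if rss0 <= 0.0:
        return 0.0
    rss1 = 0.0
    for values in groups.values():
        group_mean = sum(values) / len(values)
        rss1 += sum((x - group_mean) ** 2 for x in values)
    if rss1 <= 0.0:
        return float("inf")
    return 0.5 * len(traits) * math.log10(rss0 / rss1)


def bootstrap_power(genotypes, traits, lod_threshold, n_runs=500, seed=0):
    """Return (power, (low, high)): share of bootstrap samples whose LOD
    exceeds the threshold, and a 95% percentile interval for the
    MM - mm difference of group means."""
    rng = random.Random(seed)
    n = len(traits)
    hits = 0
    effects = []
    for _ in range(n_runs):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        g = [genotypes[i] for i in idx]
        x = [traits[i] for i in idx]
        if single_marker_lod(g, x) > lod_threshold:
            hits += 1
        mm = [v for gi, v in zip(g, x) if gi == 0]
        MM = [v for gi, v in zip(g, x) if gi == 1]
        if mm and MM:
            effects.append(sum(MM) / len(MM) - sum(mm) / len(mm))
    effects.sort()
    low = effects[int(0.025 * len(effects))]
    high = effects[int(0.975 * len(effects)) - 1]
    return hits / n_runs, (low, high)


# Hypothetical data with a clear marker effect.
toy_genotypes = [0] * 10 + [1] * 10
toy_traits = ([8.0 + 0.1 * i for i in range(10)]
              + [12.0 + 0.1 * i for i in range(10)])

if __name__ == "__main__":
    power, ci = bootstrap_power(toy_genotypes, toy_traits, lod_threshold=3.0)
    print(f"power = {power:.3f}, 95% interval for the effect = {ci}")
```

In a real analysis the threshold passed in would be the 95% or 99% LOD value obtained from the permutation runs, rather than a fixed number.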