PBG 650 Advanced Plant Breeding Module 9: Best Linear Unbiased Prediction – Purelines – Single-crosses.



Best Linear Unbiased Prediction (BLUP)
- Allows comparison of material from different populations evaluated in different environments
- Makes use of all performance data available for each genotype, and accounts for the fact that some genotypes have been more extensively tested than others
- Makes use of information about relatives in pedigree breeding systems
- Provides estimates of genetic variances from existing data in a breeding program without the use of mating designs
Bernardo, Chapt. 11

BLUP History
- Initially developed by C.R. Henderson in the 1940s
- Most extensively used in animal breeding
- Used in crop improvement since the 1990s, particularly in forestry
- BLUP is a general term that refers to two procedures
  - true BLUP: the 'P' refers to prediction in random effects models (where there is a covariance structure)
  - BLUE: the 'E' refers to estimation in fixed effects models (no covariance structure)

B-L-U
- "Best" means having minimum variance
- "Linear" means that the predictions or estimates are linear functions of the observations
- "Unbiased" means
  - the expected value of estimates equals their true value
  - predictions have an expected value of zero (because genetic effects have a mean of zero)

Regression in matrix notation
Linear model: $Y = X\beta + \varepsilon$
Parameter estimates: $b = (X'X)^{-1}X'Y$

Source      df     SS             MS
Regression  p      b'X'Y          MS_R
Residual    n-p    Y'Y - b'X'Y    MS_E
Total       n      Y'Y
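The least-squares estimate and the sums of squares in the table can be sketched in a few lines of NumPy. The function name `ols_anova` and the four data points are hypothetical and chosen only for illustration. The sums of squares are uncorrected for the mean, matching a total of Y'Y on n degrees of freedom.

```python
import numpy as np

def ols_anova(X, y):
    """Least-squares fit b = (X'X)^-1 X'y plus the uncorrected ANOVA partition."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    # Solve the normal equations rather than forming the inverse explicitly.
    b = np.linalg.solve(X.T @ X, X.T @ y)
    ss_total = y @ y                 # Y'Y
    ss_reg = b @ (X.T @ y)           # b'X'Y
    ss_res = ss_total - ss_reg       # Y'Y - b'X'Y
    return {
        "b": b,
        "df": {"regression": p, "residual": n - p, "total": n},
        "ss": {"regression": ss_reg, "residual": ss_res, "total": ss_total},
        "ms": {"regression": ss_reg / p, "residual": ss_res / (n - p)},
    }

# Hypothetical data lying exactly on the line y = 1 + 2x.
X = np.array([[1, 0], [1, 1], [1, 2], [1, 3]])
y = np.array([1.0, 3.0, 5.0, 7.0])
fit = ols_anova(X, y)
print(fit["b"], fit["ss"])
```

Because the example data are exactly linear, the residual sum of squares is zero up to rounding.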

BLUP Mixed Model in Matrix Notation
$Y = X\beta + Zu + e$
- $X$, $Z$: design matrices
- $\beta$: fixed effects
- $u$: random effects
Classification for the purposes of BLUP:
- Fixed effects are constants
  - overall mean
  - environmental effects (mean across trials)
- Random effects have a covariance structure
  - breeding values
  - dominance deviations
  - testcross effects
  - general and specific combining ability effects

BLUP for purelines: barley example
Bernardo, pg 269
Parameters to be estimated:
- means for two sets of environments (fixed effects)
  - we are interested in knowing the effects of these particular sets of environments
- breeding values of four cultivars (random effects)
  - from the same breeding population
  - there is a covariance structure (cultivars are related)

Linear model for barley example
$Y_{ij} = \mu + t_i + u_j + e_{ij}$
- $t_i$ = effect of the $i$th set of environments
- $u_j$ = effect of the $j$th cultivar
In matrix notation: $Y = X\beta + Zu + e$

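Weighted regression
$Y = X\beta + \varepsilon$
- Where $\varepsilon_{ij} \sim N(0, \sigma^2)$: $b = (X'X)^{-1}X'Y$
- When $\varepsilon_{ij} \sim N(0, R\sigma^2)$: $b = (X'R^{-1}X)^{-1}X'R^{-1}Y$
For the barley example (the R matrix shown on the slide is not preserved in this transcript)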
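A minimal sketch of the weighted estimator follows. It assumes, purely for illustration, that each record is a mean over a different number of trials, so that the diagonal of R is 1/(number of trials). The function name `weighted_ls` and the two data points are hypothetical.

```python
import numpy as np

def weighted_ls(X, y, R):
    """Weighted least squares: b = (X'R^-1 X)^-1 X'R^-1 y."""
    Rinv = np.linalg.inv(R)
    return np.linalg.solve(X.T @ Rinv @ X, X.T @ Rinv @ y)

# Hypothetical records: the first is a mean over 1 trial, the second over 3 trials.
X = np.array([[1.0], [1.0]])     # a single fixed effect (the overall mean)
y = np.array([4.0, 6.0])
R = np.diag([1 / 1, 1 / 3])      # assumed: variance of a mean is proportional to 1/n

b = weighted_ls(X, y, R)
print(b)   # the better-tested record gets three times the weight
```

With these weights the estimate is (1*4 + 3*6)/4 = 5.5, compared with 5.0 for the unweighted mean.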

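Covariance structure of random effects
Coefficients of coancestry ($\theta_{XY}$) among the four cultivars:

          Morex   Robust   Excel    Stander
Morex     1       1/2      7/16     11/32
Robust            1        27/32    43/64
Excel                      1        91/128
Stander                             1

Remember: $r = 2\theta_{XY}$

Additive relationship matrix, with elements $2\theta_{XY}$:

A =
  2       1       7/8     11/16
  1       2       27/16   43/32
  7/8     27/16   2       91/64
  11/16   43/32   91/64   2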
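The step from coancestries to the relationship matrix is a doubling of every element, which can be checked with exact fractions. The variable names below are illustrative; the coancestry values are the ones listed on the slide.

```python
from fractions import Fraction as F
import numpy as np

cultivars = ["Morex", "Robust", "Excel", "Stander"]

# Upper-triangle coancestries from the slide, keyed by (row, column) index.
upper = {
    (0, 1): F(1, 2), (0, 2): F(7, 16), (0, 3): F(11, 32),
    (1, 2): F(27, 32), (1, 3): F(43, 64),
    (2, 3): F(91, 128),
}

# Fill the symmetric coancestry matrix; each inbred has coancestry 1 with itself.
theta = [[F(1) if i == j else upper[(min(i, j), max(i, j))] for j in range(4)]
         for i in range(4)]

# Relationship matrix: r = 2 * theta, element by element.
A_exact = [[2 * t for t in row] for row in theta]
A = np.array([[float(a) for a in row] for row in A_exact])

# Cholesky succeeds only for a positive definite matrix, so A can be inverted later.
np.linalg.cholesky(A)
for name, row in zip(cultivars, A_exact):
    print(name, [str(a) for a in row])
```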

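Mixed Model Equations
\[
\begin{bmatrix}
X'R^{-1}X & X'R^{-1}Z \\
Z'R^{-1}X & Z'R^{-1}Z + A^{-1}(\sigma_\varepsilon^2/\sigma_A^2)
\end{bmatrix}
\begin{bmatrix} \hat{\beta} \\ \hat{u} \end{bmatrix}
=
\begin{bmatrix} X'R^{-1}Y \\ Z'R^{-1}Y \end{bmatrix}
\]
- residual covariance matrix: $R\sigma_\varepsilon^2$
- each matrix is composed of submatrices
- the algebra is the same
Calculations in Excel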
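The slides carry out these calculations in Excel; the same block structure can be sketched in NumPy. In the sketch below the relationship matrix A is taken from the covariance-structure slide, while the yield records, the diagonal of R, and the variance ratio `lam` are hypothetical placeholders, not the values from Bernardo's example. The function name `solve_mme` is also illustrative.

```python
import numpy as np

def solve_mme(X, Z, y, A, R, lam):
    """Build and solve the mixed model equations.

    lam is the ratio sigma_e^2 / sigma_A^2 that multiplies A^-1.
    Returns (fixed-effect estimates, random-effect predictions).
    """
    Rinv = np.linalg.inv(R)
    Ainv = np.linalg.inv(A)
    C = np.block([
        [X.T @ Rinv @ X, X.T @ Rinv @ Z],
        [Z.T @ Rinv @ X, Z.T @ Rinv @ Z + Ainv * lam],
    ])
    rhs = np.concatenate([X.T @ Rinv @ y, Z.T @ Rinv @ y])
    sol = np.linalg.solve(C, rhs)
    p = X.shape[1]
    return sol[:p], sol[p:]

# Relationship matrix for Morex, Robust, Excel, Stander (from the slide).
A = np.array([
    [2.0,     1.0,     7 / 8,   11 / 16],
    [1.0,     2.0,     27 / 16, 43 / 32],
    [7 / 8,   27 / 16, 2.0,     91 / 64],
    [11 / 16, 43 / 32, 91 / 64, 2.0],
])

# HYPOTHETICAL records: (set of environments, cultivar index, mean yield).
records = [(0, 0, 4.5), (0, 1, 4.6), (0, 3, 5.3),
           (1, 1, 4.4), (1, 2, 5.0), (1, 3, 5.8)]
n = len(records)
X = np.zeros((n, 2))   # one column per set of environments (fixed)
Z = np.zeros((n, 4))   # one column per cultivar (random)
y = np.zeros(n)
for row, (env_set, cultivar, value) in enumerate(records):
    X[row, env_set] = 1.0
    Z[row, cultivar] = 1.0
    y[row] = value

R = np.diag([1 / 18] * 3 + [1 / 9] * 3)  # HYPOTHETICAL: 1 / number of trials
lam = 2.0                                # HYPOTHETICAL variance ratio

beta_hat, u_hat = solve_mme(X, Z, y, A, R, lam)
print(beta_hat, u_hat)
```

The solutions agree with the generalized least squares form based on V = ZAZ'σ_A² + Rσ_ε², which is a convenient check on any spreadsheet or code implementation.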

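Results from BLUP
Original data (not preserved in this transcript)
BLUP estimates:

Effect          Label      Estimate
$\hat{\beta}_1$   Set 1      (not preserved)
$\hat{\beta}_2$   Set 2      (not preserved)
$\hat{u}_1$       Morex      -0.33
$\hat{u}_2$       Robust     -0.17
$\hat{u}_3$       Excel      0.18
$\hat{u}_4$       Stander    0.36

For fixed effects: $b_1 = \mu + t_1$, $b_2 = \mu + t_2$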

Interpretation from BLUP
BLUP estimates: Morex -0.33, Robust -0.17, Excel 0.18, Stander 0.36
For a set of recombinant inbred lines from an F2 cross of Excel x Stander:
Predicted mean breeding value = ½(0.18 + 0.36) = 0.27

Shrinkage estimators
- In the simplest case (all data balanced, the only fixed effect is the overall mean, inbreds unrelated):
  - If $h^2$ is high, BLUP values are close to the phenotypic values
  - If $h^2$ is low, BLUP values shrink towards the overall mean
- For unrelated inbreds or families, ranking of genotypes is the same whether one uses BLUP or phenotypic values
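The shrinkage behaviour in the simplest case can be illustrated numerically. The sketch assumes one record per inbred, A = I, R = I, and the overall mean as the only fixed effect; under those assumptions the mixed model equations reduce to multiplying each deviation from the mean by h² = σ_A²/(σ_A² + σ_ε²). The function names and phenotypic values are hypothetical.

```python
import numpy as np

def blup_simple(y, h2):
    """Simplest-case BLUP: shrink each deviation from the overall mean by h2."""
    y = np.asarray(y, dtype=float)
    return h2 * (y - y.mean())

def blup_from_mme(y, lam):
    """Same case solved through the mixed model equations (X = 1, Z = I, A = I, R = I)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    ones = np.ones((n, 1))
    C = np.block([
        [np.array([[float(n)]]), ones.T],
        [ones, np.eye(n) * (1.0 + lam)],
    ])
    rhs = np.concatenate([[y.sum()], y])
    sol = np.linalg.solve(C, rhs)
    return sol[1:]

# Hypothetical phenotypic values for five unrelated inbreds.
y = np.array([4.2, 5.1, 4.8, 5.6, 4.9])

for h2 in (0.9, 0.2):
    lam = (1.0 - h2) / h2          # sigma_e^2 / sigma_A^2 implied by h2
    print(h2, blup_simple(y, h2), blup_from_mme(y, lam))
```

With h² = 0.9 the predictions stay close to the phenotypic deviations; with h² = 0.2 they are pulled strongly towards zero, and the ranking of the inbreds is unchanged in both cases.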

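Sampling error of BLUP
Diagonal elements of the inverse of the coefficient matrix can be used to estimate the sampling error of fixed and random effects.
Coefficient matrix (from the mixed model equations):
\[
\begin{bmatrix}
X'R^{-1}X & X'R^{-1}Z \\
Z'R^{-1}X & Z'R^{-1}Z + A^{-1}(\sigma_\varepsilon^2/\sigma_A^2)
\end{bmatrix}
\]
Invert the matrix:
\[
\begin{bmatrix}
X'R^{-1}X & X'R^{-1}Z \\
Z'R^{-1}X & Z'R^{-1}Z + A^{-1}(\sigma_\varepsilon^2/\sigma_A^2)
\end{bmatrix}^{-1}
=
\begin{bmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{bmatrix}
\]
Each element of the matrix is a matrix.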

Sampling error of BLUP
- fixed effects (formula not preserved in this transcript)
- random effects (formula not preserved in this transcript)
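The formulas on this slide are not preserved, so the sketch below uses the standard mixed model results as an assumption: the variance of the fixed-effect estimates is C11·σ_ε², and the prediction error variance of the random effects is C22·σ_ε², where C11 and C22 are the diagonal blocks of the inverted coefficient matrix. The function name, data layout, and variance values are hypothetical.

```python
import numpy as np

def mme_sampling_variances(X, Z, A, R, s2e, s2a):
    """Return (Var of fixed-effect estimates, prediction error variance of u).

    Assumes Var(e) = R * s2e and Var(u) = A * s2a, as in the mixed model equations.
    """
    Rinv = np.linalg.inv(R)
    Ainv = np.linalg.inv(A)
    C = np.block([
        [X.T @ Rinv @ X, X.T @ Rinv @ Z],
        [Z.T @ Rinv @ X, Z.T @ Rinv @ Z + Ainv * (s2e / s2a)],
    ])
    Cinv = np.linalg.inv(C)
    p = X.shape[1]
    var_beta = Cinv[:p, :p] * s2e     # C11 * sigma_e^2
    pev = Cinv[p:, p:] * s2e          # C22 * sigma_e^2
    return var_beta, pev

# HYPOTHETICAL layout: two sets of environments, three unrelated lines.
X = np.array([[1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)
Z = np.array([[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=float)
A = np.eye(3)
R = np.eye(5)
s2e, s2a = 1.5, 0.5                   # HYPOTHETICAL variance components

var_beta, pev = mme_sampling_variances(X, Z, A, R, s2e, s2a)
se_fixed = np.sqrt(np.diag(var_beta))
se_random = np.sqrt(np.diag(pev))
print(se_fixed, se_random)
```

The line observed only once has the largest prediction error, which reflects the point made earlier that BLUP accounts for unequal amounts of testing.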

Estimation of Variance Components
(would really need a larger data set)
1. Use your best guess for an initial value of $\sigma_\varepsilon^2/\sigma_A^2$
2. Solve for $\hat{\beta}$ and $\hat{u}$
3. Use current solutions to solve for $\sigma_\varepsilon^2$ and then for $\sigma_A^2$
4. Calculate a new $\sigma_\varepsilon^2/\sigma_A^2$
5. Repeat the process until estimates converge
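The slide does not say which update formulas are used in step 3. One common choice, assumed here, is an EM-type REML iteration with R = I: the residual variance is updated from y'(y - Xβ̂ - Zû)/(n - rank X), and the genetic variance from (û'A⁻¹û + tr(A⁻¹C22)σ_ε²)/q. The sketch runs on simulated data, since a realistic estimate needs more records than the barley example provides; all names and settings are hypothetical.

```python
import numpy as np

def em_reml(X, Z, y, A, s2e=1.0, s2a=1.0, tol=1e-8, max_iter=500):
    """Iterate between solving the mixed model equations and updating variances.

    Assumes R = I. Returns (s2e, s2a, beta_hat, u_hat).
    """
    n, p = X.shape
    q = Z.shape[1]
    Ainv = np.linalg.inv(A)
    rank_x = np.linalg.matrix_rank(X)
    rhs = np.concatenate([X.T @ y, Z.T @ y])
    for _ in range(max_iter):
        lam = s2e / s2a                               # steps 1 and 4: current ratio
        C = np.block([[X.T @ X, X.T @ Z],
                      [Z.T @ X, Z.T @ Z + Ainv * lam]])
        Cinv = np.linalg.inv(C)
        sol = Cinv @ rhs                              # step 2: beta_hat and u_hat
        u = sol[p:]
        C22 = Cinv[p:, p:]
        # Step 3: residual variance first, then genetic variance.
        new_s2e = (y @ y - sol @ rhs) / (n - rank_x)
        new_s2a = (u @ Ainv @ u + np.trace(Ainv @ C22) * s2e) / q
        converged = abs(new_s2e - s2e) + abs(new_s2a - s2a) < tol
        s2e, s2a = new_s2e, new_s2a
        if converged:                                 # step 5
            break
    return s2e, s2a, sol[:p], sol[p:]

# SIMULATED data: 30 unrelated lines, 4 records each, true variances both 1.
rng = np.random.default_rng(1)
q, reps = 30, 4
Z = np.kron(np.eye(q), np.ones((reps, 1)))
X = np.ones((q * reps, 1))
y = 10.0 + Z @ rng.normal(0.0, 1.0, q) + rng.normal(0.0, 1.0, q * reps)

s2e_hat, s2a_hat, beta_hat, u_hat = em_reml(X, Z, y, np.eye(q))
print(s2e_hat, s2a_hat)
```

EM-type updates converge slowly but keep both variance estimates positive, which makes them a reasonable choice for a small illustration.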

BLUP for single-crosses
Performance of a single cross:
$G_{B73,Mo17} = GCA_{B73} + GCA_{Mo17} + SCA_{B73,Mo17}$
BLUP model:
$Y = X\beta + Ug_1 + Wg_2 + Ss + e$
- Sets of environments are fixed effects
- GCA and SCA are considered to be random effects
Example in Bernardo, pg 277, from Hallauer et al., 1996

Performance of maize single crosses Iowa Stiff Stalk x Lancaster Sure Crop

Covariance of single crosses
SC-X is j x k; SC-Y is j' x k'
- j, j': B73, B84, H123
- k, k': Mo17, N197
assuming no epistasis

Covariance of single crosses
SC-X is j x k; SC-Y is j' x k'
- SC-1 = B73 x Mo17
- SC-2 = H123 x Mo17
- SC-3 = B84 x N197
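The covariance expressions on these slides are not preserved. A commonly used form for inbred parents without epistasis, assumed here, is Cov(SC-X, SC-Y) = θ_jj'·σ²_GCA(1) + θ_kk'·σ²_GCA(2) + θ_jj'·θ_kk'·σ²_SCA, where θ is the coefficient of coancestry between the two parents from the same heterotic group. The coancestry values and variance components in the sketch are hypothetical.

```python
def cov_single_crosses(theta_jj, theta_kk, var_gca1, var_gca2, var_sca):
    """Covariance between single crosses j x k and j' x k' (no epistasis assumed)."""
    return theta_jj * var_gca1 + theta_kk * var_gca2 + theta_jj * theta_kk * var_sca

# HYPOTHETICAL coancestries within each heterotic group.
theta_stiff_stalk = {("B73", "H123"): 0.25, ("B73", "B84"): 0.50}
theta_lancaster = {("Mo17", "N197"): 0.10}

# HYPOTHETICAL variance components.
var_gca1, var_gca2, var_sca = 4.0, 3.0, 1.0

# SC-1 (B73 x Mo17) with SC-2 (H123 x Mo17): the Lancaster parent is shared,
# so its coancestry with itself is 1 because it is an inbred.
cov_12 = cov_single_crosses(theta_stiff_stalk[("B73", "H123")], 1.0,
                            var_gca1, var_gca2, var_sca)

# SC-1 (B73 x Mo17) with SC-3 (B84 x N197): no parent is shared.
cov_13 = cov_single_crosses(theta_stiff_stalk[("B73", "B84")],
                            theta_lancaster[("Mo17", "N197")],
                            var_gca1, var_gca2, var_sca)
print(cov_12, cov_13)
```

Under this form, two crosses with unrelated parents on both sides have zero covariance, and a cross with itself has covariance equal to the sum of the three variance components.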

Solutions
(slide content not preserved in this transcript)