Published by Sudomo Kurniawan. Modified over 6 years ago.
1
Accounting for heterogeneous variances (heteroskedasticity) in genetic evaluations
National Animal Breeding Seminar Series, Fall Semester 2004
Robert J. Tempelman, Michigan State University
Slides available at
2
A typical genetic evaluation model for postweaning gain (PWG)
$y = X_1 b_1 + X_2 b_2 + Z_1 u_1 + Z_2 u_2 + e$, or more compactly $y = Xb + Zu + e$
Fixed effects:
Non-genetic effects: $b_1$ (age of dam, length of postweaning period, calf sex)
Genetic effects: $b_2$ (breed, dominance, and recombination loss effects)
Random effects:
Random contemporary group effects: $u_1$; Var($u_1$) -> autoregressive ys within herds or NIID
Random additive genetic effects: $u_2$; Var($u_2$) -> function of one or more (multibreed) components
9/20/2018
3
Homoskedastic error models
$e \sim N(0, I\sigma_e^2)$
A common $\sigma_e^2$ across environments, factors, etc. may not be a suitable assumption.
4
Example of Heterogeneous Variances
Garrick et al. (1989): separate genetic ($\sigma_g^2$) and residual ($\sigma_e^2$) variances estimated by %Simmental and sex for postweaning gain.
[Figure: genetic ($\sigma_g^2$) and residual ($\sigma_e^2$) variance estimates]
5
Structural (mixed effects) modeling of variances (Foulley et al
Model residual and genetic variances as a function of fixed and random effects.
Example: consider the residual variance unique to fixed calf sex j and random CG k.
Log-linear “mixed effects” model on the log variance.
Antilog both sides to obtain the multiplicative model.
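The equations for this slide are not in the transcript. A sketch consistent with the description is given here; every symbol is an assumption chosen for illustration ($\mu$ for the baseline log variance, $\tau^*_j$ for the fixed sex effect and $v^*_k$ for the random CG effect on the log scale), not notation taken from the slide.

```latex
\begin{align*}
\log \sigma^2_{e_{jk}} &= \mu + \tau^*_j + v^*_k
  && \text{(log-linear mixed effects model)} \\
\sigma^2_{e_{jk}} &= \exp(\mu)\,\exp(\tau^*_j)\,\exp(v^*_k)
  = \sigma^2_e \,\tau_j\, v_k
  && \text{(multiplicative model)}
\end{align*}
```

Here $\sigma^2_e = \exp(\mu)$ plays the role of a baseline variance, and $\tau_j = \exp(\tau^*_j)$ and $v_k = \exp(v^*_k)$ act as scaling factors relative to that baseline.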
6
First known application of structural variance model to beef cattle data
San Cristobal et al. (1993) analyzed muscular development scores in French Maine Anjou cattle.
Scored on a 0 to 100 scale.
Considered a structural variance model on both residual AND genetic variances.
Effects considered for the residual variance: classifier (random), condition score (fixed), year (random), month (random).
Effect considered for the genetic variance: sex.
7
Representative results from San Cristobal (multiplicative scale)
Factor | Level | Estimate
Baseline | | 97.57
Classifier | 1 | 1.17
Classifier | 2 | 1.07
Classifier | 3 | 1.06
Condition score | 2 | 0.74
Condition score | | 0.65
Year | 1 | 0.98
Year | | 1.14
Month | | 0.94
Month | 2 | 1.02
Month | | 1.00
For example, an animal evaluated by Classifier 2 with condition score 2, born in year 1 and month 2, has a residual variance of:
97.57 x 1.07 x 0.74 x 0.98 x 1.02 = 77.23
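The worked example above can be reproduced in a few lines. This is a minimal sketch: the numbers come from the slide, while the function and variable names are made up for illustration.

```python
from math import prod

# Baseline residual variance and multiplicative effects from the slide.
BASELINE = 97.57
EFFECTS = {
    "classifier 2": 1.07,
    "condition score 2": 0.74,
    "year 1": 0.98,
    "month 2": 1.02,
}


def residual_variance(baseline, multipliers):
    """Multiplicative model: baseline variance times each scaling factor."""
    return baseline * prod(multipliers)


variance = residual_variance(BASELINE, EFFECTS.values())
print(round(variance, 2))  # 77.23
```

A multiplier below 1 shrinks the residual variance relative to baseline and a multiplier above 1 inflates it, so the factors can be read directly as relative variances.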
8
The underlying model for calving ease (1-5 scale)
[Figure: distribution of the underlying liability (l) divided into categories 1 to 5; colored areas = probability of occurrence]
1 = unassisted calving; 5 = Caesarean section
9
Heterogeneous variances for calving ease (CE)?
Genetic evaluations are based on a threshold mixed effects model.
Underlying liability (l) is typically modeled as a function of fixed effects (e.g. calf sex) and random effects (herd-year-season) plus an IID residual (e).
Heteroskedastic theory was provided by Foulley and Gianola (1996).
They demonstrated that statistically significant calf sex by age of dam interactions for CE in homoskedastic error threshold models may be an artifact of heterogeneous residual variances.
10
ALLOWING FOR HETEROGENEOUS RESIDUAL VARIANCES IN THRESHOLD MODELS
[Figure: calving ease categories 1 to 5 on the liability scale]
Note how the probabilities of extreme outcomes depend particularly on the residual variance.
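To see why extreme categories are the most sensitive to the residual variance, one can compute category probabilities under a normal liability. This is a sketch only: the thresholds, the liability mean and the two residual standard deviations are hypothetical values, not estimates from any of the cited evaluations.

```python
from math import erf, sqrt


def normal_cdf(x, mean=0.0, sd=1.0):
    """Cumulative distribution function of a normal(mean, sd^2) variable."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))


def category_probabilities(thresholds, mean, sd):
    """Probability of each ordered category given liability ~ N(mean, sd^2)."""
    cuts = [normal_cdf(t, mean, sd) for t in thresholds]
    bounds = [0.0] + cuts + [1.0]
    return [upper - lower for lower, upper in zip(bounds[:-1], bounds[1:])]


# Hypothetical thresholds separating calving ease scores 1 to 5.
THRESHOLDS = [0.0, 1.0, 2.0, 3.0]

low_variance = category_probabilities(THRESHOLDS, mean=0.5, sd=1.0)
high_variance = category_probabilities(THRESHOLDS, mean=0.5, sd=1.5)

for score, (p_low, p_high) in enumerate(zip(low_variance, high_variance), start=1):
    print(f"score {score}: sd=1.0 -> {p_low:.3f}, sd=1.5 -> {p_high:.3f}")
```

With the same mean and thresholds, the larger residual standard deviation moves probability mass from the middle categories into scores 1 and 5.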
11
Genetic evaluations accounting for calving ease
French Holstein, Normande, and Montbeliarde breeds (Ducrocq, 2000)
Heteroskedasticity is breed dependent: ~15% lower residual variance in winter versus summer.
Larger residual variance ( x) for male calves.
Italian Holsteins (Canavesi et al., 2003)
Larger residual variance (1.03) for males.
Regional differences in residual variance.
Both evaluations consider only fixed effects models for residual variances.
12
Fixed and random effects for log residual variances in threshold models for calving ease
Kizilkaya and Tempelman (2005; GSE)
First parity Italian Piedmontese cattle
Parameter | Linear mixed model analysis of birth weights (estimate ± SE) | Threshold mixed model analysis of calving ease (estimate ± SE)
Sire variance | 1.13 ± 0.20 | 0.13 ± 0.02
MGS variance | 0.50 ± 0.11 | 0.02 ± 0.01
Sire-MGS covariance | 0.35 ± 0.11 |
CG variance | 1.68 ± 0.19 |
Male residual variance | 14.44 ± 1.03 | 1.09 ± 0.09
Female residual variance | 10.19 ± 0.73 | 0.71 ± 0.06
Sex difference in residual variances | 4.26 ± 0.53 | 0.38 ± 0.05
CV for herd-specific variances | 0.60 ± 0.09 | 0.74 ± 0.14
F = fixed effects and R = random effects for residual heteroskedasticity.
13
[Figure: estimates (•) of and 95% credible sets for herd-specific variances for CE relative to baseline (1.0); CV = 0.74]
Note: because a sire-MGS model was used, residual heteroskedasticity may be partly genetic.
14
Impact on calving ease EPD’s? Heteroskedastic vs. Homoskedastic Error
15
Impact of residual heteroskedasticity across CG on sire EPDs for birth weights (Kizilkaya and Tempelman, 2005)
[Figure: CV = 0.60; Herd 66 and Sire A highlighted]
All of Sire A’s progeny were from Herd 66.
Implications of ranking herds for product uniformity!
16
Multiple Breed Populations
Might naturally expect heterogeneous genetic variances (for different breed groups and different levels of heterozygosity).
17
Multibreed genetic modeling
Additive model (Lo et al., 1993)
For any individual j, the variance of its additive genetic effect $a_j$ is built from:
the expected allelic contribution due to Breed b in individual/parent j
the additive genetic variance of Breed b
the variance due to genetic segregation between Breeds b and b'
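The variance formula itself is missing from the transcript. The form below is a reconstruction of how the Lo et al. (1993) additive model is usually written and should be checked against that paper. The symbols are assumptions: $f_b^j$ is the expected proportion of alleles from Breed b in animal j, the superscripts s and d refer to the sire and dam of j, $\sigma^2_{A_b}$ is the additive variance of Breed b, and $\sigma^2_{S_{bb'}}$ is the segregation variance between Breeds b and b'.

```latex
\[
\operatorname{Var}(a_j)
  = \sum_{b=1}^{B} f_b^{\,j}\,\sigma^2_{A_b}
  + 2 \sum_{b=1}^{B-1} \sum_{b'>b}
      \left( f_b^{\,s} f_{b'}^{\,s} + f_b^{\,d} f_{b'}^{\,d} \right)
      \sigma^2_{S_{bb'}}
\]
```

The first sum weights each breed's additive variance by the animal's own breed composition; the second contributes only when at least one parent is itself a crossbred.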
18
Simple two breed example
[Figure: two-breed cross with parental breeds P1 and P2, their F1, and the F2]
Theory used for QTL mapping in pig breed crosses: better power than Haley-Knott regression (Perez-Enciso and Varona, 2000).
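For the two-breed case, the additive variance of P1, F1 and F2 animals can be sketched as follows. The code assumes the usual form of the Lo et al. (1993) additive model (breed variances weighted by breed composition, plus a segregation term that depends on the parents' compositions). The function name and all variance values are hypothetical.

```python
def additive_variance(f_sire, f_dam, var_a1, var_a2, var_seg):
    """Additive genetic variance of an animal in a two-breed population.

    f_sire, f_dam: proportion of Breed 1 alleles in the sire and the dam.
    var_a1, var_a2: additive variances of Breed 1 and Breed 2.
    var_seg: segregation variance between the two breeds.
    """
    f_animal = 0.5 * (f_sire + f_dam)
    breed_part = f_animal * var_a1 + (1.0 - f_animal) * var_a2
    # Segregation arises only from parents that carry alleles of both breeds.
    seg_part = 2.0 * (f_sire * (1.0 - f_sire) + f_dam * (1.0 - f_dam)) * var_seg
    return breed_part + seg_part


# Hypothetical variance components, for illustration only.
VAR_A1, VAR_A2, VAR_SEG = 10.0, 20.0, 4.0

var_p1 = additive_variance(1.0, 1.0, VAR_A1, VAR_A2, VAR_SEG)  # purebred P1
var_f1 = additive_variance(1.0, 0.0, VAR_A1, VAR_A2, VAR_SEG)  # P1 x P2
var_f2 = additive_variance(0.5, 0.5, VAR_A1, VAR_A2, VAR_SEG)  # F1 x F1
print(var_p1, var_f1, var_f2)
```

Under these assumptions the F1 and F2 have the same breed composition, and the F2 differs from the F1 only by the segregation variance.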
19
Application: Nelore-Hereford data (Fernando Cardoso PhD)
Data set: 22,717 postweaning gain (PWG) records on Hereford and Nelore x Hereford calves raised in Brazil (from )
40,082 animals (including ancestors in the pedigree file)
Breed compositions of animals with records ranged from purebred Hereford to 7/8 Nelore.
Purebred Herefords and F1s represent 90% of the data.
20
21
But maybe the residual variances are heterogeneous too!
Beef cattle performance is recorded across diverse production systems and environments, with data quality often compromised by, e.g., recording error, preferential treatment, disease, etc.
Hierarchical model constructions have been used independently to address heteroskedasticity (Foulley et al., 1992; San Cristobal et al., 1993) and robustness to outliers (Stranden and Gianola, 1998, 1999).
It is important to distinguish outliers from high-variance subclasses.
22
First stage: Specify the Linear Mixed Model
$y = X_1 b_1 + X_2 b_2 + Z_1 u_1 + Z_2 u_2 + e$ OR $y = Xb + Zu + e$
Fixed effects:
Non-genetic effects: $b_1$ (age of dam, length of PW period, calf sex)
Genetic effects: $b_2$ (breed additive, dominance, and recombination loss effects)
Random effects:
Random contemporary group effects: $u_1$
Random additive genetic effects: $u_2$
23
Second stage: Structural variance model
Components of the structural model for the residual variance, with examples:
Baseline
Regression parameters (examples: breed proportion, breed heterozygosity)
Fixed classification effects (example: calf sex)
Random classification effects (example: CG)
Lack-of-fit term with mean 0
24
Distributional assumptions on random effects
Location parameters: u includes 940 CG effects ($u_{CG}$) and 40,082 additive genetic effects ($u_A$):
$u_{CG} \sim N(0, I\sigma^2_{CG})$
$u_A \sim N(0, G(\phi))$, where $\phi$ includes breed-specific variances and segregation variances.
Residual variance: $v = [v_1, v_2, \ldots, v_{940}]$ includes random relative variances for the 940 CG.
$v_i \sim$ IID inverted-gamma with mean 1 and standard deviation $\sigma_v$
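An inverted-gamma distribution with mean 1 and a given standard deviation can be written in shape/scale form. The sketch below derives the two parameters from the standard inverse-gamma moments (mean = scale / (shape - 1), variance = mean^2 / (shape - 2)) and draws random relative variances. The value 0.5 for the standard deviation and all names are hypothetical; whether the original analysis used this exact parameterization is an assumption.

```python
import random
import statistics


def inverse_gamma_params(sd):
    """Shape and scale of an inverse-gamma with mean 1 and standard deviation sd."""
    shape = 2.0 + 1.0 / sd**2   # from variance = 1 / (shape - 2)
    scale = shape - 1.0         # from mean = scale / (shape - 1) = 1
    return shape, scale


def draw_relative_variances(n, sd, rng):
    """Draw n relative variances v_i ~ inverse-gamma(mean 1, standard deviation sd)."""
    shape, scale = inverse_gamma_params(sd)
    # If g ~ Gamma(shape, rate=scale), then 1/g ~ inverse-gamma(shape, scale).
    # random.gammavariate takes (shape, scale), so the scale passed is 1/rate.
    return [1.0 / rng.gammavariate(shape, 1.0 / scale) for _ in range(n)]


rng = random.Random(2004)
SD_V = 0.5  # hypothetical standard deviation of the relative variances
draws = draw_relative_variances(100_000, SD_V, rng)
print(inverse_gamma_params(SD_V))
print(statistics.fmean(draws), statistics.pstdev(draws))
```

Fixing the mean at 1 keeps the baseline variance identifiable: each $v_i$ is read as a contemporary group's variance relative to the average group.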
25
Need to consider one more thing
Recall the structural model for the residual variance.
What about $w_j$, the lack-of-fit term?
26
1) If $w_j \sim$ Gamma($\nu/2$, $\nu/2$), then this is equivalent to specifying a Student t error with $\nu$ degrees of freedom.
Demonstrated to be resistant to outliers (Stranden and Gianola, 1998; 1999).
2) If $w_j = 1$ for all j, then the error is normal.
Many other options! See Rosa et al. (2003).
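The link between the Gamma-distributed $w_j$ and the Student t error can be checked by simulation. This sketch assumes that the residual variance is divided by $w_j$ (the usual normal/independent construction); the slide's own equation is not in the transcript. The degrees of freedom are set to 7.33, the posterior mean reported in this talk for PWG, and the function name is made up.

```python
import random
import statistics


def draw_residual(nu, sigma, rng):
    """Draw e_j | w_j ~ N(0, sigma^2 / w_j) with w_j ~ Gamma(nu/2, rate nu/2)."""
    # random.gammavariate takes (shape, scale); a rate of nu/2 is a scale of 2/nu.
    w = rng.gammavariate(nu / 2.0, 2.0 / nu)
    return rng.gauss(0.0, sigma / w**0.5)


rng = random.Random(1998)
NU = 7.33     # degrees of freedom
SIGMA = 1.0   # hypothetical scale parameter

residuals = [draw_residual(NU, SIGMA, rng) for _ in range(200_000)]
sample_variance = statistics.pvariance(residuals)
t_variance = SIGMA**2 * NU / (NU - 2.0)  # variance of a scaled Student t
print(sample_variance, t_variance)
```

Marginally the residuals have variance $\sigma^2 \nu / (\nu - 2)$, larger than $\sigma^2$, which is how the heavier tails absorb outlying records. Setting every $w_j$ to 1 recovers the normal error.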
27
Now (At least) four distributional possibilities!
2 × 2 factorial based on distribution (normal versus Student t) and homoskedastic versus heteroskedastic residuals:
Homoskedastic normal
Homoskedastic Student t
Heteroskedastic normal
Heteroskedastic Student t
28
Some results
Based on Pseudo Bayes Factors (PBF), the heteroskedastic Student t model provided the best data fit; the homoskedastic normal model provided the worst.
Under the heteroskedastic Student t error model, the posterior mean of the degrees of freedom parameter ($\nu$) was 7.33 ± 0.48, indicating a heavier-tailed residual distribution than normal ($\nu = \infty$) for PWG data.
29
Heteroskedastic residual variance results from
Parameter | Est. | SE | 95% PPI
Fixed effects | | |
Gender ($\tau_1$) | 1.13 | 0.09 | (0.97, 1.31)
Nelore proportion ($\gamma_1$) | 1.15 | 0.45 | (0.48, 2.20)
Heterozygosity ($\gamma_2$) | 0.70 | 0.16 | (0.46, 1.06)
Random effects | | |
CG ($\sigma_v$) | 0.72 | 0.06 | (0.62, 0.86)
Evidence of genetic homeostasis? (Lerner, 1954)
30
What do these estimates mean again?
Example: a male F1 calf in a herd (Herd 5) with above-average variability ( )
Nelore proportion
Heterozygosity
Estimated residual variability:
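The numbers for this example are missing from the transcript, so the following is only a sketch of how such a calculation could go. It assumes a log-linear model in which the regression estimates act as powers of the covariates on the multiplicative scale, that the gender estimate (1.13) applies to males, and that an F1 has Nelore proportion 0.5 and heterozygosity 1. The Herd 5 multiplier of 1.5 is hypothetical; only the three estimates come from the results table in this talk.

```python
# Posterior estimates reported in the talk (multiplicative scale).
GENDER_EFFECT = 1.13          # assumed here to apply to males
NELORE_COEFFICIENT = 1.15     # per unit of Nelore proportion
HETEROZYGOSITY_COEFFICIENT = 0.70

# Hypothetical relative variance for a herd with above-average variability.
HERD_5_MULTIPLIER = 1.5


def relative_residual_variance(is_male, nelore_proportion, heterozygosity, herd_multiplier):
    """Residual variance relative to baseline under an assumed log-linear model."""
    sex_factor = GENDER_EFFECT if is_male else 1.0
    return (
        sex_factor
        * NELORE_COEFFICIENT**nelore_proportion
        * HETEROZYGOSITY_COEFFICIENT**heterozygosity
        * herd_multiplier
    )


# Male F1 calf: half Nelore, fully heterozygous for breed origin.
relative = relative_residual_variance(True, 0.5, 1.0, HERD_5_MULTIPLIER)
print(round(relative, 3))
```

Multiplying the result by the baseline residual variance (not shown in the transcript) would give the estimated residual variance for this calf.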
31
Posterior densities of heritabilities under homoskedastic normal error model
[Figure: posterior densities of heritabilities] (Cardoso and Tempelman, 2004)
32
Posterior densities of heritabilities under heteroskedastic normal error model
Why the “flip flop” from the homoskedastic normal error model? -> Some of the most variable herds were exclusively Herefords.
Posterior densities look very similar under the Student t heteroskedastic error model.
33
Where do we go from here? Genetic evaluation for residual variability?
Relevance: uniformity of product premium.
San Cristobal-Gaudy et al. (1998, 2001)
Sorensen and Waagepetersen (2003)
A: numerator relationship matrix
$\rho$: genetic correlation between location and log variance effects
34
Litter size in sheep (San Cristobal et al., 2003)
For litter size in pigs, a negative $\rho$ was estimated (Sorensen and Waagepetersen, 2003).
[Figure: sire EPD for litter size variability (v) versus sire EPD for litter size (u), with genetic correlation $\rho$]
35
Multiple trait analysis?
The standard for genetic evaluations today.
Perhaps genetic covariances/correlations between traits are heterogeneous across environments too.
Hopefully, these issues will be investigated further.
36
References
Cardoso, F.F., and R.J. Tempelman. Hierarchical Bayes multiple-breed inference with an application to genetic evaluation of a Nelore-Hereford population. Journal of Animal Science 82:
Canavesi, F., S. Biffani, and A.B. Samore. Revising the genetic evaluation for calving ease in the Italian Holstein Friesian. Interbull Bulletin 30 (2003)
Ducrocq, V. Calving ease evaluation of French dairy bulls with a heteroskedastic threshold model with direct and maternal effects. Interbull Bulletin 30 (2000)
Foulley, J.L. ECM approaches to heteroskedastic mixed models with constant variance ratios. Genetics Selection Evolution 29:
Foulley, J.L., M.S. Cristobal, D. Gianola, and S. Im. Marginal likelihood and Bayesian approaches to the analysis of heterogeneous residual variances in mixed linear Gaussian models. Computational Statistics & Data Analysis 13:
Foulley, J.L., and D. Gianola. Statistical analysis of ordered categorical data via a structural heteroskedastic threshold model. Genetics Selection Evolution 28 (1996)
Garrick, D.J., E.J. Pollak, R.L. Quaas, and L.D. Van Vleck. Variance heterogeneity in direct and maternal weight traits by sex and percent purebred for Simmental-sired calves. Journal of Animal Science 67:
Kachman, S.D., and R.W. Everett. A multiplicative model when the variances are heterogeneous. Journal of Dairy Science 76:
Kizilkaya, K., and R.J. Tempelman. A general approach to mixed effects modeling of residual variances in generalized linear mixed models. Genetics Selection Evolution (in press)
Lo, L.L., R.L. Fernando, and M. Grossman. Covariance between relatives in multibreed populations - additive-model. Theoretical and Applied Genetics 87:
Mark, T. Applied genetic evaluations for production and functional traits in dairy cattle. Journal of Dairy Science 87:
Meuwissen, T.H.E., G. DeJong, and B. Engel. Joint estimation of breeding values and heterogeneous variances of large data files. Journal of Dairy Science 79:
Perez-Enciso, M., and L. Varona. Quantitative trait loci mapping in F2 crosses between outbred lines. Genetics 155:
37
References (cont’d)
Robinson, G.K. That BLUP is a good thing - the estimation of random effects. Statistical Science
Robert-Granie, C., B. Bonati, D. Boichard, and A. Barbat. Accounting for variance heterogeneity in French dairy cattle genetic evaluation. Livestock Production Science 60:
Robert-Granie, C., B. Heude, and J.L. Foulley. Modeling the growth curve of Maine-Anjou beef cattle using heteroskedastic random coefficients models. Genetics Selection Evolution 43:
Rodriguez-Almeida, F.A., L.D. Van Vleck, L.V. Cundiff, and S.D. Kachman. Heterogeneity of variance by sire breed, sex, and dam breed in 200-day and 365-day weights of beef cattle from a top cross experiment. Journal of Animal Science 73:
Rosa, G.J.M., C.R. Padovani, and D. Gianola. Robust linear mixed models with normal/independent distributions and Bayesian MCMC implementation. Biometrical Journal 45:
San Cristobal, M., J.L. Foulley, and E. Manfredi. Inference about multiplicative heteroskedastic components of variance in a mixed linear Gaussian model with an application to beef cattle breeding. Genetics Selection Evolution 25:3-30.
San Cristobal-Gaudy, M., J.M. Elsen, L. Bodin, and C. Chevalet. Prediction of the response to a selection for canalisation of a continuous trait in animal breeding. Genetics Selection Evolution 30:
San Cristobal-Gaudy, M., L. Bodin, J.-M. Elsen, and C. Chevalet. Genetic components of litter size variability in sheep. Genetics Selection Evolution 33:
Sorensen, D.A., and R. Waagepetersen. Normal linear models with genetically structured residual heterogeneity: a case study. Genetical Research Cambr
Stranden, I., and D. Gianola. Attenuating effects of preferential treatment with Student t mixed linear models: a simulation study. Genetics Selection Evolution 30:
Stranden, I., and D. Gianola. Mixed effects linear models with t-distributions for quantitative genetic analysis: a Bayesian approach. Genetics Selection Evolution 31:25-42.