VISG – LARGE DATASETS Literature Review
Introduction – Genome Wide Selection
  Aka Genomic Selection
  Set of Markers
    – 10,000’s: enough to capture most genetic information
  ‘Training set’ of animals
    – phenotyped and genotyped
    – representative of industry
  Predictor
    – Over-specified: more variables (markers) than records, e.g. 1000 individuals
    – Robust model selection required
  Application
    – Predict in selection candidates
    – Maybe no phenotypes
    – Maybe no pedigrees
Introduction – Genome Wide Selection
  Prediction Methods
  Stepwise Regression
  gBLUP
    – Fit all markers as a random effect
    – $g_i \sim N(0, \sigma_g^2)$
  BayesA
    – $g_i \sim N(0, \sigma_{g_i}^2)$
    – prior: $\sigma_{g_i}^2 \sim S / \chi_\nu^2$ (choose $S$ and $\nu$)
  BayesB
    – similar to BayesA, except
    – a proportion of effects are zero
  Most investigations compare these
  Many variations (sometimes with the same name)
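A minimal sketch of the gBLUP idea, written as the equivalent marker-effect (ridge regression) form: all markers are fitted with one common variance, so effects solve (Z'Z + lambda I) g = Z'(y - mean), with lambda = Var(e) / Var(g). The function names, the simulated data, and lambda = 10 are illustrative assumptions, not values from the slides. The data deliberately have more markers than animals to show that the shrunken system still solves.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_snp_blup(Z, y, lam):
    """Shrunken marker effects: (Zc'Zc + lam*I) g = Zc'(y - mean(y)).

    Zc is Z with column means removed, so the overall mean is just mean(y).
    """
    n, p = len(Z), len(Z[0])
    col_means = [sum(row[j] for row in Z) / n for j in range(p)]
    Zc = [[row[j] - col_means[j] for j in range(p)] for row in Z]
    y_mean = sum(y) / n
    yc = [v - y_mean for v in y]
    lhs = [[sum(r[j] * r[k] for r in Zc) + (lam if j == k else 0.0)
            for k in range(p)] for j in range(p)]
    rhs = [sum(r[j] * v for r, v in zip(Zc, yc)) for j in range(p)]
    return y_mean, col_means, solve(lhs, rhs)

def predict(z, y_mean, col_means, g_hat):
    """Genomic prediction for a candidate that has genotypes only."""
    return y_mean + sum((zj - mj) * gj
                        for zj, mj, gj in zip(z, col_means, g_hat))

# Hypothetical training set: 30 animals, 60 markers coded 0/1/2.
rng = random.Random(1)
n_train, n_markers = 30, 60
Z = [[rng.randint(0, 1) + rng.randint(0, 1) for _ in range(n_markers)]
     for _ in range(n_train)]
true_g = [1.0] * 5 + [0.0] * (n_markers - 5)   # assumed true effects
y = [10.0 + sum(z * g for z, g in zip(row, true_g)) + rng.gauss(0.0, 1.0)
     for row in Z]

y_mean, col_means, g_hat = fit_snp_blup(Z, y, lam=10.0)

# Selection candidate with no phenotype and no pedigree.
candidate = [rng.randint(0, 1) + rng.randint(0, 1) for _ in range(n_markers)]
gebv = predict(candidate, y_mean, col_means, g_hat)
print(round(gebv, 2))
```

BayesA and BayesB replace the single lambda with marker-specific variances and are normally fitted by MCMC, which this sketch does not attempt.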
Literature
  Dairy applications review (Hayes et al., 2009)
  GWS in crops (Heffner, Sorrells, Jannink, 2009)
  Prediction in unrelateds (Meuwissen, 2009)
  Marker panels (Habier, 2009)
  Phenotypes (Harris & Johnson, unpub)
  +...
Issues
  National evaluations
  Long term gains
  LD or relationship tracking
  Multiple breeds
  Distance from Training to Application
  Marker Panels (subsets)
  Phenotypes (EBV-based)
  Non-additive effects
  Computing requirements
Methods
  gBLUP almost as good as Bayes(A) (dairy)
    – Interpretation(?): many genes of small effect
  Bayes methods better at using real LD (vs relatedness)
  Bayes(B) advantage greater with
    – Higher marker density
    – Higher Training-to-Application distance
    – Smaller Training set
  Mixture of 2 normals ~ BayesB
  Partial Least Squares
  Machine Learning
  Haplotype methods not used in practice yet
Marker Panels
  Evenly spaced panels
    – Track inheritance from parents (both SNP-chipped)
    – Will work with new traits
  Lasso methods popular
    – Shrinks small effects to zero
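A small sketch of why the lasso suits panel selection. In the simplified case of an orthonormal design, the lasso estimate of each effect is the least-squares estimate soft-thresholded at some value t, so effects smaller than t become exactly zero. The effect values and t = 0.5 are made up for illustration. Real SNP genotypes are correlated, so practical lasso fits need an iterative solver rather than this one-step formula, and the link between t and the penalty weight depends on how the objective is scaled.

```python
def soft_threshold(effect, t):
    """Lasso estimate of one effect under an orthonormal design (assumed)."""
    if abs(effect) <= t:
        return 0.0                      # small effects are set exactly to zero
    return effect - t if effect > 0 else effect + t

# Hypothetical least-squares marker effects.
ols_effects = [2.5, -0.25, 0.125, -1.5, 0.0625]
lasso_effects = [soft_threshold(b, 0.5) for b in ols_effects]

# Markers with a non-zero effect form the reduced panel.
panel = [j for j, b in enumerate(lasso_effects) if b != 0.0]

print(lasso_effects)  # [2.0, 0.0, 0.0, -1.0, 0.0]
print(panel)          # [0, 3]
```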
Other
  Combining marker and other information
    – Phenotype info, parent info
    – Index methods; ‘blending’
    – Important for seamless national evaluations
  Computing strategies
    – Tricks to reduce computation
    – Approximation rather than Iterative (MCMC) methods
Online resources
  Conferences
    – Statistical Genetics of Livestock for the Post-Genomic Era. UW-Madison, May
    – QTL/MAS Workshops. 2008
  Courses
    – Whole Genome Association and Genomic Selection. September 1-8, 2008, Salzburg, Austria
    – Use of High-density SNP Genotyping for Genetic Improvement of Livestock. Iowa State, June
Toy example
  5 SNP / 1000 individuals
  y = mu + SNP1 + e
    – mu = 10
    – SNP1 substitution effect = 10 / p = 0.5
    – Var(e) = 1
  1 block / 1000 iterations
  Runs in ~ 5 secs
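The toy data set can be simulated roughly as follows. This is a sketch under stated assumptions: genotypes are coded as allele counts 0/1/2 drawn in Hardy-Weinberg proportions, all five SNPs share p = 0.5 (the slide gives p only for SNP1), and the seed is arbitrary. The "1 block / 1000 iterations" run on the slide is not reproduced; ordinary least squares on all five SNPs is swapped in as a quick check that the simulated effects can be recovered.

```python
import random

rng = random.Random(42)                 # arbitrary seed (assumption)
N_IND, N_SNP = 1000, 5
MU, EFFECT, P, SD_E = 10.0, 10.0, 0.5, 1.0   # Var(e) = 1, so SD = 1

def genotype():
    """Allele count 0/1/2: two independent draws with frequency P."""
    return (rng.random() < P) + (rng.random() < P)

geno = [[genotype() for _ in range(N_SNP)] for _ in range(N_IND)]
# y = mu + SNP1 + e: only the first SNP has an effect.
y = [MU + EFFECT * row[0] + rng.gauss(0.0, SD_E) for row in geno]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

# Least squares via the normal equations X'X b = X'y, with an intercept.
X = [[1.0] + [float(g) for g in row] for row in geno]
k = N_SNP + 1
XtX = [[sum(r[a] * r[b] for r in X) for b in range(k)] for a in range(k)]
Xty = [sum(r[a] * yi for r, yi in zip(X, y)) for a in range(k)]
estimates = solve(XtX, Xty)
mu_hat, snp_hat = estimates[0], estimates[1:]

print("mu:", round(mu_hat, 2))
print("SNP effects:", [round(b, 2) for b in snp_hat])
```

With 1000 records and only five markers the model is not over-specified, so plain least squares is adequate here; the shrinkage methods matter once markers outnumber records.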