Added value of whole-genome sequence data to genomic predictions in dairy cattle Rianne van Binsbergen 1,2, Mario Calus 1, Chris Schrooten 3, Fred van.


Added value of whole-genome sequence data to genomic predictions in dairy cattle Rianne van Binsbergen 1,2, Mario Calus 1, Chris Schrooten 3, Fred van Eeuwijk 2, Roel Veerkamp 1, Marco Bink 2 1 Animal Breeding & Genetics Centre, Wageningen UR (NL) 2 Biometris, Wageningen UR (NL) 3 CRV (cattle breeding company), Arnhem (NL)

Genomic Prediction in agricultural species
Source: Goddard & Hayes (2009) Nature Reviews Genetics 10:381
Reference population:
1) Estimate effects for each SNP (w)
2) Generate a prediction equation that combines all the marker genotypes with their effects to predict the breeding value of each individual
Each SNP is represented by a variable (x), which takes the values 0 [A A], 1 [A B] or 2 [B B].
Apply the prediction equation to a group of individuals that have genotypes but not phenotypes:
● Estimated genomic breeding values
● Select the best individuals for breeding
Advantages:
● Select at an early age (before phenotypes are available)
● Save the cost of phenotyping candidates
● Increase accuracy of predicted breeding values
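The prediction equation on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' software: the SNP effects, the bull names and the helper functions `encode_genotype` and `predict_breeding_value` are hypothetical, and the only thing taken from the slide is the 0/1/2 coding of x and the sum of genotype times effect.

```python
def encode_genotype(genotype):
    """Code a genotype as the number of B alleles: AA -> 0, AB -> 1, BB -> 2."""
    return genotype.count("B")


def predict_breeding_value(genotypes, snp_effects, mean=0.0):
    """Prediction equation: mean + sum over SNPs of x * w."""
    x = [encode_genotype(g) for g in genotypes]
    return mean + sum(xi * wi for xi, wi in zip(x, snp_effects))


# Hypothetical SNP effects (w), as if estimated in a reference population.
snp_effects = [0.5, -0.2, 0.1]

# Hypothetical selection candidates with genotypes but no phenotypes.
candidates = {
    "bull_1": ["AA", "AB", "BB"],
    "bull_2": ["BB", "BB", "AA"],
}

gebv = {name: predict_breeding_value(g, snp_effects) for name, g in candidates.items()}

# Rank candidates from highest to lowest estimated genomic breeding value.
ranked = sorted(gebv, key=gebv.get, reverse=True)
```

In practice the effects come from a statistical model fitted on the reference population; here they are hard-coded only to show how genotypes and effects combine.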

One seminal paper on Genomic Prediction
Simulation study
Dense marker maps: SNP markers at 1 cM density
Prediction accuracy:
● Least Squares method: 0.32
● Genomic BLUP method: 0.73
● Bayesian methods (A, B): 0.85
Conclusion: “selection on genetic values predicted from markers could substantially increase the rate of genetic gain in animals and plants, especially if combined with reproductive techniques to shorten the generation interval”

Another (seminal) paper on Genomic Prediction
“Only few SNPs were useful for predicting the trait [because they were in linkage disequilibrium (LD) with mutations causing variation in the trait] while many SNPs were not useful.”
Prediction of Total Genetic Value Using Genome-Wide Dense Marker Maps. T. H. E. Meuwissen, B. J. Hayes and M. E. Goddard
Higher accuracy in genomic predictions since the causal mutation is included (assumption):
● No dependency on LD
● Persistency across generations
● Genomic prediction across breeds
“In the case of whole-genome sequence data, the polymorphisms that are causing the genetic differences between the individuals are among those being analyzed.”

Genomic predictions from whole-genome sequence data
● Tremendous increase in number of SNPs (more noise)
● Large (sequence) data are required
Solution:
● Sequence a core set of individuals (e.g. founders)
● Impute whole-genome sequence genotypes of other individuals
Accuracy of imputation to whole-genome sequence data was generally high for imputation from the 777K SNP panel (Van Binsbergen et al., Genet Sel Evol 2014, in press).
This presentation: first results of genomic prediction with imputed whole-genome sequence data for 5503 bulls with accurate phenotypes

Dataset: SNP genotypes & trait phenotypes
Genotypes:
● 1000 bull genomes project: 429 bulls (multiple breeds), 28M SNP genotypes
● 5503 Holstein Friesian bulls: 777K SNP genotypes (Illumina BovineHD BeadChip)
● Imputation with Beagle v4 software
● After imputation: 5503 Holstein Friesian bulls, 12M SNP genotypes (MAF > [value missing]; imputation accuracy > 0.05)
Phenotypes: de-regressed progeny-based proofs (DRP 1) and associated effective daughter contributions (EDC 2) for:
● Somatic cell score (SCS)
● Interval between first and last insemination (IFL)
● Protein yield (PY)
1 VanRaden et al (J Dairy Sci)
2 VanRaden and Wiggans 1991 (J Dairy Sci)

Prediction reliability
= squared correlation between original phenotype (DRP) and estimated genetic values (GEBV)
Validation population:
● Youngest bulls with EDC [symbol missing] 0
● Mainly sons of bulls in training population
● Mimics breeding practice
Both genotype sets use the same split:
● 5503 Holstein Friesian bulls, 777K SNP genotypes (Illumina BovineHD BeadChip): training population 4322 old bulls, validation population 1181 young bulls
● 5503 Holstein Friesian bulls, 12M SNP genotypes (MAF > [value missing]; imputation accuracy > 0.05): training population 4322 old bulls, validation population 1181 young bulls
Differences?
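The reliability measure defined on this slide, the squared correlation between DRP and GEBV in the validation bulls, can be written out as follows. This is a small sketch with made-up numbers; the function names `pearson_correlation` and `prediction_reliability` are illustrative and not taken from the presentation.

```python
import math


def pearson_correlation(a, b):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def prediction_reliability(drp, gebv):
    """Squared correlation between de-regressed proofs and genomic predictions."""
    r = pearson_correlation(drp, gebv)
    return r * r


# Hypothetical values for three validation bulls.
drp = [1.0, 2.0, 3.0]
gebv = [1.0, 3.0, 2.0]
reliability = prediction_reliability(drp, gebv)  # correlation 0.5, reliability 0.25
```

Only the validation bulls enter this calculation; the training bulls are used to estimate the SNP effects that produce the GEBV.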

Genomic prediction: 2 methods
GBLUP
● Genome-enabled best linear unbiased prediction
● Assumes the distribution of QTL effects is close to the infinitesimal model (all SNPs have an equally small effect)
● Build a genomic relationship matrix to model the variance-covariance structure
BSSVS
● Bayes stochastic search variable selection
● Large number of SNPs with tiny (close to zero) effects and a few SNPs with moderate effects (= mixture of two Normal distributions)
● Implementation via Markov chain Monte Carlo (MCMC) simulation algorithms (computer intensive)
● 3 chains of 60,000 cycles (10,000 cycles burn-in)
Reference: Calus M (2014). Right-hand-side updating for fast computing of genomic breeding values. Genetics Selection Evolution 46(1).
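To make the GBLUP step "build a genomic relationship matrix" concrete, here is a minimal sketch. The slide does not say which formula was used; the code below assumes VanRaden's (2008) first method, G = ZZ' / (2 Σ p(1 − p)), with allele frequencies p estimated from the sample itself. The function name and the toy genotypes are hypothetical.

```python
def genomic_relationship_matrix(genotypes):
    """Genomic relationship matrix from 0/1/2 genotypes (rows: animals, columns: SNPs).

    Assumes VanRaden's first method with allele frequencies taken from the sample.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Frequency of the counted allele at each SNP.
    freqs = [sum(row[j] for row in genotypes) / (2.0 * n) for j in range(m)]
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    if scale == 0.0:
        raise ValueError("all SNPs are monomorphic; the matrix is undefined")
    # Centre each genotype by twice the allele frequency.
    z = [[row[j] - 2.0 * freqs[j] for j in range(m)] for row in genotypes]
    return [
        [sum(z[i][k] * z[l][k] for k in range(m)) / scale for l in range(n)]
        for i in range(n)
    ]


# Toy data: 3 animals, 3 SNPs.
toy_genotypes = [
    [0, 1, 2],
    [1, 1, 0],
    [2, 0, 1],
]
G = genomic_relationship_matrix(toy_genotypes)
```

With 5503 bulls the matrix is 5503 by 5503 regardless of whether 777K or 12M SNPs are used; the number of SNPs only affects the cost of building it.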

Computation
GBLUP
● 777K SNP: HPC, 1 node; ~3 hours; ~32 GB RAM
● 12M SNP: HPC, 12 nodes; ~6 hours; ~600 GB RAM
BSSVS (per MCMC chain; 3 chains of 60,000 cycles, 10,000 cycles burn-in)
● 777K SNP: Windows, 1 CPU; ~5 days; ~1.6 GB RAM
● 12M SNP: HPC, 1 node; ~50 days; ~32 GB RAM
Hardware:
● Windows 7 Enterprise desktop PC: 32 CPU, 8 GB RAM/CPU (clock speed 2.60 GHz)
● HPC Linux cluster: normal nodes, 64 GB/node (2.60 GHz); 2 fat nodes, 1 TB RAM/node (2.20 GHz)

Results: Prediction Reliability
BSSVS: average over 3 chains of 60,000 cycles (10,000 cycles burn-in)
* Based on 45,000 cycles
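Averaging over chains after a burn-in period can be sketched as follows. This is an illustration of the general idea only: the slide does not state how the chains were combined, so the code assumes each chain's post-burn-in mean is computed first and the chain means are then averaged. The function name and the toy samples are hypothetical.

```python
def average_over_chains(chains, burn_in):
    """Discard the first `burn_in` samples of each chain, then average the chain means."""
    chain_means = []
    for samples in chains:
        kept = samples[burn_in:]
        if not kept:
            raise ValueError("burn-in removes every sample in a chain")
        chain_means.append(sum(kept) / len(kept))
    return sum(chain_means) / len(chain_means)


# Toy example: two short chains, first two samples discarded as burn-in.
toy_chains = [
    [10.0, 10.0, 1.0, 2.0, 3.0],
    [10.0, 10.0, 2.0, 3.0, 4.0],
]
estimate = average_over_chains(toy_chains, burn_in=2)  # chain means 2.0 and 3.0
```

With the settings on the slide, each chain would keep 50,000 of its 60,000 cycles.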

Results: Prediction Reliability
* Based on 45,000 cycles

BSSVS: Convergence & SNP effects
3 chains of 60,000 cycles (10,000 cycles burn-in); sequence: 45,000 cycles
Panels for 777K SNP and 12M SNP: trace of variance of SNP effects; Bayes Factor for SNP effects

Suitability of BSSVS model?
● Large number of SNPs with tiny and a few SNPs with moderate effects
  ● Sequence data: really large number of SNPs with tiny effects
● Captures too much signal?
● Another Bayesian prediction model: Bayes-C
  ● Large number of SNPs with NO effect and a few SNPs with moderate effects
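The difference between the two priors described on this slide can be illustrated by drawing SNP effects from each. This is a sketch under stated assumptions: the mixing proportion `pi` and the standard deviations are hypothetical values chosen for illustration, not the priors used in the study, and the functions are not part of any existing package.

```python
import random


def draw_bssvs_effect(rng, pi=0.05, sd_moderate=1.0, sd_tiny=0.01):
    """BSSVS-style prior: mixture of two Normals, so every SNP has a nonzero effect."""
    if rng.random() < pi:
        return rng.gauss(0.0, sd_moderate)  # the few SNPs with moderate effects
    return rng.gauss(0.0, sd_tiny)          # the many SNPs with tiny effects


def draw_bayesc_effect(rng, pi=0.05, sd_moderate=1.0):
    """Bayes-C-style prior: most SNPs have exactly no effect."""
    if rng.random() < pi:
        return rng.gauss(0.0, sd_moderate)
    return 0.0


rng = random.Random(1)
bssvs_effects = [draw_bssvs_effect(rng) for _ in range(1000)]
bayesc_effects = [draw_bayesc_effect(rng) for _ in range(1000)]

# Under Bayes-C most effects are exactly zero; under BSSVS they are small but nonzero.
n_zero_bssvs = sum(1 for e in bssvs_effects if e == 0.0)
n_zero_bayesc = sum(1 for e in bayesc_effects if e == 0.0)
```

With millions of SNPs, the many tiny nonzero effects under the BSSVS-style prior can add up, which is one way to read the slide's question about the model's suitability for sequence data.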

Concentrate on single chromosome (BTA 6)
MCMC convergence: BSSVS and Bayes-C, for 777K SNP and 12M SNP

Concentrate on single chromosome (BTA 6)
Signal of QTL effects: BSSVS and Bayes-C, for 777K SNP and 12M SNP
Reliability estimates (columns: BSSVS, Bayes-C):
● BovineHD: 0.328 [remaining values missing]
● Sequence: [values missing]

Conclusions
● Genomic prediction using sequence data becomes reality
  ● However, sequence data requires intensive computation
● Need for faster algorithms
● Use of sequence data did not improve prediction reliability
  ● Convergence issues with BSSVS
● Longer chains may yield better results
● BSSVS slightly better compared to GBLUP
● Preliminary results for BTA 6 hint that the Bayes-C method may work better (than BSSVS) for sequence data
Next steps: did we bet on the wrong horse, named BSSVS?
● Review choice of priors in BSSVS model
● Apply Bayes-C model to whole-genome sequence data

Thanks!
Acknowledgments: 1000 bull genomes project

De-regressed proofs (DRP) and effective daughter contribution (EDC)
Terms shown: published reliability of EBV; parent average; estimated breeding value; effective daughter contribution
References: VanRaden and Wiggans 1991 (J Dairy Sci); VanRaden et al (J Dairy Sci)