VISG – LARGE DATASETS Literature Review

Introduction – Genome Wide Selection
Aka Genomic Selection
Set of Markers
  – 10,000’s: enough to capture most genetic information
‘Training set’ of animals
  – phenotyped and genotyped
  – representative of industry
Predictor
  – Over-specified: more variables (markers) than individuals, e.g. 1000 individuals
  – Robust model selection required
Application
  – Predict in selection candidates
    – Maybe no phenotypes
    – Maybe no pedigrees

Introduction – Genome Wide Selection
Prediction Methods
Stepwise Regression
gBLUP
  – Fit all markers as a random effect
  – $g_i \sim N(0, \sigma_g^2)$
BayesA
  – $g_i \sim N(0, \sigma_{g_i}^2)$
  – prior: $\sigma_{g_i}^2 \sim S / \chi_\nu^2$ (choose $S$ and $\nu$)
BayesB
  – similar to BayesA, except
  – proportion $\pi$ of effects are zero
Most investigations compare these
Many variations (sometimes with the same name)
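
The three priors on this slide can be compared by drawing marker effects from each. The following is a minimal sketch, not the method of any of the cited papers: the values of `sigma2_g`, `S`, `nu` and `pi` are hypothetical, and the BayesA variance is drawn as $S/\chi_\nu^2$ exactly as written on the slide (some authors fold a factor of $\nu$ into the scale).

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable


def draw_gblup(sigma2_g):
    """gBLUP: every marker effect shares one variance, g_i ~ N(0, sigma2_g)."""
    return rng.gauss(0.0, sigma2_g ** 0.5)


def draw_bayes_a(S, nu):
    """BayesA: each marker gets its own variance, sigma2_gi ~ S / chi2_nu."""
    chi2 = rng.gammavariate(nu / 2.0, 2.0)  # chi-square with nu d.f. (gamma, shape nu/2, scale 2)
    sigma2_gi = S / chi2
    return rng.gauss(0.0, sigma2_gi ** 0.5)


def draw_bayes_b(S, nu, pi):
    """BayesB: as BayesA, except a proportion pi of effects are exactly zero."""
    if rng.random() < pi:
        return 0.0
    return draw_bayes_a(S, nu)


n_markers = 10000  # order of magnitude taken from the slides
# All parameter values below are illustrative assumptions.
gblup = [draw_gblup(sigma2_g=0.01) for _ in range(n_markers)]
bayes_a = [draw_bayes_a(S=0.02, nu=4.2) for _ in range(n_markers)]
bayes_b = [draw_bayes_b(S=0.02, nu=4.2, pi=0.95) for _ in range(n_markers)]

frac_zero_b = sum(e == 0.0 for e in bayes_b) / n_markers
print("BayesB proportion of zero effects:", frac_zero_b)
print("largest |effect| gBLUP / BayesA:",
      max(abs(e) for e in gblup), max(abs(e) for e in bayes_a))
```

Under gBLUP no effect is exactly zero and all are shrunk alike. BayesA allows heavier tails through marker-specific variances, and BayesB adds a point mass at zero.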

Literature
Dairy applications review (Hayes et al., 2009)
GWS in crops (Heffner, Sorrells & Jannink, 2009)
Prediction in unrelated individuals (Meuwissen, 2009)
Marker panels (Habier, 2009)
Phenotypes (Harris & Johnson, unpublished)
+ ...

Issues
National evaluations
Long term gains
LD or relationship tracking
Multiple breeds
Distance from Training to Application
Marker Panels (subsets)
Phenotypes (EBV-based)
Non-additive effects
Computing requirements

Methods
gBLUP almost as good as Bayes(A) (dairy)
  – Interpretation(?): many genes of small effect
Bayes methods better at using real LD (vs relatedness)
Bayes(B) advantage greater with
  – Higher marker density
  – Higher Training-to-Application distance
  – Smaller Training set
Mixture of 2 normals ~ BayesB
Partial Least Squares
Machine Learning
Haplotype methods not used in practice yet

Marker Panels
Evenly spaced panels
  – Track inheritance from parents (both SNP-chipped)
  – Will work with new traits
Lasso methods popular
  – Shrinks small effects to zero
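
How the lasso "shrinks small effects to zero" can be seen from its soft-thresholding rule. The sketch below assumes the idealised case of orthonormal predictors, where the lasso solution is the least-squares estimate soft-thresholded at a penalty `lam`. Real marker data are correlated and need an iterative solver. The effect values and `lam` are made up for illustration.

```python
def soft_threshold(b, lam):
    """Move b towards zero by lam; anything within lam of zero becomes exactly zero."""
    if b > lam:
        return b - lam
    if b < -lam:
        return b + lam
    return 0.0


# Hypothetical least-squares marker effects: two large, three small.
ols_effects = [10.0, 0.3, -0.2, 0.05, -4.0]
lam = 0.5  # hypothetical penalty

lasso_effects = [soft_threshold(b, lam) for b in ols_effects]
selected = [j for j, b in enumerate(lasso_effects) if b != 0.0]

print("lasso effects:", lasso_effects)
print("markers kept for the panel:", selected)
```

Only the markers with non-zero effects after thresholding would be kept for a reduced panel, which is why lasso-type methods suit marker subset selection.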

Other
Combining marker and other information
  – Phenotype info, parent info
  – Index methods; ‘blending’
  – Important for seamless national evaluations
Computing strategies
  – Tricks to reduce computation
  – Approximation rather than Iterative (MCMC) methods

Online resources
Conferences
  – Statistical Genetics of Livestock for the Post-Genomic Era. UW-Madison, May
  – QTL/MAS Workshops. 2008
Courses
  – Whole Genome Association and Genomic Selection. September 1-8, 2008, Salzburg, Austria.
  – Use of High-density SNP Genotyping for Genetic Improvement of Livestock. Iowa State, June

Toy example
5 SNP / 1000 individuals
y = mu + SNP1 + e
  – mu = 10
  – SNP1 substitution effect = 10 / p = 0.5
  – Var(e) = 1
1 block / 1000 iterations
Runs in ~ 5 secs
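
The data for this toy example can be simulated as below. This is a rough sketch under stated assumptions: `p` is taken to be the allele frequency of every SNP, genotypes are coded 0/1/2 and drawn independently, and the fit is plain single-marker least squares rather than the MCMC run the slide refers to, whose details are not given. The function names and the seed are hypothetical.

```python
import random


def simulate(n=1000, n_snp=5, mu=10.0, effect=10.0, p=0.5, seed=1):
    """Simulate y = mu + effect * SNP1 + e with Var(e) = 1; only SNP1 is causal."""
    rng = random.Random(seed)
    # Genotype = number of copies of the allele (0, 1 or 2) at each SNP.
    geno = [[sum(rng.random() < p for _ in range(2)) for _ in range(n_snp)]
            for _ in range(n)]
    y = [mu + effect * row[0] + rng.gauss(0.0, 1.0) for row in geno]
    return geno, y


def slope(x, y):
    """Least-squares slope of y on a single predictor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


geno, y = simulate()
# One regression per SNP: SNP1 should come out near 10, the others near 0.
estimates = [slope([row[j] for row in geno], y) for j in range(5)]
for j, est in enumerate(estimates, start=1):
    print(f"SNP{j}: estimated substitution effect = {est:.2f}")
```

The non-causal SNPs give noisier estimates than SNP1 because single-marker regression leaves the SNP1 variance in their residual. Fitting all markers jointly, as in the methods reviewed above, removes that noise.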