Published by Erica Lewis
1
Strategies to choose from millions of imputed sequence variants
J. R. O’Connell¹ and P. M. VanRaden²
¹ University of Maryland School of Medicine, Baltimore, MD, USA
² Animal Genomics and Improvement Laboratory, Agricultural Research Service, USDA, Beltsville, MD, USA
joconnel@medicine.umaryland.edu
Jeff O’Connell, Interbull annual meeting, Orlando, FL, July 2015
2
Introduction
• Whole-genome sequencing (WGS) costs are rapidly declining
• Large numbers of animals will be sequenced in the next few years
• How will WGS be used in genomic evaluations?
3
Applications of WGS data
• Develop reference panels for imputation using SNP chip
• Discover QTLs
• Discover rare and de novo mutations with deleterious effects for surveillance
• Build better SNP chips
4
WGS implementation considerations
• Cost of higher density SNP chips to industry
• Imputation cost for weekly/monthly genomic evaluations
• Redesigning chip incurs cost of generating new data for imputation
    ◦ Build off the 60K chip
5
Project goal: Prepare for WGS
• Develop programs to simulate genotypes and QTLs
• Develop programs to process sequence data
• Compare strategies for variant selection
• Predict gains in reliability
• Determine computational footprint for large populations
6
Data
• 26,984 HOL bulls in U.S. reference population
• 112,905 animals in the pedigree
• 30 million simulated variants, 10,000 QTLs
• 30 equal-length chromosomes (100 Mbases)
• 3 different chip densities (HD, MD, LD)
• 5 independent traits (same QTL locations)
7
Genotype simulation and pruning
• Simulate 30 million genotypes plus QTLs in 1000 bulls using genosim.f90
• Remove loci with MAF 0.95
• Keep 500K loci near QTLs (genes) + 600K chip + 60K chip
• 8,403,858 loci remain
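The pruning step might look roughly like the sketch below. It is not genosim.f90 or the authors' pruning program: the function name `prune_loci`, the 0/1/2 genotype coding, the comparison against the last kept locus only, and both cutoffs (`min_maf`, `max_r`) are assumptions made for illustration.

```python
import numpy as np

def prune_loci(geno, min_maf=0.01, max_r=0.95):
    """Return indices of loci to keep.

    geno: (animals x loci) array of allele counts coded 0/1/2.
    min_maf and max_r are illustrative cutoffs, not values confirmed by the slides.
    """
    freq = geno.mean(axis=0) / 2.0
    maf = np.minimum(freq, 1.0 - freq)
    keep = []
    last = None
    # Walk along the chromosome, visiting only loci that pass the MAF filter.
    for j in np.flatnonzero(maf >= min_maf):
        if last is not None:
            r = np.corrcoef(geno[:, last], geno[:, j])[0, 1]
            if abs(r) > max_r:
                continue  # nearly redundant with the last kept locus
        keep.append(int(j))
        last = j
    return np.array(keep, dtype=int)
```

With 30 million loci a production program would stream genotypes from disk rather than hold a dense array in memory; the sketch only shows the filtering logic.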
8
QTL effect distribution (% variance)

Top QTLs   Trait 1   Trait 2   Trait 3   Trait 4   Trait 5     Avg
1              3.7        13       3.9       3.5       5.6     5.9
10            20.0        34      21.0      23.0      29.0    25.4
100           57.0        63      59.0      57.0      62.0    59.6
1,000         90.0        92      92.0      92.0      93.0    91.8
10,000       100.0       100     100.0     100.0     100.0   100.0
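The percentages in this table are cumulative shares of QTL variance held by the largest QTLs. A minimal sketch of that calculation follows; the helper name `top_qtl_share`, the simulated allele frequencies and the heavy-tailed effect distribution are all assumptions, and the per-QTL variance is taken as 2p(1-p)a².

```python
import numpy as np

def top_qtl_share(qtl_var, ks=(1, 10, 100, 1000, 10000)):
    """Percent of total QTL variance explained by the k largest QTLs."""
    ordered = np.sort(np.asarray(qtl_var, dtype=float))[::-1]
    cum = np.cumsum(ordered) / ordered.sum() * 100.0
    # Clamp k so that asking for more QTLs than exist returns 100%.
    return {k: float(cum[min(k, len(cum)) - 1]) for k in ks}

rng = np.random.default_rng(0)
p = rng.uniform(0.01, 0.99, size=10_000)   # hypothetical allele frequencies
a = rng.standard_t(df=3, size=10_000)      # hypothetical heavy-tailed effects
shares = top_qtl_share(2 * p * (1 - p) * a**2)
```

The shape of the effect distribution controls how steeply the shares rise: the heavier the tail, the more variance the top few QTLs capture.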
9
QTL correlation to 8M SNP set
10
Bull genotypes: Imputation findhap.f90

Bulls     Genotype density               Variants
1,000     Sequenced, pruned to           8,403,858
773       High density                   600,000
24,863    Medium density                 60,000
348       Low density                    12,000
26,984    Total bulls, all imputed to    8,403,858
17,896    Phenotyped (old)               DYD
9,088     Validation (young)             True BV
11
Computation required
Step, Processors, Time (hours), Memory (Gbyte), Disk (Gbyte)
Simulate 30M 15621032
Prune linkage 1012710
Impute 8M 203813220
12
Study 1: Select 25K from 600K chip
• Compare 60K and 600K REL using same data
• Choose best markers from 600K, add to 60K
    ◦ Select largest 5,000 for each of 5 traits
    ◦ Multiple regression: By effect size, effect variance
    ◦ GWA: By p-value significance
    ◦ Merge and remove duplicates, adding 23K
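The selection described on this slide can be sketched as a per-trait top-k followed by a set union. The function name `select_markers` and its arguments are hypothetical, and the sketch assumes that "remove duplicates" covers both markers picked for more than one trait and markers already on the 60K chip. The score can be an absolute effect, an effect variance, or -log10 of a GWA p-value.

```python
import numpy as np

def select_markers(scores, chip_idx, n_per_trait=5000):
    """Pick the n_per_trait highest-scoring markers per trait, then merge.

    scores: (traits x markers) array; larger absolute value = more important.
    chip_idx: indices of markers already on the base chip (excluded from output).
    """
    scores = np.abs(np.asarray(scores, dtype=float))
    n = min(n_per_trait, scores.shape[1])
    picked = set()
    for trait_scores in scores:
        # argpartition puts the n largest values in the last n slots, unordered.
        top = np.argpartition(trait_scores, -n)[-n:]
        picked.update(top.tolist())
    return sorted(picked - set(chip_idx))
```

Five traits at 5,000 markers each give at most 25,000 candidates; overlap between the lists is what reduces the count actually added.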
13
Results: REL from PA, 60K, 85K, 600K

                          60K + 25K
Trait    PA      60K      GWA     Size    Var     600K
1        24.4    77.9     79.2    81.6    81.3    80.3
2        31.2    77.9     79.3    81.4    81.2    80.1
3        32.7    78.3     79.5    81.3    81.5    80.4
4        23.3    76.6     77.7    80.2    79.8    78.6
5        30.4    78.3     80.0    82.5    82.2    81.2
Avg      28.4    77.8     79.1    81.4    81.2    80.1
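As a reading aid, the gain of each marker set over the 60K chip can be computed from the Avg row of the results table. The numbers below are copied from that row; only the dictionary layout and variable names are introduced here.

```python
# Average reliabilities (%) taken from the Avg row of the results table.
rel = {
    "PA": 28.4,
    "60K": 77.8,
    "60K+25K GWA": 79.1,
    "60K+25K Size": 81.4,
    "60K+25K Var": 81.2,
    "600K": 80.1,
}

# Gain in reliability (percentage points) relative to the 60K chip.
gain_over_60k = {
    name: round(value - rel["60K"], 1)
    for name, value in rel.items()
    if name not in ("PA", "60K")
}
```

On these averages, adding markers chosen by multiple regression to the 60K chip gives a larger gain than using the full 600K set.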
14
Genome-wide association
• Single SNP regression within mixed model that accounts for the polygenic effect using pedigree kinship
    ◦ Y = Xb + SNP + a + e
• Multiple testing adjustment leads to stringent cutoffs for significance
• Focus on estimation rather than prediction
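A single-SNP test under the model Y = Xb + SNP + a + e can be sketched with generalized least squares. This is an illustration under strong assumptions, not the authors' software: the variance components `var_a` and `var_e` are treated as known, `A` stands for a pedigree relationship matrix supplied by the caller, the test is a two-sided normal (Wald) test, and the function name is hypothetical.

```python
import math
import numpy as np

def single_snp_test(y, X, snp, A, var_a, var_e):
    """GLS test of one SNP in y = Xb + snp*beta + a + e.

    var(a) = A * var_a (pedigree relationships), var(e) = I * var_e.
    Variance components are taken as known, a simplifying assumption.
    """
    y = np.asarray(y, dtype=float)
    V = np.asarray(A, dtype=float) * var_a + np.eye(len(y)) * var_e
    W = np.column_stack([X, snp])          # fixed effects plus the SNP covariate
    Vinv_W = np.linalg.solve(V, W)
    Vinv_y = np.linalg.solve(V, y)
    C = np.linalg.inv(W.T @ Vinv_W)        # (W' V^-1 W)^-1
    est = C @ (W.T @ Vinv_y)
    beta = float(est[-1])
    se = math.sqrt(C[-1, -1])
    p_value = math.erfc(abs(beta / se) / math.sqrt(2.0))  # two-sided normal test
    return beta, se, p_value
```

Solving against the dense V for every SNP costs far too much at the scale of 8 million variants and 27 thousand bulls; the sketch is meant to make the model concrete, not to be efficient.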
15
GWA trait 2 imputed QTLs
<5% QTL < 10e-5
16
Study 2: Select sequence variants
• 8M: search everywhere
• 1M: search by bioinformatics using 2.5kb window on QTL
• QTLs: perfect functional knowledge
• Include 60K chip for imputation, but assign all or most prior variance to the selected variants
    ◦ Selection of variants may affect optimal shape/choice of prior distribution
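The window-based search can be sketched as a nearest-QTL distance test on one chromosome. The function name is hypothetical, and the sketch assumes the window means 2,500 bases on either side of a QTL; the slide does not say whether 2.5kb is the half-width or the full width.

```python
import numpy as np

def variants_near_qtl(variant_pos, qtl_pos, window=2500):
    """Boolean mask: True where a variant lies within `window` bases of any QTL.

    Positions are base-pair coordinates on a single chromosome.
    qtl_pos must contain at least one position.
    """
    variant_pos = np.asarray(variant_pos)
    qtl_sorted = np.sort(np.asarray(qtl_pos))
    # For each variant, find the QTLs immediately to its left and right.
    idx = np.searchsorted(qtl_sorted, variant_pos)
    last = len(qtl_sorted) - 1
    left = qtl_sorted[np.clip(idx - 1, 0, last)]
    right = qtl_sorted[np.clip(idx, 0, last)]
    nearest = np.minimum(np.abs(variant_pos - left), np.abs(variant_pos - right))
    return nearest <= window
```

Sorting the QTL positions once and using a binary search keeps the cost near linear in the number of variants, which matters when screening millions of loci.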
17
REL from 1M, 60K+1M subset, QTLs

                  1 million near QTLs     Include QTLs
Trait    600K     60K+25K    All 1M       60K+10K    10K
1        80.3     85.4       86.7         84.6       87.2
2        80.1     85.3       87.7         84.9       87.7
3        80.4     84.9       86.1         85.0       87.8
4        78.6     83.5       84.8         82.9       85.9
5        81.2     86.0       87.6         85.2       87.5
Avg      80.1     85.0       86.4         84.5       87.2
18
Computation required
Step, Processors, Time (hours), Memory (Gbyte), Disk (Gbyte)
Simulate 30M 15621032
Prune linkage 1012710
Impute 8M 203813220
Predict 1M 52220<1
Select 25K from 8M 300.5<1
19
Conclusions
• Selection of variants requires several steps
• Potential gains from sequence are large
• Still requires very large reference population
• Methods scalable to more sequences
• Bioinformatics can narrow the search
• GWA is scalable to larger data but multiple regression gives higher reliability
20
Current directions
• Evaluate several additional SNP selection criteria including priors for GWA
• Apply methods to real data from 1000 Bulls Project
• Evaluate bovine bioinformatics resources for functional annotation
21
Acknowledgments
• JRO supported by USDA SCA 58-45-14-070-1