Presentation on theme: "GBS & GWAS using the iPlant Discovery Environment"— Presentation transcript:
1 GBS & GWAS using the iPlant Discovery Environment @ Plant & Animal Genome XXI - San Diego, CA
2 How can we determine genotypes using sequencing technology?
Overview: This training module is designed to demonstrate the Genotype by Sequencing Workflow and Genome Wide Association Study using a Mixed Linear Model
Questions:
How can we determine genotypes using sequencing technology?
How can we find genetic variants (e.g. SNPs) associated with a phenotype?
3 Tools for Statistical Genetics in the DE
Tool: Purpose
Genotype by Sequencing Workflow: Automatic pipeline for extracting SNPs from GBS data (with genome from user or from iPlant database)
UNEAK pipeline: Automatic pipeline for extracting SNPs from GBS data without reference genomes
MLM workflow: Automatic workflow for fitting Mixed Linear Model
GLM workflow: Automatic workflow for fitting General Linear Model
QTLC workflow: Automatic workflow for composite interval mapping
QTL simulation workflow: Automatic workflow for simulating trait data with given linkage map
PLINK: PLINK implementation of various association models
Zmapqtl: Interval mapping and composite interval mapping with the options to perform a permutation test
LRmapqtl: Linear regression modeling
SRmapqtl: Stepwise regression modeling
AntEpiSeeker: Epistatic interaction modeling
Random Jungle: Random Forest implementation for GWAS
FaST-LMM: Factored Spectrally Transformed Linear Mixed Modeling
Qxpak: Versatile mixed modeling
gluH2P: Convert Hapmap format to Ped format
LD: Linkage Disequilibrium plot
Structure: Estimation of population structure
PGDSpider: Data conversion tool
GLMstructure: GLM with population structure as fixed effect
4 Elshire et al. PLoS One May 4;6(5):e doi: /journal.pone
5 Genotype By Sequencing
Ed Buckler (Cornell University)
Elshire et al. PLoS One May 4;6(5):e doi: /journal.pone
23 “Genotype By Sequencing Workflow” in DE
Individual steps strung together to run with a single click
Some steps merged to reduce I/O
24 GBS Workflow Output in the DE Final filtered hapmap files in folder “filt”
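A quick sanity check on a filtered hapmap file is the per-site call rate. The sketch below is illustrative only: it assumes the standard tab-delimited hapmap layout with 11 metadata columns before the sample columns, and it assumes missing genotypes are coded as "N". The function name and the example data are made up for this illustration and are not part of the workflow.

```python
HAPMAP_META_COLS = 11  # rs#, alleles, chrom, pos, strand, ... QCcode


def site_call_rates(lines, missing="N"):
    """Return {site name: fraction of samples with a non-missing call}."""
    it = iter(lines)
    header = next(it).rstrip("\n").split("\t")
    n_samples = len(header) - HAPMAP_META_COLS
    if n_samples <= 0:
        raise ValueError("no sample columns found after the metadata columns")
    rates = {}
    for line in it:
        fields = line.rstrip("\n").split("\t")
        if len(fields) != len(header):
            continue  # skip malformed or blank lines
        calls = fields[HAPMAP_META_COLS:]
        called = sum(1 for c in calls if c != missing)
        rates[fields[0]] = called / n_samples
    return rates


# Made-up example: two sites, four samples.
meta = ["rs#", "alleles", "chrom", "pos", "strand", "assembly#", "center",
        "protLSID", "assayLSID", "panelLSID", "QCcode"]
example = [
    "\t".join(meta + ["s1", "s2", "s3", "s4"]),
    "\t".join(["S1_100", "A/G", "1", "100", "+"] + ["NA"] * 6 + ["A", "G", "N", "R"]),
    "\t".join(["S1_200", "C/T", "1", "200", "+"] + ["NA"] * 6 + ["N", "C", "N", "T"]),
]
rates = site_call_rates(example)
```

With a real file, the same function can be given an open file handle instead of a list of lines.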
25 Final Notes on GBS
If you do not have a reference genome:
-- use “UNEAK” (also part of TASSEL)
If your reference genome is not supported by the DE:
-- use “GBS Workflow with user genome”
26 MLM Pipeline for GWAS
Mixed Linear Model alternative to General Linear Model:
Reduces false positives by controlling for population structure
Uses compression to decrease effective sample size
P3D protocol to eliminate need to re-compute variance components
Speeds compute time up to ~7500x faster than GLM
Ed Buckler (Cornell University)
[Figure: TASSEL diagram with labels: marker, trait, filter, convert, impute, K, GLM, MLM]
Zhang et al. Nature Genetics. 2010; doi: /ng.546
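For orientation, the mixed linear model is commonly written in the form below. This is the usual textbook notation rather than anything taken from the slides, and some authors scale the kinship term as 2K instead of K.

```latex
y = X\beta + Zu + e, \qquad
u \sim N\left(0,\ \sigma_g^2 K\right), \qquad
e \sim N\left(0,\ \sigma_e^2 I\right)
```

Here y is the phenotype vector, X\beta collects the fixed effects (the tested marker and the population structure covariates), u is the random polygenic effect whose covariance is proportional to the kinship matrix K, Z maps individuals to those effects, and e is the residual. Setting u aside leaves the General Linear Model, which is why the MLM can control for relatedness that the GLM cannot.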
27 MLM Input Files
Hapmap file
Phenotype data (columns: strain, traits)
Kinship matrix*
Population structure* (columns: strain, then 3 populations whose values sum to 1)
* Kinship matrix & population structure data can be generated using TASSEL or with “MLM Workflow” App in DE
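Two properties of these inputs are easy to check before submitting a run: each strain's population proportions should sum to 1, and a kinship matrix should be square and symmetric. The following is a minimal sketch with hypothetical function names and made-up values; it works on data already loaded into Python and makes no assumption about the exact file layout that TASSEL or the DE expects.

```python
import math


def check_population_structure(q_rows, tol=1e-6):
    """Return the strains whose population proportions do not sum to 1."""
    bad = []
    for strain, proportions in q_rows.items():
        if not math.isclose(sum(proportions), 1.0, abs_tol=tol):
            bad.append(strain)
    return bad


def is_symmetric(matrix, tol=1e-9):
    """True if the matrix is square and equal to its transpose within tol."""
    n = len(matrix)
    if any(len(row) != n for row in matrix):
        return False
    return all(
        abs(matrix[i][j] - matrix[j][i]) <= tol
        for i in range(n)
        for j in range(i + 1, n)
    )


# Made-up values for illustration only.
q = {"strainA": [0.7, 0.2, 0.1], "strainB": [0.5, 0.5, 0.2]}
k = [[2.0, 0.3], [0.3, 2.0]]

bad_strains = check_population_structure(q)  # strainB sums to 1.2
kinship_ok = is_symmetric(k)
```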
28 MLM Output
MLM1.txt
Marker
“df”: degrees of freedom
“F”: F statistic for the test of the marker
“p”: p-value
“errordf”: df used for denominator of F-test
etc.
MLM2.txt
Estimated effect for each allele for each marker
MLM3.txt
The compression results show the likelihood, genetic variance, and error variance for each compression level tested during the optimization process.
See TASSEL manual for details.
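One common next step is to pull out the markers whose p-values pass a multiple-testing threshold. The sketch below assumes the statistics table is tab-delimited with a header row containing "Marker" and "p" columns, as the column names on the slide suggest; the real file may carry additional columns or rows, so treat this as a starting point. The Bonferroni correction is a choice made for this illustration and is not prescribed by the slides.

```python
import csv
import io
import math


def significant_markers(handle, alpha=0.05):
    """Return (marker, p) pairs passing a Bonferroni threshold, smallest p first."""
    tested = []
    for row in csv.DictReader(handle, delimiter="\t"):
        try:
            p = float(row["p"])
            marker = row["Marker"]
        except (KeyError, TypeError, ValueError):
            continue  # row has no usable p-value
        if math.isnan(p):
            continue  # untested marker
        tested.append((marker, p))
    if not tested:
        return []
    threshold = alpha / len(tested)  # Bonferroni: alpha divided by number of tests
    return sorted((mp for mp in tested if mp[1] <= threshold), key=lambda mp: mp[1])


# Made-up table for illustration only.
example = io.StringIO(
    "Marker\tdf\tF\tp\terrordf\n"
    "m1\t2\t25.1\t1e-6\t100\n"
    "m2\t2\t1.2\t0.30\t100\n"
    "m3\t2\t6.0\t0.004\t100\n"
    "m4\t2\tNaN\tNaN\t100\n"
)
hits = significant_markers(example)
```

In this example three markers have usable p-values, so the threshold is 0.05 / 3, and m1 and m3 pass it.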