SolCAP Solanaceae Coordinated Agricultural Project SNP Development for Elite Potato Germplasm David Douches Walter De Jong Robin Buell David Francis John.

1 SolCAP Solanaceae Coordinated Agricultural Project: SNP Development for Elite Potato Germplasm
David Douches, Walter De Jong, Robin Buell, David Francis, John Hamilton, Lukas Mueller, Allen Van Deynze
Funding: USDA/AFRI. This project is supported by the Agriculture and Food Research Initiative Applied Plant Genomics CAP Program of USDA’s National Institute of Food and Agriculture.

2 What is SolCAP?
The SolCAP project is a coordinated agricultural project that links together people from public institutions, private institutions and industries who are dedicated to the improvement of the Solanaceae crops potato and tomato. Through innovative research, education and extension, the SolCAP project will focus on providing significant benefits to both the consumer and the environment. The SolCAP project is supported by the Agriculture and Food Research Initiative Applied Plant Genomics CAP Program of the USDA’s National Institute of Food and Agriculture.

3 SolCAP Project Participants
Lead Institution: Michigan State University
- Minnesota: University of Minnesota
- Wisconsin: USDA/ARS, University of Wisconsin
- Michigan: Michigan State University
- Ohio: Ohio State University
- Oregon: Oregon State University, Cedar Lake Research and Consulting
- Idaho: USDA/ARS, University of Idaho
- California: UC Davis, Campbells R&D
- New York: Cornell University
- Maryland: USDA/ARS Beltsville
- West Virginia: West Virginia State University
- North Carolina: North Carolina State University
- Florida: University of Florida

4 Commercial Solanaceae Production US: $5.38 billion product value (1.6 million acres)

5 Potato Breeding Bottlenecks & Challenges
- Tetraploid genetics
- Narrow genetic base
- Small populations
- Many pests
- Multi-trait evaluation
  – Quality
  – Resistance
  – Agronomy
- Market differentiation

6 Potato Breeding Bottlenecks & Challenges
- Lack of markers in elite germplasm
- Mostly a phenotype-based process
- Market-defining carbohydrate (CHO) traits are difficult to select for at early generation stages.
- The breeder needs to combine market-driven quality with the agronomic performance and host plant resistance needed by growers.

7 The Potato Genome Sequencing Consortium
The Potato Genome Sequencing Consortium (PGSC) has collaborated to sequence the genomes of two species: Solanum tuberosum (RH) and Solanum phureja (DM1-3 516 R44).
First potato genome assembly: http://www.potatogenome.net

8 The Potato Genome Sequencing Consortium
Whole genome shotgun sequencing: hybrid approach using three sequencing technologies
Metrics:
- Genome size: 850 Mb
- V3: 9,171 scaffolds (717.5 Mb) and 58,998 contigs (9.7 Mb)
- N50 scaffold size: 1,318,511 bp
- N90 scaffold size: 253,760 bp
Available at: Potatogenome.net
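For readers unfamiliar with the N50 and N90 figures quoted on this slide, here is a minimal sketch of how such values are conventionally computed from a list of scaffold lengths. The function name `nx` and the toy lengths are illustrative assumptions; they are not PGSC data or PGSC code.

```python
def nx(lengths, fraction):
    """Return the Nx statistic: the length L such that sequences of
    length >= L together hold at least `fraction` of all assembled bases."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    target = fraction * sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= target:
            return length


# Hypothetical toy scaffold lengths in bp (NOT the PGSC assembly).
toy_scaffolds = [100, 80, 60, 40, 20]
n50 = nx(toy_scaffolds, 0.5)
n90 = nx(toy_scaffolds, 0.9)
print(n50, n90)
```

A larger N50 means more of the assembly sits in long scaffolds; N90 is always less than or equal to N50 for the same assembly.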

9 Annotating the Potato Genome
- Identified genes
- Sequenced transcriptome from 29 different DM tissues
- Currently analyzing the genes and their expression

10 In Solanaceae There is a Major Gap Between Genomic Information and Breeding
- Potato breeding is based upon phenotypes, not genotypes, even though the genome is being sequenced.
- Marker assisted breeding (MAB) is not widely practiced due to a lack of genetic markers linked to traits of interest.
- SolCAP is providing a translational genomics strategy.

11 Primary Research Objective
To reduce the gap between genomics and breeding, SolCAP will provide infrastructure to link allelic variation of SNPs in genes to valuable traits.
  – Identify up to 10,000 SNPs for potato in elite germplasm
  – Combine eSNPs with Illumina sequence-identified SNPs
  – 75% of the SNPs distributed throughout the genome
  – 25% of the SNPs targeted to candidate genes and genetic markers
  – Genotype germplasm panels and mapping populations with the Illumina Infinium platform

12 Plan of Work
- Develop extensive sequence data of expressed genes, and identify SNP markers associated with candidate genes for CHO and vitamin biosynthetic pathways.
- Collect standardized phenotypic data of the panel and 4x mapping population across multiple environments for potato.
- Address regional, individual program and emerging needs through a small grants program that supports SNP genotyping of additional mapping populations.
- Create integrated, breeder-focused resources for genotypic and phenotypic analysis by leveraging existing databases and resources at SGN and MSU.

13 Potato SNP Marker Discovery
SolCAP SNP Analysis
- From the existing sequence databases, we have identified 7,700 potato and 5,200 tomato sequences with candidate SNPs; these are being further validated using computational approaches.
- cDNA libraries for sequencing of potato and tomato using the Illumina Genome Analyzer
- Genotype germplasm panels of 480 and mapping populations with Illumina, Luminex or Infinium
  – 75% of the SNPs assayed will be random throughout the genome
  – 25% of the SNPs assayed will be targeted to high-value traits
1. Existing eSNPs from Kennebec, Bintje and Shepody ESTs
2. New Illumina GAII sequencer-identified SNPs from important processing cultivar transcriptomes:
  – Atlantic: high-solids chip processor
  – Snowden: low reducing sugar storage chip processor
  – Premier Russet: low reducing sugar; frozen processing

14 In Silico Sanger Identified SNPs (eSNPs)
Metric | Potato | Tomato
Total # of Transcript Assemblies | 70,344 | 48,945
Total bp length of Transcript Assemblies | 49,859,202 | 33,916,704
Total # of Transcript Assemblies w/ putative SNPs | 7,722 | 5,198
Total bp length of Transcript Assemblies w/ SNPs | 8,872,526 | 6,347,780
Total # of putative SNP positions | 57,705 | 16,531

15 Sanger-derived Potato eSNPs - Intra-varietal and inter-varietal - Bulk of sequence data from ESTs - http://solanaceae.plantbiology.msu.edu/analyses_snp.php

16 cDNA Libraries for Sequencing Using Illumina Genome Analyzer II
Potato varieties: Snowden, Atlantic, Premier Russet
Tissues: tuber, leaf, flower, callus
- Isolate RNA from these 4 tissues
- Pool in equimolar amounts
- Construct normalized cDNA to reduce representation of abundant transcripts

17 SNP Workflow
- Library creation/QC
- GAII sequencing (single and paired end)
- Data collection
- Analysis: transcriptome complexity
- SNP calling/validation
- Assembly

18 Data Analysis of Illumina cDNA Reads: Potato
Sample | Total Clusters | Total Reads | PF Passed Clusters | % PF Passed Clusters | Total PF Reads | Actual Reads
Atlantic 1 | 7,601,277 | 15,202,554 | 6,382,748 | 83.97 | 12,765,496 |
Atlantic 2 | 10,544,542 | 21,089,084 | 9,252,168 | 87.74 | 18,504,336 | 30,185,186
Premier 1 | 7,812,394 | 15,624,788 | 6,652,121 | 85.15 | 13,304,242 |
Premier 2 | 11,678,379 | 23,356,758 | 9,999,926 | 85.63 | 19,999,852 | 31,949,096
Snowden 1 | 7,996,418 | 15,992,836 | 6,837,553 | 85.51 | 13,675,106 |
Snowden 2 | 11,781,671 | 23,563,342 | 10,393,322 | 88.22 | 20,786,644 | 33,288,120
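The columns of this table are tied together by simple arithmetic: the PF percentage is the share of clusters passing filter, and each reads column is twice the matching clusters column, which is consistent with two reads per cluster. The sketch below reproduces the Atlantic 1 row; the helper name `pf_summary` and the `reads_per_cluster=2` default are assumptions made for illustration.

```python
def pf_summary(total_clusters, pf_clusters, reads_per_cluster=2):
    """Return (% of clusters passing filter, total PF reads)."""
    pct_pf = round(100 * pf_clusters / total_clusters, 2)
    pf_reads = pf_clusters * reads_per_cluster
    return pct_pf, pf_reads


# Atlantic 1 values taken from the slide.
atlantic_1 = pf_summary(total_clusters=7_601_277, pf_clusters=6_382_748)
print(atlantic_1)
```

The same call with the other rows' cluster counts should return the percentages and PF read totals shown for Premier and Snowden.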

19 De Novo Velvet Assemblies of Potato Illumina Sequences
Minimum contig length of 150 bp:
Variety | Total Gb | Transcriptome Size (Mb) | No. Contigs | N50 (bp) | Maximum Contig (Kb)
Atlantic | 1.8 | 38.4 | 45215 | 1192 | 11.2
Premier | 1.9 | 38.2 | 54917 | 826 | 6.6
Snowden | 2.0 | 38.2 | 58754 | 775 | 6.9

20 Velvet Assemblies of Potato Illumina Sequences
Alignment of S. tuberosum GAII-transcriptome contigs to the PGSC draft genome sequence from DM1-3 516 R44:
- Atlantic: 45214 contigs
  – 32520 align with GMAP (95% id, 50% cov)
  – 27106 align with GMAP (95% id, 90% cov)
- Premier: 54917 contigs
  – 41497 align with GMAP (95% id, 50% cov)
  – 37297 align with GMAP (95% id, 90% cov)
- Snowden: 58754 contigs
  – 44479 align with GMAP (95% id, 50% cov)
  – 40708 align with GMAP (95% id, 90% cov)

21 Identify intra-varietal SNPs
Query | SNPs | Filtered SNPs
Atlantic | 224748 | 150669
Premier | 265673 | 181800
Snowden | 258872 | 166253
[Figure: example A/C SNP]

22 Filtered SNP counts
Filtering on SNP quality and 1 SNP per 150 bp window

23 Design SNPs for the Illumina Infinium Platform
The final 10K SNP array content was selected from 69,011 SNPs that pass the filtering and design criteria for the Infinium® platform, using the following criteria:
- Read depth: 20 reads min, 255 reads max
- Biallelic based on all available sequence
- Within exons (mapped to the DM1-3 draft genome sequence); specifically, at least 50 bp from an exon/intron junction
- Max 1 SNP within 50 bp of candidate SNP
- Preferred SNPs that were intervarietal
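The design criteria on this slide can be read as a per-SNP filter. The sketch below is one possible interpretation, not the SolCAP pipeline: the `Snp` record, its field names, the toy data, and the reading of "max 1 SNP within 50 bp" as "at most one other candidate SNP within 50 bp on the same scaffold" are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Snp:
    scaffold: str
    pos: int               # position on the scaffold (bp)
    depth: int             # read depth at the SNP
    n_alleles: int         # number of distinct alleles observed
    dist_to_junction: int  # bp to the nearest exon/intron junction


def passes_design(snp, all_snps, min_depth=20, max_depth=255,
                  flank=50, max_neighbors=1):
    """Apply the slide's criteria to one candidate SNP (assumed reading)."""
    if not (min_depth <= snp.depth <= max_depth):
        return False
    if snp.n_alleles != 2:            # must be biallelic
        return False
    if snp.dist_to_junction < flank:  # too close to an exon/intron junction
        return False
    # Count other candidate SNPs within `flank` bp on the same scaffold.
    neighbors = sum(
        1 for other in all_snps
        if other is not snp
        and other.scaffold == snp.scaffold
        and abs(other.pos - snp.pos) <= flank
    )
    return neighbors <= max_neighbors


# Hypothetical candidates, for illustration only.
toy_snps = [
    Snp("scaf1", 100, 30, 2, 80),
    Snp("scaf1", 120, 45, 2, 100),
    Snp("scaf1", 400, 60, 2, 75),
    Snp("scaf1", 500, 10, 2, 200),    # fails: depth below 20
    Snp("scaf2", 300, 40, 3, 90),     # fails: not biallelic
    Snp("scaf2", 900, 40, 2, 30),     # fails: 30 bp from a junction
    Snp("scaf2", 1500, 300, 2, 120),  # fails: depth above 255
]
selected = [s.pos for s in toy_snps if passes_design(s, toy_snps)]
print(selected)
```

The stated preference for intervarietal SNPs is a ranking rule rather than a hard filter, so it is left out of this sketch.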

24 Candidate Genes For Genotyping
- 2009/10: a community call for genes to be placed on the potato and tomato platforms (assuming SNPs could be designed)
- Strong response from the community: web page submissions, direct solicitations, email solicitations
- ~1800 sequences were identified by project personnel and the community for this targeted SNP discovery (note: this count includes redundant sequences)
- In potato, > 700 candidate genes have a SNP that passes our filtering criteria

25 SNPs found in candidate genes
- 1065 candidates with no SNPs
- 160 candidates with 1 SNP
- 135 candidates with 2 SNPs
- 102 candidates with 3 SNPs
- 100 candidates with 4 SNPs
- 48 candidates with 5 SNPs
- 175 candidates with 6-10 SNPs
- 54 candidates with 11-31 SNPs
We want up to 5 SNPs per candidate gene.
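As a quick cross-check of this slide against the "~1800 sequences" and "> 700 candidate genes" figures on the previous slide, the bin counts can be tallied as follows. The dictionary keys and variable names are illustrative assumptions; the counts are those shown on the slide.

```python
# Candidates per SNP-count bin, copied from the slide.
candidates_by_snp_count = {
    "0": 1065, "1": 160, "2": 135, "3": 102,
    "4": 100, "5": 48, "6-10": 175, "11-31": 54,
}

total_candidates = sum(candidates_by_snp_count.values())
with_at_least_one_snp = total_candidates - candidates_by_snp_count["0"]
share_with_snp = round(100 * with_at_least_one_snp / total_candidates, 1)
print(total_candidates, with_at_least_one_snp, share_with_snp)
```

Under this tally, the candidates with at least one SNP number in the 700s, in line with the "> 700" statement.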

26 SNPs in some key candidate genes
Gene | SNPs
Sucrose-phosphate-synthase | 20
Soluble starch synthase 3, chloroplastic/amyloplastic | 18
Acid invertase | 16
Granule-bound starch synthase 2, chloroplastic/amyloplastic | 10
Glucose-6-phosphate isomerase | 10
Sucrose synthase | 10
Isoamylase isoform 2 | 8
Sucrose transporter | 8
Beta-amylase | 6
Sucrose synthase | 6
Granule-bound starch synthase 1 | 6
Phosphoglucomutase | 6

27 Spacing and gene region coverage
We expect approximately 25% of the SNPs will be mapped to candidate genes, 10% to SNPs from known genetic markers, and 65% to genes distributed across scaffolds, primarily those anchored to the DM1-3 516R44 S. phureja draft genome.
- 2769 SNPs in candidate genes
- 508 SNPs in genetic markers
- 6723 SNPs will come from throughout the genome
How much of the genome is represented? ~650 Mb of the genome will be covered (~850 Mb genome)
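The counts on this slide can be turned into shares of the array and compared with the expected 25% / 10% / 65% split. This is a small arithmetic sketch using only the slide's numbers; the variable names are assumptions.

```python
# SNP counts per category, copied from the slide.
array_content = {
    "candidate genes": 2769,
    "genetic markers": 508,
    "genome-wide": 6723,
}

total_snps = sum(array_content.values())
shares = {name: round(100 * n / total_snps, 1)
          for name, n in array_content.items()}

# Fraction of the ~850 Mb genome covered by the ~650 Mb quoted on the slide.
genome_coverage_pct = round(100 * 650 / 850, 1)
print(total_snps, shares, genome_coverage_pct)
```

On these numbers the genetic-marker share comes out lower than the expected 10%, with the difference absorbed by the other two categories.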

28 Validation
High Resolution Melting
- Tested: 48 primers
- Validation (75%)
- Problems with technical replicates
GoldenGate Bead Express
- 96 x 480 samples
- Selected 32 SNPs total per variety (96 total)
- Validation rate ~85%

29 Illumina Output: Good

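30 SNP Validation: Dosage calls
Call | Premier | Atlantic | Snowden | Rio Grande
BBBB | 14 | 6 | 7 | 13
AAAA | 17 | 5 | 6 | 21
hets | 3 | 5 | 3 |
BBBA | 13 | 30 | 26 |
AABB | 21 | 31 | 28 |
AAAB | 20 | 11 | 19 |
Total hets | 57 | 77 | 76 | 59
no data | 6 | 6 | 6 |
nulliplex | 31 | 11 | 13 |
simplex | 33 | 41 | 45 |
duplex | 21 | 31 | 28 |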
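The nulliplex, simplex and duplex rows on this slide follow from the five genotype classes: a tetraploid call with zero, one or two copies of the less frequent allele. The sketch below shows that mapping and re-derives the Premier totals; the function name `dosage_class` is an assumption, and the Premier counts are as read from the flattened table on this slide.

```python
from collections import Counter

DOSAGE_NAMES = {0: "nulliplex", 1: "simplex", 2: "duplex"}


def dosage_class(genotype):
    """Classify a 4-allele A/B genotype string such as 'AAAB'."""
    if len(genotype) != 4 or set(genotype) - {"A", "B"}:
        raise ValueError(f"expected four A/B alleles, got {genotype!r}")
    # Copies of whichever allele is rarer in this genotype.
    minor_copies = min(genotype.count("A"), genotype.count("B"))
    return DOSAGE_NAMES[minor_copies]


# Premier genotype-class counts as read from the slide.
premier_calls = {"BBBB": 14, "AAAA": 17, "BBBA": 13, "AABB": 21, "AAAB": 20}

premier_classes = Counter()
for genotype, n_snps in premier_calls.items():
    premier_classes[dosage_class(genotype)] += n_snps
print(dict(premier_classes))
```

The resulting nulliplex, simplex and duplex totals can be compared with the Premier entries in the bottom three rows of the slide.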

31 SNP segregation in 4x Russet Mapping population (Premier x Rio Grande)
Cross | SNPs | Reciprocal crosses combined
Duplex x Duplex | 10 |
Duplex x Nulliplex | 2 | 4
Duplex x Simplex | 11 | 22
Nulliplex x Duplex | 2 |
Nulliplex x Nulliplex | 28 |
Nulliplex x Simplex | 6 | 14
Simplex x Duplex | 11 |
Simplex x Nulliplex | 8 |
Simplex x Simplex | 11 |
Simplex x Het | 5 | 5
Total | 94 |
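To see why the parental dosage configuration matters for mapping, the expected offspring dosage classes for a tetraploid cross can be enumerated. The sketch below assumes simple tetrasomic inheritance with random chromosome pairing and no double reduction; the slides do not state which model SolCAP used, and the function names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def gamete_distribution(dosage):
    """Probability of 0, 1 or 2 copies of allele B in a diploid gamete
    from a tetraploid parent carrying `dosage` copies of B (0 to 4)."""
    alleles = [1] * dosage + [0] * (4 - dosage)
    # Each gamete receives 2 of the 4 homologs: 6 equally likely pairs.
    counts = Counter(a + b for a, b in combinations(alleles, 2))
    return {copies: Fraction(n, 6) for copies, n in counts.items()}


def offspring_distribution(dosage_1, dosage_2):
    """Expected distribution of B-allele dosage among the progeny."""
    result = Counter()
    for g1, p1 in gamete_distribution(dosage_1).items():
        for g2, p2 in gamete_distribution(dosage_2).items():
            result[g1 + g2] += p1 * p2
    return dict(result)


simplex_x_nulliplex = offspring_distribution(1, 0)
duplex_x_nulliplex = offspring_distribution(2, 0)
print(simplex_x_nulliplex, duplex_x_nulliplex)
```

Under this model a simplex x nulliplex SNP segregates 1:1 for presence of the allele, while a duplex x nulliplex SNP gives three dosage classes in the progeny, so it can only be fully resolved if dosage is scored.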

32 Pair-wise Comparison of SNPs
Cross | Ploidy | Non-segregating SNPs (%) | Segregating SNPs (%), type I | Segregating SNPs (%), type II
W2310-3 x Kalkaska | 4X | 22.4 | 37.6 | 40.0
MSG227-2 x Jacqueline Lee | 4X | 16.5 | 51.8 | 31.8
Atlantic x Superior | 4X | 5.9 | 51.8 | 42.4
Stirling x 12601ad1 | 4X | 25.9 | 37.6 | 36.5
B1829-5 x Atlantic | 4X | 11.5 | 18.8 | 69.8
BER 63 x DM1-3 | 2X | 79.3 | 20.7 | 0
BER 83 x DM1-3 | 2X | 78.8 | 21.2 | 0
84SD22 x DM1-3 | 2X | 46.0 | 54.0 | 0
MCR205 x DM1-3 | 2X | 76.7 | 23.3 | 0
DI x DM1-3 | 2X | 85 | 15 | 0
08675-21 x 09901-01 | 2X | 53.8 | 46.2 | 0
RH x SH | 2X | 59 | 41 | 0
Type I = segregation not dependent on scoring dosage; type II = segregation dependent on scoring dosage

33 % Heterozygosity: 96 SNPs x 96 Potato lines
[Figure: percent heterozygosity by clone]

34 Potato Panel SNP Heterozygosity

35 SNP Heterozygosity Extremes
Tetraploids, 80-90% heterozygosity: All Red 81.2; CF77154-1 83.5; CO95051-5W 84.7; Snowden 84.7; Atlantic 85.9; CO97215-2P/P 85.9
Tetraploids, <50% heterozygosity: Chunshu No4 27.1; P1 28.2; MSL512-6 44.7; Inca Gold 47.1; P2 48.2; NDSU clone 4 48.2
Diploids, 50-60% heterozygosity: C5 52.9; 84SD22 56.7; MCD500036 69.4
Diploids, 1-10% heterozygosity: DM1-3 0; ber265857 5.9; CMM6-3 7.8; CMM 1T 8.2; CMM243503 8.2
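A percent-heterozygosity value of the kind listed here can be computed per clone as the share of called SNPs whose genotype contains both alleles. This is a minimal sketch under two assumptions that the slides do not confirm: missing calls are excluded from the denominator, and any mixed genotype (simplex or duplex) counts as heterozygous. The function name and toy calls are hypothetical.

```python
def percent_heterozygosity(calls):
    """Percent of called SNPs that are heterozygous for one clone.

    `calls` holds genotype strings such as 'AAAB' (tetraploid) or 'AB'
    (diploid); None marks a SNP with no data.
    """
    called = [c for c in calls if c]
    if not called:
        raise ValueError("no genotype calls available")
    hets = sum(1 for c in called if len(set(c)) > 1)
    return 100 * hets / len(called)


# Hypothetical calls for one tetraploid clone (not SolCAP data).
toy_clone = ["AAAB", "AAAA", "AABB", None, "BBBB"]
toy_pct = percent_heterozygosity(toy_clone)
print(toy_pct)
```

A fully homozygous line such as DM1-3 would score 0 under this definition, matching the value shown for it on the slide.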

36 Potato Germplasm Panel
- Panel structure (350 clones)
  – Top 50 N. American varieties
  – Historical varieties
  – Advanced US breeding lines
  – Non-US germplasm
  – Genetic stocks
- Population analyses
  – Association mapping
  – Historical relationship
  – Hypothesis testing for trait associations
  – Parental selection
  – Resolve population structure
- Phenotypic screening for additional traits outside of SolCAP
- Phenotypic evaluation
  – Key traits: specific gravity, sucrose, glucose, Vitamin C, maturity, tuber shape, tuber number, etc.
  – Additional traits determined by breeding community
  – Data curated at SGN

37 SNP comparison across potato germplasm panel: resolving population structure
- MSU Breeding Program varieties
- Group Phureja clones cluster separately from elite germplasm
- Wild species cluster separately from Phureja and Tuberosum

38 SNP Genotyping Consortium
- Potato 10K (~9100 SNPs) Illumina Infinium chip: a core set of SNPs in standard germplasm panels in tomato and potato.
- Over 3000 genotyping samples were ordered.
- The Consortium’s efforts secured a 24% discount per sample beyond what would have been possible with one contributor ($85/sample).
- The barrier to entry for many institutions was lowered, as they were able to access this tool with only a 48-sample commitment.
- Illumina saw orders from each of the three major world regions.
- More SNPs?

39 SolCAP SNP Genotyping
~9100 SNPs for elite potato germplasm
2010 SolCAP Goal: 1,152 potato x 9,100 SNPs
- Potato germplasm panel: 350
- 4x russet mapping population: 200
- 2x mapping population: 160
- Community SNP genotyping: 2 populations: 350

40 What makes up the Potato Germplasm Panel Phenotypic Evaluation?
- Clonal Study (CS)
  – 250 clones
  – 2 reps x 10 hills
  – OR, WI, NY
- Russet Mapping Population (MP)
  – Rio Grande x Premier Russet
  – 200 progeny
  – 2 reps x 10 hills
  – ID, NC, MN
[Map legend: CS and MP sites; states in blue = participants in SolCAP]

41 Potato Germplasm Panel
- To be field tested 2 years x 3 major environments for potato production.
- Evaluation of specific gravity, glucose and sucrose, chip color, skin type, shape, vine maturity, tuber number, tuber shape, vitamin C, internal defects, bruising, anthocyanins and biotic resistances.

42 Genotyping the core collections will impact strategies for translation
- Potential translational approaches:
  – 1) introgression from other populations (domesticated or wild)
  – 2) selection for coupling phase recombinants to establish linkage blocks of favorable alleles (e.g. disease resistance loci)
  – 3) population development designed to maximize variation within market classes
  – 4) association approaches
  – 5) whole genome approaches
- Other translational strategies will emerge under other CAPs or through innovation in public research.

43 Russet 4x Mapping Population
- Evaluate russet mapping population traits (Yencho, Novy, Sowokinos, Thill, Gupta, Haynes) (2009-2011)
  – Key traits: specific gravity, sucrose, glucose, Vitamin C, maturity, tuber shape, tuber number, etc.
- Genetic Mapping (Van Deynze, De Jong, Douches)
  – Genotyping 9100 SNPs
- QTL Analysis (Haynes)
  – Identify markers associated with key traits
- MAS/MAB (Marker Assisted Selection / Breeding)
  – Validation of QTL in additional mapping populations
  – Use markers in new breeding populations

44 Databases and Resources
Integrated, breeder-focused resources for genotypic and phenotypic analysis at SGN and MSU.
  – http://solcap.msu.edu
  – http://solanaceae.plantbiology.msu.edu/
  – http://solgenomics.net/

45 SolCAP Education and Extension Objectives
- Team-taught distance-learning graduate-level course in translational genomics at Cornell University
- Yearly workshops for breeders to integrate genotype-based breeding strategies with elite germplasm
- Use eXtension.org to develop a Community of Practice for plant breeders, called Plant Breeding and Genomics, across all CAPs (Barley, Wheat, Conifer, RosBreed, Bean, Onion)

46 SolCAP PAA Workshop
August 15, 2010, Corvallis, Oregon
Hands-on computer lab format
Topics
  – Potato genome analysis: Robin Buell
  – Tetraploid QTL analysis: Christine Hackett
  – Use of Illumina GenomeStudio: Allen Van Deynze

47 PB&GWorks Web community
SolCAP has created PBGworks, a web community within eXtension.org. Plant breeders, basic scientists, seed industry professionals, agricultural professionals, extension specialists and others can publish content and network.
Target audience: the practicing plant breeder.
Our long-term goal is to provide:
- Start-to-finish examples of marker-assisted selection applications
- Resource pages including protocols, software tutorials, and up-to-date contact information for companies offering genetic services
- Improved access to genetic resources through the "breeder's toolbox"
http://pbgworks.hort.oregonstate.edu/

48 Potato SNP Summary
- In silico Sanger eSNPs: 57,705 potato eSNPs
- ~75,000 potato SNPs from 5.7 Gb of GAII transcriptome sequence (69,011 SNPs passed Infinium design)
- ~650 Mb of the genome will be covered by SNPs
- Validation suggests SNPs can be called in broader germplasm
- Dosage reads of SNPs will optimize SNP genotyping of 4x mapping populations
- The reference sequence of DM1-3 516R44 is permitting bioinformatic optimization of pipelines rather than relying on empirical validation.

49 Germplasm Panel SNP Genotyping
- SSR-based genetic map
  – 2 years
  – 200 markers (17 markers/chromosome)
  – $5/data point
  – Not dense enough for 4x mapping
  – Markers may be linked to traits
- SNP-based genetic map
  – < 1 week
  – 9,100 markers (>700 markers/chromosome)
  – < 2 ¢/data point
  – Dense enough for 4x mapping
  – Markers are in genes
  – Markers robust enough for broader germplasm
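The per-chromosome densities and the per-data-point cost on this slide can be checked with a few lines of arithmetic. The sketch assumes the 12 chromosomes of the potato base set (x = 12) and takes the $85/sample figure quoted on the SNP Genotyping Consortium slide at face value as the price of one 9,100-SNP assay; both the helper name and that pricing interpretation are assumptions.

```python
POTATO_BASE_CHROMOSOMES = 12  # x = 12 for potato


def markers_per_chromosome(n_markers, n_chromosomes=POTATO_BASE_CHROMOSOMES):
    """Average number of markers per chromosome."""
    return n_markers / n_chromosomes


ssr_density = markers_per_chromosome(200)    # SSR-based map
snp_density = markers_per_chromosome(9100)   # SNP-based map

# Assumed: $85 buys one sample genotyped at ~9,100 SNPs.
snp_cost_per_data_point = 85 / 9100
print(round(ssr_density), round(snp_density), round(snp_cost_per_data_point, 4))
```

Under these assumptions the SNP array comes out at roughly one cent per data point, compared with the $5 per data point quoted for SSRs.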

50 Outcomes for Breeding from SolCAP
- A genome-wide set of markers and bioinformatic tools accessible by breeders
  – Breeders will access germplasm for crossing based upon SNP polymorphism and linked QTL of interest
  – Design crosses complementary for QTL and traits, and then use MAB in early generation selection.

51 Outcomes for Breeding from SolCAP
- Better understanding of the allelic variation influencing CHOs
  – Design crosses to create improved sugar and starch levels and starch quality.
  – Crosses designed to manipulate and select variation within existing elite populations or introgress novel alleles from wild germplasm.
  – More predictable and directed breeding effort for processing and fresh market traits.

52 SolCAP Acknowledgments
- Collaborators, MSU: David Douches, C Robin Buell, John Hamilton, Kelly Zarka
- Collaborators, OSU: David Francis, Matt Robbins, Sung-Chur Sim, Troy Aldrich
- Collaborators, Cornell: Walter De Jong, Lukas Mueller, Joyce van Eck
- Collaborators, UCD: Allen Van Deynze, Kevin Stoffel, Alex Kozic, Jeanette Martins
- Collaborators, Oregon State: Alex Stone, John McQueen, Roger Leigh
- Others: Michael Coe, Sanwen Huang
Funding: USDA/AFRI. This project is supported by the Agriculture and Food Research Initiative Applied Plant Genomics CAP Program of USDA’s National Institute of Food and Agriculture.

53 Acknowledgments: PGSC
BGI-Shenzhen, China (Sanwen Huang, Ruiqiang Li, Xun Xu, Wei Fan, Peixiang Ni, Hongmei Zhu, Desheng Mu, Bicheng Yang, Jian Wang and Jun Wang); Center Bioengineering RAS, Russia (Boris Kuznetsov); Central Potato Research Institute, India (Swarup Chakrabarti, V.U. Patil, Shashi Rawat and S.K. Pandey); Chinese Academy of Agricultural Sciences, China (Sanwen Huang, Zhonghua Zhang and Dongyu Qu); University of Dundee, United Kingdom (Dan Bolser and David Martin); ENEA, Italian National Agency for New Technologies, Energy and the Environment, Italy (Giovanni Giuliano and Gaetano Perrotta); Imperial College London, United Kingdom (Gerard Bishop); International Potato Center (CIP), Peru (Merideth Bonierbale, Marc Ghislain and Reinhard Simon); Institute of Biochemistry and Biophysics (PAS), Poland (Wlodzimierz Zagorski, Jacek Hennig, Pawel Szczesny, Piotr Zielenkiewicz and Robert Gromadka); Instituto Nacional de Tecnología Agropecuaria (INTA), Argentina (Gabriela Massa, Leandro Barreiro and Sergio Feingold); Instituto de Investigaciones Agropecuarias (INIA), Chile (Boris Sagredo, Alex Di Genova and Nilo Mejía); Michigan State University, USA (Robin Buell, David Douches, Steven Lundback, Alicia Massa, and Brett Whitty); New Zealand Institute for Plant & Food Research, New Zealand (Jeanne Jacobs, Mark Fiers and Susan Thomson); Scottish Crop Research Institute, United Kingdom (Glenn Bryan, David Marshall, Robbie Waugh and Sanjeev Kumar Sharma); Teagasc Agriculture and Food Development Authority, Ireland (Dan Milbourne, Istvan Nagy and Marialaura Destefanis); Universidad Peruana Cayetano Heredia, Peru (Gisella Orjeda, Frank Guzman, Michael Torres, Tomas Miranda, German de la Cruz, Roberto Lozano and Olga Ponce); University of Wisconsin, USA (Jiming Jiang and Marina Iovene); Virginia Polytechnic Institute & State University, USA (Richard E. Veilleux); Wageningen University, The Netherlands (Bas te Lintel Hekkert, Christian Bachem, Erwin Datema, Jan de Boer, Richard Visser, Roeland van Ham, Theo Borm and Xiaomin Tang)
Funding at MSU for potato genomics: National Science Foundation

54 Visit us at http://solcap.msu.edu/

55 EXTRAS

56 What is a SNP?
Single-nucleotide polymorphism (SNP, pronounced snip)
- A SNP is a DNA sequence variation occurring when a single nucleotide (A, T, C, or G) in the genome differs between members of a species.
- SNPs may fall within coding sequences of genes, non-coding regions of genes, or in the intergenic regions between genes.
- SNPs within a coding sequence may or may not change the amino acid sequence of the protein that is produced.
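As a concrete illustration of this definition, the sketch below lists the positions at which two already-aligned sequences of equal length differ by a single base. The function name and the toy sequences are hypothetical; real SNP calling from sequencing reads also has to handle alignment, read depth and sequencing error, which this sketch ignores.

```python
def snp_positions(seq_a, seq_b):
    """Return (index, base_a, base_b) for each single-base difference
    between two aligned sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    bases = set("ACGT")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper()))
        if a != b and a in bases and b in bases
    ]


# Hypothetical aligned fragments from two individuals.
toy_snps_found = snp_positions("ATGCCGTA", "ATGACGTA")
print(toy_snps_found)
```

Positions are reported 0-based here, so the single difference in the toy pair is at index 3 (a C/A SNP).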

57 Hawkeye Viewer – Visualizing SNPs G/T SNP

