1 Comparative analyses of the potato and tomato transcriptomes David Francis, Allen Van Deynze, John Hamilton, Walter De Jong, David Douches, Sanwen Huang,

1 Comparative analyses of the potato and tomato transcriptomes
David Francis, Allen Van Deynze, John Hamilton, Walter De Jong, David Douches, Sanwen Huang, and C. Robin Buell
Supported by the AFRI Plant Breeding, Genetics, and Genomics Program of USDA’s National Institute of Food and Agriculture

2 Questions
International Sol Project: How can a common set of genes/proteins give rise to such a wide range of morphologically and ecologically distinct organisms?
SolCAP: How can variation be harnessed to improve varieties that benefit the consumer, processors, and the environment?
Sequence data available to address these questions:
- S. phureja draft genome sequence
- S. tuberosum, S. lycopersicum, S. pimpinellifolium GAII transcriptomes
Technology:
- Next Generation Sequencing
- SNP genotyping

3 What comparisons do we want to make?
- How well do S. tuberosum expressed sequences align to S. phureja genomic sequences?
- How well do S. lycopersicum expressed sequences align to S. phureja genomic sequences?
- How is variation distributed within a species? Within a market class? Within a variety? Within a gene?
- Which sequence variation is important to phenotypic variation?

4 Data Collection: library creation/QC; GAII sequencing (single and paired end)
Assembly
Analysis: transcriptome complexity; SNP calling/validation; identification of genes under selection

5 Illumina GA II Output for Potato
Sample      Total Clusters  Total PE Reads  PF Passed Clusters  % PF Passed Clusters  Total PE PF Reads  Actual PE Reads
Atlantic 1       7,601,277      15,202,554           6,382,748                 83.97         12,765,496                -
Atlantic 2      10,544,542      21,089,084           9,252,168                 87.74         18,504,336       30,185,186
Premier 1        7,812,394      15,624,788           6,652,121                 85.15         13,304,242                -
Premier 2       11,678,379      23,356,758           9,999,926                 85.63         19,999,852       31,949,096
Snowden 1        7,996,418      15,992,836           6,837,553                 85.51         13,675,106                -
Snowden 2       11,781,671      23,563,342          10,393,322                 88.22         20,786,644       33,288,120
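
The derived columns in this table can be checked from the cluster counts. The sketch below assumes that "% PF Passed Clusters" is PF clusters divided by total clusters, and that each paired-end cluster yields two reads; both assumptions are consistent with the numbers on the slide, but the slide itself does not state the formulas. The names `RUNS`, `percent_pf`, and `paired_end_reads` are illustrative only.

```python
# Cluster counts copied from the slide:
# (sample, total clusters, PF-passed clusters, % PF as reported on the slide)
RUNS = [
    ("Atlantic 1", 7_601_277, 6_382_748, 83.97),
    ("Atlantic 2", 10_544_542, 9_252_168, 87.74),
    ("Premier 1", 7_812_394, 6_652_121, 85.15),
    ("Premier 2", 11_678_379, 9_999_926, 85.63),
    ("Snowden 1", 7_996_418, 6_837_553, 85.51),
    ("Snowden 2", 11_781_671, 10_393_322, 88.22),
]


def percent_pf(total_clusters, pf_clusters):
    """Percentage of clusters that passed the filter (assumed definition)."""
    return 100.0 * pf_clusters / total_clusters


def paired_end_reads(clusters):
    """Two reads per cluster for a paired-end run (assumed definition)."""
    return 2 * clusters


for sample, total, pf, reported in RUNS:
    print(
        f"{sample}: {percent_pf(total, pf):.2f}% PF (slide: {reported}), "
        f"{paired_end_reads(pf):,} PE PF reads"
    )
```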

6 Velvet Assemblies of Potato Illumina Sequences
With a minimum k-mer of 31 and a minimum contig length of 150 bp:
Variety   Total Gb  Transcriptome Size (Mb)  No. Contigs  N50 (bp)  Maximum Contig (Kb)
Atlantic       1.8                     38.4        45215       666                 11.2
Premier        1.9                     38.2        54917       408                  6.6
Snowden        2.0                     38.2        58754       358                  6.9
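
For readers unfamiliar with the N50 column: N50 is the length of the contig at which the running total, taken from the longest contig downward, first reaches half of the total assembled length. The sketch below shows the usual calculation. The `min_length` parameter mirrors the 150 bp minimum contig length on the slide, and the contig lengths in the usage line are made up for illustration, not taken from these assemblies.

```python
def n50(contig_lengths, min_length=150):
    """Return the N50 of the contigs that are at least `min_length` bp long."""
    lengths = sorted(
        (n for n in contig_lengths if n >= min_length), reverse=True
    )
    if not lengths:
        raise ValueError("no contigs at or above the minimum length")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running sum to half the total is the N50.
        if running >= half_total:
            return length


# Hypothetical contig lengths (bp); the 100 bp contig is dropped by the filter.
print(n50([100, 150, 200, 400, 1000]))
```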

7 Velvet Assemblies of Potato Illumina Sequences
Alignment of the S. tuberosum GAII-transcriptome contigs to the PGSC draft genome sequence from S. phureja:
Atlantic: 45214 contigs
- 32520 align with GMAP (95% id, 50% cov)
- 27106 align with GMAP (95% id, 90% cov)
Premier: 54917 contigs
- 41497 align with GMAP (95% id, 50% cov)
- 37297 align with GMAP (95% id, 90% cov)
Snowden: 58754 contigs
- 44479 align with GMAP (95% id, 50% cov)
- 40708 align with GMAP (95% id, 90% cov)
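
The alignment counts are easier to compare as fractions of each assembly. A minimal sketch that converts the counts on this slide to percentages; `GMAP_COUNTS` and `aligned_percent` are illustrative names, and the percentages are derived here rather than reported on the slide.

```python
# Counts copied from the slide:
# variety -> (total contigs, aligned at >=50% coverage, aligned at >=90% coverage),
# all at >=95% identity.
GMAP_COUNTS = {
    "Atlantic": (45214, 32520, 27106),
    "Premier": (54917, 41497, 37297),
    "Snowden": (58754, 44479, 40708),
}


def aligned_percent(aligned, total):
    """Percentage of contigs in an assembly that met the alignment thresholds."""
    return 100.0 * aligned / total


for variety, (total, cov50, cov90) in GMAP_COUNTS.items():
    print(
        f"{variety}: {aligned_percent(cov50, total):.1f}% at 50% coverage, "
        f"{aligned_percent(cov90, total):.1f}% at 90% coverage"
    )
```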

8 Tomato Illumina GA II Output
Variety   Insert Size  Read Length  Total Reads    PF Reads  %PF Passed    Total PF
FL7600            300        61/47   22,491,304  20,685,342        92.0           -
FL7600            300           60   16,025,976  14,382,577        89.8           -
FL7600            300           60   15,645,164  13,985,875        89.4  49,053,794
NC84173           350        61/61   27,079,946  22,687,626        83.8           -
NC84173           350           60   11,058,431  10,366,811        93.8           -
NC84173           350           60   14,401,240  12,687,134        88.1  52,539,617
OH9242            350        61/47   26,960,898  24,874,218        92.3           -
OH9242            350           60   10,316,775   9,671,753        93.8           -
OH9242            350           60   14,676,814  12,879,812        87.8  51,954,487
T5                350        61/47   26,799,944  24,677,302        92.1           -
T5                350           60   16,822,639  14,738,351        87.6           -
T5                350           60   15,726,257  13,744,511        87.4  59,348,840
PI114490          350        61/47   17,721,226  16,422,842        92.7           -
PI114490          350           60   17,115,349  14,902,672        87.1           -
PI114490          350           60   17,890,649  15,248,587        85.2  52,727,224
PI212816          350        61/47   17,631,906  16,450,422        93.3           -
PI212816          350           60   18,238,179  15,354,882        84.2           -
PI212816          350           84   21,829,622  18,500,235        84.8  57,699,707

9 Velvet Assemblies of Tomato Illumina Sequences
With a k-mer length of 31 and a minimum contig length of 150 bp:
Variety   Total Gb  Transcriptome Size (Mb)  No. Contigs  N50 (bp)  Maximum Contig (Kb)
FL7600        2.82                     39.8       59,581       424                 12.1
NC84173       2.77                     39.2       60,534       496                 13.3
OH9242        2.70                     39.1       59,051       476                 11.6
T5            3.04                     40.6       60,031       632                 14
PI114490      2.70                     41         61,310       690                 11.7
PI212816      3.00                     41.1       66,118       471                 14

10 Sequence quality: Viewing an Atlantic potato contig from the Velvet assembly

11 Alignment of contigs relative to S. phureja
FL7600 (93.7% id; 94.4% coverage)
Snowden (97.9% id; 94.7% coverage)

12 Identify intra-varietal SNPs
Query           SNPs  Filtered SNPs
Atlantic Asm  224748         150669
Premier Asm   265673         181800
Snowden Asm   258872         166253
[Figure: example A/C SNP]

13 Filtered SNP counts
Filtering on SNP quality and 1 SNP / 150 bp window
Ref       Query      d 10   d 20   d 30   d 40   d 50  d 60  d 100
atlantic  -         21336  17509  14493  12150  10277  8673   4435
atlantic  premier   21789  18050  15084  12477  10584  8919   4620
atlantic  snowden   19997  16518  13694  11378   9689  8048   4173
premier   atlantic  21117  17096  14106  11785   9790  8222   4228
premier   -         22951  18431  15016  12377  10300  8703   4371
premier   snowden   20972  16846  13709  11357   9479  7873   4113
snowden   atlantic  20777  16998  13984  11619   9647  8131   4186
snowden   premier   22101  17888  14701  12068  10124  8650   4223
snowden   -         21083  16963  13792  11218   9359  7735   3896
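
The slide does not say how the "1 SNP / 150 bp window" filter was implemented. One plausible reading, sketched below as an assumption rather than as the authors' method, is to discard any SNP that has another SNP closer than 150 bp on the same contig, so that every retained SNP sits alone in its window. The function name and the positions in the usage line are hypothetical.

```python
def filter_snp_density(positions, window=150):
    """Keep SNP positions that have no neighbouring SNP closer than `window` bp.

    `positions` are SNP coordinates on a single contig. This is one possible
    interpretation of a "1 SNP per 150 bp window" filter, not a confirmed one.
    """
    pos = sorted(positions)
    kept = []
    for i, p in enumerate(pos):
        left_ok = i == 0 or p - pos[i - 1] >= window
        right_ok = i == len(pos) - 1 or pos[i + 1] - p >= window
        if left_ok and right_ok:
            kept.append(p)
    return kept


# Hypothetical SNP positions on one contig: only the SNP at 400 is isolated.
print(filter_snp_density([10, 50, 400, 900, 1000]))
```

A stricter or looser variant (for example, keeping the best-quality SNP in each cluster instead of dropping the whole cluster) would change the counts, so the choice should be matched to the intended genotyping design.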

14 Genotyping platforms…. Comments on quality control… Data…. direct comparison of sequence analysis of SNPs across populations

15 Comparison of two genes on tomato chromosome 9 BAC: COS and R-gene

16 COSII
Fresh Market vs Fresh Market: Identities = 573/573 (100%), Gaps = 0/573 (0%)
Fresh Market vs Processing: Identities = 569/569 (100%), Gaps = 0/569 (0%)
S. lycopersicum vs S. pimpinellifolium: Identities = 339/341 (99%), Gaps = 0/341 (0%)
Potato vs Potato: Identities = 606/612 (99%), Gaps = 0/612 (0%)
Tomato vs Potato: Identities = 914/948 (96%), Gaps = 6/948 (0%)

17 DIVERGED SEQUENCE
Fresh Market vs Fresh Market: Identities = 959/959 (100%), Gaps = 0/959 (0%)
Fresh Market vs Processing: Identities = 1560/1560 (100%), Gaps = 0/1560 (0%)
S. lycopersicum vs S. pimpinellifolium: Identities = 612/613 (99%), Gaps = 0/613 (0%)
Tomato vs Potato: Identities = 223/280 (79%), Gaps = 11/280 (3%)
Potato vs Potato: Identities = 246/278 (88%), Gaps = 7/278 (2%)
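
The percentages on slides 16 and 17 are consistent with integer truncation of the match and gap ratios rather than rounding (for example, 223/280 is about 79.6% but is shown as 79%). A small sketch of that arithmetic, assuming truncation; `truncated_percent` is an illustrative name.

```python
def truncated_percent(count, alignment_length):
    """Integer percentage, rounded down, of `count` out of `alignment_length`."""
    return (100 * count) // alignment_length


# Tomato vs Potato on the diverged sequence: 223 identities and 11 gaps over 280 columns.
print(truncated_percent(223, 280), truncated_percent(11, 280))
```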

18 What patterns do we expect to see for genes “under selection”?
- Low variation (fixed)
- High Ka/Ks (mutations affect protein, possible diversifying selection)
- Mutations (loss of function)
- F_ST (genes that distinguish populations)
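
As a reminder of what the Ka/Ks criterion measures, the ratio compares the rate of amino-acid-changing substitutions with the rate of silent ones. The symbol \omega below is a common shorthand and is an addition here, not notation from the slides.

```latex
\omega = \frac{K_a}{K_s}, \qquad
K_a = \frac{\text{nonsynonymous substitutions}}{\text{nonsynonymous sites}}, \qquad
K_s = \frac{\text{synonymous substitutions}}{\text{synonymous sites}}
```

Values of \omega well above 1 are usually read as evidence of diversifying (positive) selection, values near 1 as neutral evolution, and values well below 1 as purifying selection.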

19

20 Population structure: coding vs. non-coding
Panels: all 173 markers (K=6); 89 coding markers (K=5); 84 non-coding markers (K=6)
Populations: Processing, Fresh-market, Vintage, Landrace
500K burn-in / 750K MCMC reps, 20 runs for each K from 3 to 8
[Figure labels: CA & OH, OH, CA, OH, CN]

21 Distribution of F_ST for genes
Values for ovate, fw2.2, and sp6 in each of the six panels:
- ovate: 0; fw2.2: 0; sp6: 0.14
- ovate: 0.26; fw2.2: 0; sp6: 0.73
- ovate: 0.31; fw2.2: 0; sp6: 0.47
- ovate: 0; fw2.2: 0.5; sp6: 1
- ovate: 0; fw2.2: 0.42; sp6: 0.74
- ovate: 0.14; fw2.2: 0.46; sp6: 0.05
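
The slides do not say which F_ST estimator was used. As a minimal illustration of why the values range from 0 (no differentiation) to 1 (populations fixed for different alleles), the sketch below computes Wright's F_ST = (H_T - H_S) / H_T for a single biallelic marker, assuming equally weighted populations. The function name and the allele frequencies in the usage lines are hypothetical.

```python
def fst_biallelic(allele_freqs):
    """Wright's F_ST for one biallelic marker.

    `allele_freqs` holds the frequency of one allele in each population;
    populations are weighted equally (an assumption of this sketch).
    """
    if not allele_freqs:
        raise ValueError("need at least one population")
    n = len(allele_freqs)
    p_bar = sum(allele_freqs) / n
    # Expected heterozygosity in the pooled population.
    h_total = 2 * p_bar * (1 - p_bar)
    # Mean expected heterozygosity within populations.
    h_within = sum(2 * p * (1 - p) for p in allele_freqs) / n
    if h_total == 0:
        # Marker is monomorphic overall, so there is no differentiation to measure.
        return 0.0
    return (h_total - h_within) / h_total


print(fst_biallelic([0.0, 1.0]))    # populations fixed for different alleles
print(fst_biallelic([0.25, 0.75]))  # partial differentiation
```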

22 Examples of highly polymorphic genes within S. lycopersicum Note: I am working on a replacement that compares Ka/Ks for selected tomato and potato genes

23 Examples of highly polymorphic genes within S. lycopersicum Note: I am working on a replacement that compares Ka/Ks for selected tomato and potato genes

24 Distribution of PM genes across populations is not random
Populations: Processing, Fresh Market, Vintage, Wild

25 Conclusions
- ~5.7 Gb PF potato transcriptome sequence (3 varieties)
- ~14.3 Gb PF tomato transcriptome sequence (6 varieties)
- S. phureja draft genome is an excellent scaffold for potato and tomato GAII transcriptome alignments
- SNPs are not evenly distributed in genes
- Genes with signatures of selection (Ka/Ks; high F_ST) tend to be genes associated with response to abiotic and biotic stress
- Breeders have selected for groups of genes suggesting that co-adapted complexes

26 Acknowledgments
Collaborators, OSU: Matt Robbins, Sung-Chur Sim, Troy Aldrich
Collaborators, Cornell: Walter de Jong, Lucas Mueller, Joyce van Eck
Collaborators, CAU: Wencai Yang
Collaborators, CAAS: Sanwen Huang
Collaborators, UCD: Allen Van Deynze, Kevin Stoffel, Alex Kozic
Collaborators, MSU: David Douches, C Robin Buell, John Hamilton, Kelly Zarka
Funding: USDA/AFRI

