Gapless genome assembly of Colletotrichum higginsianum reveals chromosome structure and association of transposable elements with secondary metabolite.


1 Gapless genome assembly of Colletotrichum higginsianum reveals chromosome structure and association of transposable elements with secondary metabolite gene clusters
Dallery et al. 2017

2 Colletotrichum higginsianum
Max Planck Institute
Pathogenic fungus
Affects cruciferous (Brassicaceae) crops, and the model plant Arabidopsis thaliana, in tropical and subtropical regions
Important model pathosystem for studying the molecular basis of fungal pathogenicity and host response
O’Connell et al. 2012

3 Rationale
Affects crop yields
Previous genome assembly was highly fragmented
Looking for role of transposable elements (TEs) in gene and genome evolution
Better understanding of genome structure of pathogenic fungi

4 Methods
10 μg genomic DNA → ~20 kb size-selected library
Sequenced on PacBio RS II platform
De novo assembly with the Hierarchical Genome Assembly Process (HGAP) approach
Reads filtered for min. 500 bp length
Genome consensus sequence polished with Quiver
Assembly validated w/ PCR
Illumina sequencing w/ 100 bp paired-end reads, used only to detect sequence polymorphisms
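The minimum read length filter mentioned above can be pictured with a small sketch. This is not the HGAP implementation; the function name `filter_reads_by_length` and the toy reads are made up for illustration, and only the 500 bp cutoff comes from the slide.

```python
def filter_reads_by_length(reads, min_length=500):
    """Keep only (name, sequence) pairs whose sequence has at least min_length bases.

    Illustrative helper (hypothetical name); the 500 bp default mirrors the
    cutoff quoted on the slide.
    """
    return [(name, seq) for name, seq in reads if len(seq) >= min_length]


# Toy reads, invented for this example
reads = [("read1", "A" * 1200), ("read2", "C" * 499), ("read3", "G" * 500)]
kept = filter_reads_by_length(reads)
print([name for name, _ in kept])  # ['read1', 'read3']
```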

5 Even More Methods
REPET pipelines to detect and classify TEs and simple sequence repeats
Analysis of repeat-induced point mutations
Gene predictions with MAKER2, SNAP, Augustus from Illumina reads
Functional annotations via BLASTp and Blast2GO and predictions from SMURF, antiSMASH v.3.0, SMIPS, and CASSIS
Phylogenetic analysis of secondary metabolism key genes (MEGA6 and Treedyn)

6 Final Methods Slide
Analysis of distance of TEs to genes and gene clusters
Segmental duplication analysis (SDDetector w/ PacBio unitigs)
Transcriptome analysis (previous RNA-Seq data)
Basically a bunch of experimental validation of transcriptome data

7 Genome Assembly
7.8 Gb of raw sequence reads
92,834 error-corrected reads
N50 length 16,193 bp
Final edited assembly = 28 unitigs (unitigs = high confidence contigs)
12 largest unitigs = chromosomes, 99.14% of the genome assembly
Total length = Mb
Not actually gapless = gap on Chr 7 (liars)
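Since N50 is quoted both for the reads here and for the contigs in the assembly comparison, a minimal sketch of the usual definition may help: sort the lengths from longest to shortest and report the length at which the running total first reaches half of all bases. The function name `n50` and the toy lengths are illustrative only.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total number of bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy contig lengths, invented for this example (total = 32, half = 16)
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_lengths))  # 8
```

Because a few very long sequences can carry half of the bases, N50 rewards contiguity: the same total length split into fewer, longer pieces gives a higher value.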

8 Genome Assembly
Genome assembly compared to previous 2012 assembly
Assembly Statistics: 2012 Assembly / 2017 Assembly
PacBio read coverage: _ / 133x
Sanger read coverage: 0.2x
Illumina read coverage: 76x
454 read coverage: 25x
Genome physical size: 53.35 Mb
Assembly length: 49.05 Mb / 50.72 Mb
Alignable sequence: 77.14 kb / 50.38 Mb
Number of contigs: 10,259 / 28
Largest contig: 49.23 kb / 6.04 Mb
N50 contig length: 6.15 kb / 5.20 Mb
Complete genes: 2946 (79%) / 3616 (97%)

9 Results
2699 MAKER2 genes match previous gene models
2289 new genes w/ no match in previous annotation
Includes 132/133 genes on Chr 12

10 Results
Mini chromosomes 11 & 12 have half the gene content of the ‘core’ chromosomes
Lower gene expression
Much higher TE content

11 Results

12 Results Secondary Metabolite (SM) Gene Clusters

13 Results
Genes in SM clusters + genes encoding candidate secreted effector proteins were found significantly closer to TEs than random genes across the whole genome
Many copies of large TEs
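The gene-to-TE distance comparison can be sketched as follows. This is a simplified illustration, not the authors' analysis: it assumes all features sit on one chromosome as (start, end) pairs, all coordinates are invented, and the function names are hypothetical. A real analysis would also need a significance test against random gene sets.

```python
from statistics import median


def interval_distance(a, b):
    """Gap in bp between two (start, end) intervals; 0 if they overlap or touch."""
    return max(0, max(a[0], b[0]) - min(a[1], b[1]))


def nearest_te_distance(gene, tes):
    """Distance from one gene to the closest TE on the same chromosome."""
    return min(interval_distance(gene, te) for te in tes)


# Invented coordinates on a single hypothetical chromosome
tes = [(1000, 2000), (9000, 9500)]
sm_cluster_genes = [(2100, 2600), (8500, 8900)]
other_genes = [(4000, 4500), (5500, 6000), (6500, 7000)]

sm_median = median(nearest_te_distance(g, tes) for g in sm_cluster_genes)
other_median = median(nearest_te_distance(g, tes) for g in other_genes)
print(sm_median, other_median)  # 100.0 2000
```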

14 Results
Found 6 segmental duplications
4 of these are at chrom ends and/or regions of highly similar repeats

15 Actually Cool Results
Some TE families subject to Repeat-Induced Point (RIP) mutations
RIP occurs during meiosis (sexual reproduction)
This fungus is asexual
RIP occurred either during an ancestral sexual state or there is cryptic meiosis happening
~30% of TEs appear active
60% of expressed SM clusters are expressed only during plant infection
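RIP leaves a dinucleotide footprint, because it converts C to T mainly in CpA contexts. Two commonly used indices capture this: TpA/ApT (product index, which rises under RIP) and (CpA+TpG)/(ApC+GpT) (substrate index, which falls). The sketch below computes both from overlapping dinucleotide counts. It is an illustration of these generic indices, not necessarily the tool or thresholds used in the paper; the function names and the test sequence are made up. Cutoffs often quoted in the literature are roughly TpA/ApT ≥ 0.89 and (CpA+TpG)/(ApC+GpT) ≤ 1.03, which should be treated here as an assumption rather than a value from the slides.

```python
from collections import Counter


def dinucleotide_counts(seq):
    """Count overlapping dinucleotides in a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    return Counter(seq[i:i + 2] for i in range(len(seq) - 1))


def rip_indices(seq):
    """Return (TpA/ApT, (CpA+TpG)/(ApC+GpT)) for a sequence.

    An index is NaN when its denominator is zero.
    """
    c = dinucleotide_counts(seq)
    product = c["TA"] / c["AT"] if c["AT"] else float("nan")
    denom = c["AC"] + c["GT"]
    substrate = (c["CA"] + c["TG"]) / denom if denom else float("nan")
    return product, substrate


# Invented toy sequence: TA=2, AT=3, CA=1, TG=1, AC=1, GT=1
product_index, substrate_index = rip_indices("TATAATCATGACGT")
print(product_index, substrate_index)
```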

16 Conclusions
A complete genome assembly is key to analysis of TEs, telomeres, structural rearrangements, and large gene clusters
The mini-chromosomes differ dramatically from the core genome in gene and repeat content
Resemble conditionally dispensable chroms.
Pathogenicity-related (?) genes
Repeat-mediated segmental duplication likely accelerated the evolution of pathogenicity-related genes, e.g. via ectopic recombination
SM gene cluster inventory will help to ID novel bioactive molecules and their biosynthetic pathways

17 Questions
Would the Illumina library have helped the genome assembly if it had been included?
Are unitigs as good as scaffolds when used in their place?
If the fungus lost its mini-chromosomes, would it be significantly less pathogenic?

