Comparative Genomics of Aspergilli William Nierman TIGR
Electrophoretic Karyotyping: 5-day run, CHEF DRII; 1.2% CGA, 1x TAE, 14 °C, 1.8 V/cm: 2200 s, 48 h; … s, 68 h. [Gel image: lanes Sc, Sp, Af; sizes in Mb]
A. fumigatus Chromosomes [Figure: chromosome map marking centromeric areas and telomeres, with sizes in Mb; ~35 copies of rDNA]
Centromeres and Telomeres
Telomeres
– Telomere repeat TTAGGG, 7-21 repeat units
– Subtelomeric regions: identical sequences for several kb, helicase pseudogenes, 7 secondary metabolite clusters; niche adaptation role? (Mark Farman)
Centromeres
– Uncloned in shotgun libraries; … kb
– Flanked on each side by a low-complexity, AT-rich repeat region
– Chromosome 2 centromere: 12 kb PCR product, 75% AT; overall centromeric AT of 63%, 40 kb.
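The two measurements quoted on this slide (number of tandem TTAGGG units and percent AT) can be illustrated with a minimal Python sketch. The function names and the toy sequence are hypothetical and are not part of the project's pipeline; the sketch only shows how such figures could be computed from a sequence string.

```python
import re

TELOMERE_UNIT = "TTAGGG"  # repeat unit quoted on the slide


def at_fraction(seq: str) -> float:
    """Fraction of A/T among unambiguous bases (N and other symbols ignored)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    total = at + gc
    return at / total if total else 0.0


def longest_tandem_run(seq: str, unit: str = TELOMERE_UNIT) -> int:
    """Number of units in the longest uninterrupted tandem run of `unit`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", seq.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


# Toy sequence (made up): 9 telomere units followed by an AT-rich tail.
toy = "ACGT" + TELOMERE_UNIT * 9 + "AATTAATTGC"
units = longest_tandem_run(toy)
at_percent = round(100 * at_fraction(toy), 1)
```

Applied to a real chromosome end, `longest_tandem_run` would be expected to fall in the 7-21 range given above, and `at_fraction` over a centromeric window could be compared with the 63% and 75% figures.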
Annotation Pipeline: Eukaryotic Genome Control (EGC) is the annotation pipeline responsible for processing genomic sequence. Stages: Finished chromosome sequences; Masked genomic sequence; Gene prediction, Protein alignments, EST alignments; Optimize Predictions.
Training Data: Gene and splice site predictors (Glimmer, Exonomy, Unveil, Phat, and GeneSplicer) were trained with the following experimental data:
– Full-length cDNAs (625) and 42 partials from 589 loci in 19 Aspergillus species
– 2,633 A. fumigatus ESTs from UK and Spanish collaborators
Optimize Predictions: Combiner combines gene model evidence from:
– Gene prediction programs
– Splice site prediction programs
– Alignments from protein, cDNA and EST databases
It then generates the final gene model. All genes were manually reviewed, and the observed splits and merges were corrected.
Annotation Station Screenshot [Screenshot labels: Brown 2; Brown 1; Yellowish-green; 1,3,6,8-tetrahydroxynaphthalene reductase; Scytalone dehydratase; Polyketide synthetase]
Gene Summary Statistics (columns: AFU, ANA, AOA; the cell values are missing from this transcript)
Chromosome: Size; GC Content; # of Genes; Mean Gene Length; Gene Density; Percent of Coding; Percent Genes with Introns
Exons: Number; Mean # per Gene; GC Content; Mean Length (bp); Total Length (bp)
Introns: Number; GC Content; Mean Length (bp); Total Length (bp)
Intergenic Regions: GC Content; Mean Length (bp)
Functional Annotation: # of Genes w/PFAM Hits; # of Genes with Computed Families
Synteny Map of A. fumigatus, A. nidulans, A. oryzae
Overview – Comparative Statistics
Orthologs were computed by performing an all vs. all BlastP of the three proteomes with an E-value cut-off of 1 x e-15 (no length requirement). The mutual best hits were then organized into clusters based on shared protein nodes.
COG summary (columns: A. fumigatus, A. oryzae, A. nidulans, avg_pctid, avg_coverage, num_cogs; several cells are missing from this transcript):
3 member: + + +; 70%; 86%; …
2 member: …; …%; 84%; 967
2 member: + +; 61%; 79%; …
2 member: …; …%; 80%; 936
Species: # genes included in COG; percent of predicted proteome
A. fumigatus: 7507; 79%
A. nidulans: 7429; 75%
A. oryzae: 7988; 57%
Total: …% (22924/33552)
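The mutual-best-hit clustering described on this slide can be sketched in Python as follows. This is an illustrative reading, not the tool actually used: the `SPECIES|gene` ID convention, the function names, and the toy hits are assumptions, and "clusters based on shared protein nodes" is interpreted here as connected components over mutual-best-hit pairs (implemented with union-find).

```python
from collections import defaultdict

E_CUTOFF = 1e-15  # cut-off quoted on the slide


def species_of(protein_id):
    # Hypothetical ID convention: "AFU|a1" -> "AFU"
    return protein_id.split("|", 1)[0]


def best_hits(hits):
    """Map (query, subject species) -> subject with the lowest E-value."""
    best = {}
    for query, subject, evalue in hits:
        if evalue > E_CUTOFF or species_of(query) == species_of(subject):
            continue  # drop weak hits and within-species hits
        key = (query, species_of(subject))
        if key not in best or evalue < best[key][1]:
            best[key] = (subject, evalue)
    return {key: value[0] for key, value in best.items()}


def mutual_best_pairs(hits):
    """Pairs (a, b) where each is the other's best hit in its species."""
    best = best_hits(hits)
    pairs = set()
    for (query, _species), subject in best.items():
        if best.get((subject, species_of(query))) == query:
            pairs.add(frozenset((query, subject)))
    return pairs


def cluster(pairs):
    """Group proteins connected through shared mutual-best-hit partners."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for pair in pairs:
        a, b = tuple(pair)
        parent[find(a)] = find(b)
    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return sorted(sorted(group) for group in groups.values())


# Toy (query, subject, E-value) hits; all values are made up.
hits = [
    ("AFU|a1", "ANA|n1", 1e-80), ("ANA|n1", "AFU|a1", 1e-80),
    ("AFU|a1", "AOA|o1", 1e-60), ("AOA|o1", "AFU|a1", 1e-60),
    ("ANA|n1", "AOA|o1", 1e-70), ("AOA|o1", "ANA|n1", 1e-70),
    ("AFU|a2", "ANA|n2", 1e-30), ("ANA|n2", "AFU|a2", 1e-30),
    ("AFU|a3", "ANA|n1", 1e-20),                               # one-way hit only
    ("AFU|a4", "AOA|o4", 1e-5), ("AOA|o4", "AFU|a4", 1e-5),    # fails cut-off
]
clusters = cluster(mutual_best_pairs(hits))
```

In this toy input the result is one three-member and one two-member cluster; `AFU|a3` and `AFU|a4` end up in no cluster, which corresponds to the singleton genes discussed later in the talk.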
TIGR Autoannotation vs Sanger Curated Annotation
Status: Count
Total Sanger genes analyzed: 360
Same gene structure: 137
Different gene structure: 177
Sanger missing in TIGR annotation: 37
Sanger matches multiple TIGR annotations: 2
Sanger, TIGR annotations on opposite strands: 7
TIGR missing in Sanger annotation: 12
TIGR matches multiple Sanger annotations: 9
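As a quick arithmetic check on the comparison above, the five Sanger-side categories sum to the 360 genes analyzed, so each can be expressed as a share of the total. The short Python sketch below does that; the variable names are illustrative, and the counts are copied from the slide.

```python
# Counts for the 360 Sanger genes, copied from the slide.
sanger_counts = {
    "Same gene structure": 137,
    "Different gene structure": 177,
    "Sanger missing in TIGR annotation": 37,
    "Sanger matches multiple TIGR annotations": 2,
    "Sanger, TIGR annotations opposite strands": 7,
}
total_sanger = 360

# The categories partition the analyzed genes.
categories_sum = sum(sanger_counts.values())

# Percentage of analyzed Sanger genes in each category, to one decimal place.
shares = {
    status: round(100 * count / total_sanger, 1)
    for status, count in sanger_counts.items()
}
```

On these numbers, a little over a third of the curated genes have an identical structure in the automated annotation, and roughly half differ in structure.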
Using Ortholog Clusters to Identify Potential Annotation Problems
Using Ortholog Clusters to Identify Potential Annotation Problems: different exon number due to annotation discrepancy
We need to be able to distinguish annotation inconsistencies from real, interesting phenomena. In some cases, differences in exon number are real.
Apoptosis in Fungi
– Apoptosis-like process detected in S. cerevisiae, S. pombe, and Aspergilli.
– Fungal genomes lack metazoan upstream machinery.
– Metacaspase-dependent phenotype observed in A. fumigatus and A. nidulans.
Analysis by Geoff Robson
Apoptosis in Fungi (table columns: S. cerevisiae, S. pombe, A. fumigatus, A. nidulans, A. oryzae; most locus identifiers are truncated in this transcript)
DOMAINS
– NB-ARC: X; X; 57.m…; …m04653; asfu…; …m00299
– Caspase-activated nuclease: X; X; X; X; X
– CAS/CSE: CSE-1; X; X; X
– MATH: UBPF UBP5; 53.m…; …m00277
PROTEIN FAMILY
– Metacaspase: MCA1; AL…; …m00321
– Anti-silencing protein 1: ASF1; 59m…
– STM1: STM1/MPT4; Q42914; X; X; X
– CDC48p: CDC48; 72.m…; …m00118
Aspergillus fumigatus Secondary Metabolites
– Heterogeneous group of low molecular weight products.
– Toxic, antibiotic, and immunosuppressant activities.
– Fumagillin, gliotoxin (apoptosis and phagocyte dysfunction), fumitremorgin, verruculogen, fumigaclavine, helvolic acid, phthioic acid (granulomas when injected into mice), and sphingofungins.
– Virulence properties may be augmented by the numerous secondary metabolites of A. fumigatus.
Secondary Metabolite Genes (gene type: A. oryzae, A. fumigatus, A. nidulans; some cells are missing from this transcript)
PKS: …
NRPS: 18, 14, …
FAS: 5, 1, 6
Sesquiterpene cyclase: 1 (1)
DMATS: 2, 7, 2
Analysis by G. Turner, N. Keller, Dr. Kitamoto, and R. Kulkarni
A. fumigatus Secondary Metabolite Genes
– Few true orthologues across the genus Aspergillus. Each species has its own repertoire.
– Gene/product relationship requires functional analysis in most cases.
– Indole alkaloid pathway in A. fumigatus only. Closely related to the Claviceps purpurea ergotamine pathway.
– Penicillin and aflatoxin pathways are absent.
– A hybrid PKS/monomodular NRPS seems to be present in several fungi.
Identify A. fumigatus specific genes
– A. fumigatus genes (9746)
– All vs. all BlastP of the AFU1, ANA1, AOA1 proteomes, cut-off E-value 1 x e-15, filtering the results for mutual best hits between genomes
– A. fumigatus singletons (2075)
– BLASTP vs ANA1 and AOA1 proteomes. Counts shown: E-value > e-10 (1081); e-5 > E-value > e-10 (203); E-value > e-5 (808); (1011)
– Extend 50 bp on both ends of the gene in the genome, then Tblastx the genomic sequence of the gene vs the ana and aoa genomic sequences. Counts shown: e-5 > E-value > e-10 (75); E-value > e-5 (552)
– A. fumigatus specific gene candidates
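The filtering steps in this flowchart can be sketched in Python under one possible reading of the thresholds: a singleton's best E-value against the other two species is placed into one of three bands split at e-5 and e-10, and the gene's genomic interval is widened by 50 bp on each side before the Tblastx search. The function names, band labels, and toy values are hypothetical; the slide does not specify how boundary cases or genes with no hit at all were handled.

```python
from collections import Counter

FLANK_BP = 50  # extension on both ends, as stated on the slide


def evalue_band(best_evalue):
    """Band for a gene's best E-value vs. the other species.

    `None` stands for "no hit reported" and is treated like a very
    weak hit (an assumption made for this sketch).
    """
    if best_evalue is None or best_evalue > 1e-5:
        return "E > e-5"
    if best_evalue > 1e-10:
        return "e-5 > E > e-10"
    return "E <= e-10"


def extend_interval(start, end, chrom_length, flank=FLANK_BP):
    """Widen a 1-based inclusive gene interval, clipped to the chromosome."""
    return max(1, start - flank), min(chrom_length, end + flank)


# Toy best E-values for five hypothetical singleton genes.
best_evalues = {"g1": None, "g2": 0.3, "g3": 1e-7, "g4": 1e-40, "g5": 1e-6}
band_counts = Counter(evalue_band(e) for e in best_evalues.values())

# Genes in the two weaker bands would go on to the Tblastx step.
to_recheck = sorted(
    gene for gene, e in best_evalues.items() if evalue_band(e) != "E <= e-10"
)
```

Clipping in `extend_interval` matters only for genes within 50 bp of a chromosome end, where an unclipped coordinate would fall outside the sequence.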
Aspergillus fumigatus Unique Genes: the vast majority are hypothetical. Includes:
– Several transcriptional regulators
– A chaperonin
– An hsp70-related protein
Arsenic Fungi
– 19th-century poisonings associated with green pigments.
– B. Gosio: certain fungi could metabolize arsenic pigments, producing toxic trimethylarsine (Gosio gas).
– Screen in the 1930s (Thom & Raper) found A. fumigatus to be an arsenic fungus.
– Napoleon: imperial colors green and gold, copper arsenite (Jones 1982).
Analysis of history and genome by J. Bennett, N. Hall, J. Wortman, C. Lu.
A. fumigatus Teichoic Acid Biosynthesis Protein
– Good homology to the full length of the Streptomyces griseus protein.
– Secretion signal peptide may direct it to the cell wall.
– Teichoic acids demonstrated to be a virulence factor for Staphylococcus aureus.
– No intervening sequences in the gene.
Analysis by Neil Hall
A. fumigatus Thermotolerance [Figure panels: more highly expressed at 48 °C; more highly expressed at 37 °C]
A. fumigatus Thermotolerance
– Relatively few genes altered.
– Some HSPs transiently or stably induced (weakly) and repressed at 37 °C.
– HSPs induced throughout the 180 min 48 °C period.
– Transposases induced at 48 °C (Mariner 4).
– Stress-related genes up-regulated at 48 °C.
– Metabolic proteins down-regulated at 48 °C.
– This fungus likes it hot.
J. Bennett
Microarray Detection of Clusters
Aspergillus fumigatus AF293 Project Participants: The University of Manchester, UK; The Wellcome Trust Sanger Centre, UK; The Institute for Genomic Research, USA; The University of Salamanca, Spain; Complutense University, Spain; Centro de Investigaciones Biológicas, Spain
Aspergillus fumigatus AF293: David Denning, Michael Anderson, Arnab Pain, Geoff Robson, Javier Arroyo, Geoff Turner, David Archer, Joan Bennett, Matt Berriman, Jean Paul Latge, Paul Dyer, Paul Bowyer, Neil Hall. Aspergillus nidulans: James Galagan. Aspergillus oryzae: Masayuki Machida.
TIGR. Sequencing and Closure: Tamara Feldblyum, Hoda Khouri. Annotation: Jennifer Wortman, Jiaqi Huang, Resham Kulkarni, Natalie Fedorova, Charles Lu. Claire Fraser. Lab Group: Heenam Kim, Dan Chen. NIAID and Dennis Dixon.