Lecture 1: Methods for in silico analysis of cryptic natural product biosynthetic gene clusters
Greg Challis, Department of Chemistry
Microbial Genomics and Secondary Metabolites Summer School, MedILS, Split, Croatia, June 2007
Overview
Introduction: cryptic (orphan) gene clusters in microbial genomes
Clusters encoding nonribosomal peptide synthetases (NRPSs): domains, modules, substrate specificity, predicting products
Clusters encoding modular polyketide synthases (PKSs): domains, modules, substrate specificity, predicting products
Clusters encoding other biosynthetic systems: terpene synthases, iterative PKSs
Introduction
‘Cryptic’ (orphan) biosynthetic gene clusters
Present in many of the 300 or so sequenced microbial genomes, e.g. Streptomyces avermitilis, Streptomyces coelicolor, Bacillus subtilis, Pseudomonas fluorescens, Pseudomonas syringae, Nostoc punctiforme, Aspergillus nidulans
May prove a valuable new source of bioactive metabolites
Polyketide synthases, nonribosomal peptide synthetases, terpene synthases
Genome sequence of the model antibiotic producer Streptomyces coelicolor M145
Gene clusters directing complex metabolite biosynthesis in the S. coelicolor genome Bentley et al. Nature (2002) 417,
Part 1: Nonribosomal peptide synthetase analysis
Recap of NRPS organisation and function: the gramicidin S synthetase as an example
[Figure: domain organisation of the gramicidin S synthetase, showing genes grsA, grsB and grsT, synthetase 1 and synthetase 2, and modules 1 to 5]
A = adenylation; PCP = peptidyl carrier protein; C = condensation; E = epimerisation; TE = thioesterase
For further information see Lars Robbel’s poster
Nonribosomal peptide synthetases encoded by the S. coelicolor genome
A new S. coelicolor NRPS gene cluster
[Figure: map of the cch gene cluster; labelled genes include cchA, cchB, cchH, cchI and cchJ]
cchA: formyl-tetrahydrofolate-dependent formyl transferase
cchB: flavin-dependent monooxygenase
cchH: nonribosomal peptide synthetase
cchJ: esterase
cchK: MbtH-like protein
Other genes: export functions; ferric-siderophore import
Challis and Ravel FEMS Microbiol. Lett. (2000) 187,
Prediction of domain and module structure
Conserved Domain (CD) search
Deduced domain and module organisation
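The step from a list of detected domains to a module organisation can be sketched in a few lines. The example below is only an illustration, not the procedure used on the slide: it assumes the simple rule that a new module starts at each condensation (C) domain, and it uses the well-known gramicidin S synthetase domain order (GrsA: A-PCP-E; GrsB: four C-A-PCP modules followed by TE) as input. The function name `group_nrps_modules` is hypothetical.

```python
from typing import List


def group_nrps_modules(domains: List[str]) -> List[List[str]]:
    """Split an ordered list of NRPS domains into modules.

    Assumed rule (a simplification): every C domain opens a new module,
    and any domains before the first C domain form the initiation module.
    """
    modules: List[List[str]] = []
    for domain in domains:
        if domain == "C" or not modules:
            modules.append([])
        modules[-1].append(domain)
    return modules


# Domain order of the gramicidin S synthetase (GrsA followed by GrsB).
GRS_A = ["A", "PCP", "E"]
GRS_B = ["C", "A", "PCP"] * 4 + ["TE"]

grs_modules = group_nrps_modules(GRS_A + GRS_B)
for number, module in enumerate(grs_modules, start=1):
    print(f"module {number}: {'-'.join(module)}")
```

The rule does not handle variant domains (for example heterocyclisation domains in place of C), so real CD-search output would need extra cases.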
Prediction of A-domain selectivity pocket residues
Sequence alignment:
GrsA     DASVWEMFMALLTGASLYIILKDTINDFVKFEQYINQKEITVITLPPTYVVHL-----DPERILSIQTLITAGSATSPSLVNKWKEK--VTYINAYGPTETTI
Ncs1-M1  DIAVWELLAAFVGGARLVIAEHRLRGVVPHLPELMTDHRVTVAHFVPSVLEELLGWMADGGRVG-LRLVVCGGEAVPPSQRDRLLALSGARMVHAYGPTETTI
Selectivity pocket residues:
GrsA     D A W T I A A I
Ncs1-M1  D I W H V G A I
Stachelhaus, Mootz and Marahiel Chem. Biol. (1999) 6,
Challis, Ravel and Townsend Chem. Biol. (2000) 7,
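The extraction of pocket residues from the alignment can be reproduced with a short script. This is a sketch under two assumptions that are not stated on the slide: that the aligned GrsA fragment begins at residue 235, and that the eight pocket positions are GrsA residues 235, 236, 239, 278, 299, 301, 322 and 330 (the first eight positions of the Stachelhaus code). The function name `pocket_residues` is hypothetical.

```python
# Aligned fragments copied from the slide ("-" marks an alignment gap).
GRSA = ("DASVWEMFMALLTGASLYIILKDTINDFVKFEQYINQKEITVITLPPTYVVHL-----"
        "DPERILSIQTLITAGSATSPSLVNKWKEK--VTYINAYGPTETTI")
NCS1_M1 = ("DIAVWELLAAFVGGARLVIAEHRLRGVVPHLPELMTDHRVTVAHFVPSVLEELLGWMA"
           "DGGRVG-LRLVVCGGEAVPPSQRDRLLALSGARMVHAYGPTETTI")

# Assumptions: GrsA numbering of the first aligned residue and of the pocket.
GRSA_START = 235
POCKET_POSITIONS = (235, 236, 239, 278, 299, 301, 322, 330)


def pocket_residues(ref_aln, query_aln, positions, ref_start):
    """Return the query residues aligned to the given reference positions."""
    if len(ref_aln) != len(query_aln):
        raise ValueError("aligned sequences must have equal length")
    wanted = set(positions)
    found = {}
    ref_pos = ref_start - 1
    for ref_res, query_res in zip(ref_aln, query_aln):
        if ref_res == "-":
            continue  # gap in the reference: no residue number consumed
        ref_pos += 1
        if ref_pos in wanted:
            found[ref_pos] = query_res
    missing = wanted - found.keys()
    if missing:
        raise ValueError(f"positions not covered by alignment: {sorted(missing)}")
    return "".join(found[p] for p in positions)


print(pocket_residues(GRSA, GRSA, POCKET_POSITIONS, GRSA_START))     # DAWTIAAI
print(pocket_residues(GRSA, NCS1_M1, POCKET_POSITIONS, GRSA_START))  # DIWHVGAI
```

Both outputs agree with the pocket residues listed on the slide for GrsA and Ncs1-M1.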
Empirical correlation between specificity pocket residues and substrate Challis, Ravel and Townsend Chem. Biol. (2000) 7,
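One way to use such a correlation is to compare a new pocket code with codes of known substrate and accept only close matches. The sketch below is illustrative only: the lookup table holds a single entry taken from the alignment slide (GrsA, which activates phenylalanine), the 0.75 identity threshold is an arbitrary assumption, and `predict_substrate` is a hypothetical name. A real table would hold many curated codes.

```python
def identity(code_a: str, code_b: str) -> float:
    """Fraction of identical positions between two pocket codes."""
    if len(code_a) != len(code_b):
        raise ValueError("pocket codes must have equal length")
    return sum(a == b for a, b in zip(code_a, code_b)) / len(code_a)


# Stub table: only the GrsA code from the slide; extend with curated codes.
KNOWN_CODES = {"DAWTIAAI": "Phe"}


def predict_substrate(code, known=KNOWN_CODES, min_identity=0.75):
    """Return (substrate or None, best identity) for a pocket code."""
    best_code = max(known, key=lambda k: identity(code, k))
    score = identity(code, best_code)
    substrate = known[best_code] if score >= min_identity else None
    return substrate, score


print(predict_substrate("DAWTIAAI"))  # exact match to the GrsA code
print(predict_substrate("DIWHVGAI"))  # Ncs1-M1: only 4 of 8 positions match
```

Returning `None` below the threshold keeps a weak match from being reported as a prediction.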
Prediction of substrates and possible products for the S. coelicolor cryptic NRPS Challis and Ravel FEMS Microbiol. Lett. (2000) 187,
Part 2: Modular polyketide synthase analysis
Recap of modular PKS organisation and function: the erythromycin synthase as an example
Three large modular enzymes (DEBS 1-3), encoded by eryAI, eryAII and eryAIII, assemble 6-deoxyerythronolide B (6-DEB)
Each module performs one chain extension
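Because each module performs one chain extension, the reductive domains in a module indicate the oxidation state of the β-carbon it leaves behind. The sketch below illustrates that logic; the DEBS domain compositions are taken from the published literature rather than from the slide text, the inactive KR of module 3 is simply omitted, and the name `beta_carbon_state` is hypothetical.

```python
# Active domains of the six DEBS extension modules (loading module omitted).
DEBS_MODULES = [
    ("module 1", {"KS", "AT", "KR", "ACP"}),
    ("module 2", {"KS", "AT", "KR", "ACP"}),
    ("module 3", {"KS", "AT", "ACP"}),  # KR present but inactive, so left out
    ("module 4", {"KS", "AT", "DH", "ER", "KR", "ACP"}),
    ("module 5", {"KS", "AT", "KR", "ACP"}),
    ("module 6", {"KS", "AT", "KR", "ACP", "TE"}),
]


def beta_carbon_state(domains):
    """Predict the beta-carbon functionality left by one extension module."""
    if "KR" not in domains:
        return "keto"
    if "DH" not in domains:
        return "hydroxyl"
    if "ER" not in domains:
        return "alkene"
    return "methylene"


debs_states = [beta_carbon_state(domains) for _, domains in DEBS_MODULES]
for (name, _), state in zip(DEBS_MODULES, debs_states):
    print(f"{name}: {state}")
```

The six modules give six extensions, with a keto group from module 3 and a fully reduced methylene from module 4, consistent with the structure of 6-DEB.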
A new S. coelicolor modular PKS cluster
Genes encoding a modular PKS
Prediction of domains and modules in CpkA
Conserved Domain (CD) search
Prediction of domains and modules in CpkB
Prediction of domains and modules in CpkC
Prediction of domains and modules in CpkABC Pawlik, Kotowska, Chater, Kuczek and Takano Arch. Microbiol. (2007) 187, 87-99
Prediction of AT domain substrate selectivity
Haydock et al. FEBS Lett. (1995) 374,
Banskota et al. J. Antibiot. (2006) 59,
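A motif lookup is one simple way to turn this kind of sequence correlation into a prediction. The sketch below assumes the widely cited HAFH (malonyl-CoA) and YASH (methylmalonyl-CoA) active-site motifs, which may not be the exact motifs shown on the slide. The two test sequences are made-up strings for illustration, not real AT domains, and `predict_at_substrate` is a hypothetical name.

```python
# Assumed diagnostic motifs; other extender units are not covered.
AT_MOTIFS = {
    "HAFH": "malonyl-CoA",
    "YASH": "methylmalonyl-CoA",
}


def predict_at_substrate(at_sequence: str):
    """Return the predicted extender unit, or None if the call is ambiguous."""
    sequence = at_sequence.upper()
    hits = {substrate for motif, substrate in AT_MOTIFS.items() if motif in sequence}
    return hits.pop() if len(hits) == 1 else None


# Hypothetical toy fragments, invented for this example only.
toy_malonyl_at = "GHSQGEIAAAVVAGALSLHAFHSPLMEPML"
toy_methylmalonyl_at = "GHSQGEIAAAVVAGALSLYASHSPLMEPML"

print(predict_at_substrate(toy_malonyl_at))
print(predict_at_substrate(toy_methylmalonyl_at))
print(predict_at_substrate("GHSQGEIAAAVVAG"))  # no motif: no prediction
```

A plain substring search can match by chance elsewhere in a long sequence, so in practice the check would be restricted to the aligned active-site region.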
Prediction of KR domain stereoselectivity
Caffrey ChemBioChem (2003) 4,
Reid et al. Biochemistry (2003) 42, 72-79
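The sequence-based classification of KR domains can be sketched in the same way. The example assumes the commonly used rule that an LDD motif in the relevant loop marks a B-type KR and its absence an A-type KR. It also assumes the caller has already cut out the aligned motif region, since the alignment positions are not given here. The input fragments are made-up strings and `classify_kr` is a hypothetical name.

```python
def classify_kr(motif_region: str) -> str:
    """Classify a KR domain from its aligned motif region.

    Assumed rule: an LDD motif indicates a B-type KR; otherwise A-type.
    Inactive KRs and partial motifs are not handled by this sketch.
    """
    return "B-type" if "LDD" in motif_region.upper() else "A-type"


# Hypothetical motif-region fragments, invented for this example only.
print(classify_kr("HAAGLLDDGT"))  # B-type
print(classify_kr("HAAGVSAPGT"))  # A-type
```

The A-type or B-type call then determines which hydroxyl configuration is drawn at that position of the predicted product.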
Prediction of substrates and possible products for the S. coelicolor cryptic PKS
Non-linear enzymatic logic can complicate things! Haynes and Challis, Curr. Op. Drug Discov. Develop. (2007) 10,
Part 3: Analysis of other biosynthetic systems
Terpene synthases
Iterative polyketide synthases – type III PKSs
Conclusions
Reasonably confident in silico predictions of domain / module organisation and substrate specificity of modular PKS / NRPS can be made
Non-linear enzymatic logic can complicate the reliable prediction of product structure(s)
For other types of biosynthetic system, reasonably confident predictions of substrate specificity can sometimes be made
Prediction of chain length and substrate specificity in some iterative PKS systems, especially type III and fungal type I, remains difficult