DISCOVERY OF EXTREMOZYMES IN METAGENOMIC SEQUENCES RITA DESAI School of Informatics, IUB Capstone Presentation, May 22, 2009 Advisors : Yuzhen Ye and Sun Kim
Overview: Extremozymes & Metagenomics; Challenges and bottlenecks; Flow chart of the computational tool; Results so far; Future prospects
Why Extremozymes? What are extremozymes? – Enzymes isolated from organisms inhabiting unconventional ecosystems (Biotechnology (N Y). 1995 Jul;13(7):662-8). Extremozymes expand the limits of biocatalysis – The information acquired from the study of extremozymes makes it possible to modify enzymes to improve their ranges of stability and activity (for industrial and medical applications). However, organisms living in extreme environments are hard to culture, so they are less well studied than other organisms. – The vast majority of microbes are uncultured (>99% of soil organisms; >50% in the human gut; >99.9% in seawater), so culture-based methods cannot capture the community as a whole. Metagenomics enables sequencing of an entire microbial community without the need to culture its members.
Why Metagenomics? Metagenomics as concept and tool: the genomic analysis of an assemblage of organisms. “meta” = Greek for transcending; more comprehensive. Metagenomics constitutes a challenging domain in which to discover new enzymes from diverse niches. Two early metagenomic projects: the acid mine drainage project and the Sargasso Sea metagenomic survey.
Bottlenecks and Challenges Bottlenecks – No robust metagenomic screening methods (experimental methods) exist to directly retrieve enzymes of interest. Low biomass yields and low cell numbers hinder cloning. Experimental methods are also laborious and expensive. Challenges – Over 3.3 million non-redundant protein sequences (up to 40% being hypothetical) have so far been predicted and deposited in electronic databases, and only 8% correspond to extremophiles. Many more enzymes still need to be discovered.
Goals The objective is to discover novel extremozymes in metagenomic sequences that may exhibit unique sequence and structural features.
Research Design Collecting known extremozymes from the literature – Ref: Ferrer M., Golyshina O., Beloqui A., Golyshin P. Mining enzymes from extreme environments (2007). Current Opinion in Microbiology 10:207-214. Homolog search of extremozymes – Search against the IMG and IMG/M databases by BLAST – Multiple sequence alignment using the ClustalW tool – Hits retained with an E-value cutoff smaller than 10^-20. Molecular modeling – Homology modeling using Modeller to predict 3D structures
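The E-value cutoff in the homolog search step can be illustrated with a minimal sketch. It assumes the BLAST results were saved in the standard 12-column tabular format (`-outfmt 6`), which the slides do not state; the function name `filter_hits` and the example rows are made up for illustration and are not the project's actual data.

```python
import csv
import io

# Cutoff taken from the slide: keep hits with E-value smaller than 10^-20.
EVALUE_CUTOFF = 1e-20


def filter_hits(tabular_text, cutoff=EVALUE_CUTOFF):
    """Return (subject_id, percent_identity, evalue) for hits below the cutoff.

    Assumes the default BLAST tabular column order:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    hits = []
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        subject_id = row[1]
        percent_identity = float(row[2])
        evalue = float(row[10])
        if evalue < cutoff:
            hits.append((subject_id, percent_identity, evalue))
    return hits


# Hypothetical rows, for illustration only.
EXAMPLE = "\n".join([
    "query1\thitA\t92.0\t300\t24\t0\t1\t300\t1\t300\t1e-150\t540",
    "query1\thitB\t33.0\t280\t180\t5\t10\t290\t5\t284\t3e-25\t110",
    "query1\thitC\t28.0\t120\t86\t3\t40\t160\t30\t150\t2e-05\t45",
])

kept = filter_hits(EXAMPLE)
for subject_id, percent_identity, evalue in kept:
    print(subject_id, percent_identity, evalue)
```

In this toy input, only the first two hits pass the cutoff; the third has an E-value of 2e-05 and is dropped.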
Examples Of Extremozymes Identified In Metagenomic Sequences
Enzyme | Sample and ID | Identity (%)
Esterase | Acid mine drainage (2001201141) | 92
Esterase | Soil (2001288613) | 33
Esterase | Human gut (2004033831) | 34
Catalase | Sludge (2000613520) | 83
Catalase | Whalefall (2001431860) | 78
Isocitrate dehydrogenase | Whalefall (2001496341) | 73
Isocitrate dehydrogenase | Sludge (2000495850) | 67
Threonine dehydrogenase | Sludge (2000145560) | 44
Threonine dehydrogenase | Uranium (2007096904) | 43
Esterase Esterases belong to various classes: family II with motif GDSL, serine hydrolases with motif GXSXG, and family VIII with motif SXXK. Serine hydrolase family II: characteristic motif Gly-X-Ser-X-Gly, catalytic triad Ser-119, Asp-248 and His-276. Ref: Golyshina O.V., Golyshin P., Timmis K., Ferrer M. The 'pH optimum anomaly' of intracellular enzymes of Ferroplasma acidiphilum (2006). Environmental Microbiology 8(3):416-425
Structural Modeling The structures were modeled with Modeller using a known structure (PDB ID 1EVQ, 39% identity) as the template; the active-site residues (Ser156, Asp251, His281) are conserved. Figure panels: predicted esterase from the extremophile Ferroplasma acidiphilum; predicted esterase from AMD (ID: 2001201141) with 92% identity.
Motif Analysis The motif Gly-X-Ser-X-Gly, characteristic of the serine hydrolase family, was found to be conserved in the esterase homologs.
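A motif scan of this kind can be sketched with a regular expression. This is an illustrative example only, not the tool used in the project; the function name `find_gxsxg` and the toy sequence are assumptions, and X is taken to mean any single residue.

```python
import re

# G-X-S-X-G, where X is any residue. The lookahead lets overlapping
# occurrences be reported as well.
GXSXG = re.compile(r"(?=(G.S.G))")


def find_gxsxg(sequence):
    """Return (1-based start position, matched pentapeptide) for each G-X-S-X-G hit."""
    return [(m.start() + 1, m.group(1)) for m in GXSXG.finditer(sequence.upper())]


# Toy sequence, not a real esterase.
toy = "MKTLGDSAGHWGASAGL"
print(find_gxsxg(toy))
```

For the toy sequence, the scan reports the motif at positions 5 (GDSAG) and 12 (GASAG).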
Catalase Examples of catalase homologs discovered in metagenomic sequences – Sludge/US phrap community (ID 2000613520), with 83% sequence identity – Whalefall (ID 2001431860), with 78% sequence identity to the enzyme. Using the structure (PDB ID: 2ISA) as the template, 3D models were built for the catalase homologs discovered in various metagenomic communities. Ref: Lorentzen E., Moe H., Willassen N. Cold adapted features of Vibrio salmonicida catalase: characterisation and comparison to the mesophilic counterpart from Proteus mirabilis (2006). Extremophiles 10:427-440
Structural Models of Predicted Catalases a) The experimental structure of catalase (PDB ID: 2ISA) used as the template b) Predicted structure of a homolog identified in sludge phrap assembly (ID: 2000613520) with 83% sequence identity. c) Predicted structure of a homolog from whale fall sample (ID: 2001431860) with 78% sequence identity.
Isocitrate Dehydrogenase Isocitrate dehydrogenase structure (PDB ID: 1J1W). Ref: Maki M., Takada Y. Two isocitrate dehydrogenases from a psychrophilic bacterium, Colwellia psychrerythraea (2006). Extremophiles 10:237-249. Predicted isocitrate dehydrogenase structure from sludge phrap assembly (ID 2000231240) with 65% identity
Alpha Glucosidase A membrane-bound alpha-glucosidase (531 amino acids) isolated from the extremophile Ferroplasma acidiphilum was used as the query in the homolog search. In typical glucosidases, the carboxylic side chains of glutamic and aspartic acids are involved in catalysis, but this novel glucosidase from an extremophile has a catalytic center involving threonine-212 and histidine-390. Identified homologs include a protein (ID: 638394706) from Ferroplasma acidarmanus, which has 99% sequence identity. Ref: Ferrer M., Golyshina O., Plou F., Timmis K., Golyshin P. A novel alpha-glucosidase from the acidophilic archaeon Ferroplasma acidiphilum strain Y with high transglycosylation activity and an unusual catalytic nucleophile (2005). Biochemical Journal 391:269-276.
Multiple Sequence Alignment of Alpha Glucosidase and Homologs Histidine-390 was found to be conserved in two homologs. Threonine-212 was found to be conserved in about five homologs.
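Checking whether a catalytic residue is conserved requires mapping its number in the query sequence to a column of the gapped alignment. The sketch below shows one way to do that, assuming the alignment has already been loaded as a dictionary of equal-length gapped strings; the function names and the toy alignment are hypothetical and unrelated to the real alpha-glucosidase data.

```python
def column_for_residue(aligned_query, position):
    """Map a 1-based residue number in the ungapped query to a 0-based alignment column."""
    seen = 0
    for column, char in enumerate(aligned_query):
        if char != "-":
            seen += 1
            if seen == position:
                return column
    raise ValueError("position lies beyond the end of the query sequence")


def conserved_in(alignment, query_id, position):
    """Return the query residue at `position` and the ids of sequences sharing it."""
    column = column_for_residue(alignment[query_id], position)
    residue = alignment[query_id][column]
    matches = [
        name
        for name, seq in alignment.items()
        if name != query_id and seq[column] == residue
    ]
    return residue, matches


# Toy alignment (gaps shown as '-'), for illustration only.
toy_alignment = {
    "query":    "MK-THLA",
    "homolog1": "MKATH-A",
    "homolog2": "MR-SHLA",
    "homolog3": "MK-TQLA",
}

print(conserved_in(toy_alignment, "query", 3))
print(conserved_in(toy_alignment, "query", 4))
```

Counting ungapped residues in the query, rather than indexing the aligned string directly, keeps the residue numbering correct when the alignment introduces gaps before the site of interest.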
A Web Resource for Extremozymes and their Homologs We created a MySQL database to store the homologs of the extremozymes and the analysis results, and implemented an online search tool. Two tables were created: one for extremozymes and one for homologs.
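A two-table layout of this kind might look like the following sketch. The slides do not give the actual schema, so the table and column names here are assumptions, and Python's built-in `sqlite3` module stands in for MySQL so that the example runs without a database server. The two catalase rows are taken from the examples table in this presentation.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL database.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: one table for extremozymes, one for their homologs.
conn.executescript("""
CREATE TABLE extremozyme (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE homolog (
    id               INTEGER PRIMARY KEY,
    extremozyme_id   INTEGER NOT NULL REFERENCES extremozyme(id),
    img_gene_id      TEXT NOT NULL,
    sample           TEXT,
    percent_identity REAL
);
""")

conn.execute("INSERT INTO extremozyme (id, name) VALUES (?, ?)", (1, "Catalase"))
conn.executemany(
    "INSERT INTO homolog (extremozyme_id, img_gene_id, sample, percent_identity) "
    "VALUES (?, ?, ?, ?)",
    [(1, "2000613520", "Sludge", 83.0), (1, "2001431860", "Whalefall", 78.0)],
)


def homologs_of(connection, enzyme_name):
    """Return (sample, gene id, percent identity) rows for one enzyme, best match first."""
    cursor = connection.execute(
        "SELECT h.sample, h.img_gene_id, h.percent_identity "
        "FROM homolog AS h JOIN extremozyme AS e ON e.id = h.extremozyme_id "
        "WHERE e.name = ? ORDER BY h.percent_identity DESC",
        (enzyme_name,),
    )
    return cursor.fetchall()


print(homologs_of(conn, "Catalase"))
```

Keeping homologs in their own table with a foreign key to the extremozyme lets one enzyme have any number of homologs, which matches the one-to-many relationship shown in the examples table.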
Conclusion We predicted 3D structures and active sites for extremozyme homologs identified in metagenomic sequences. A web resource was set up to store the data for extremozymes and their homologs.
Future prospects Intensive study and discovery of various enzymes (not limited to extremozymes) in metagenomic sequences. Explore other sequence-based approaches for active site prediction and implement an online tool. Study structure-function relations and domains using predicted 3D models.
Acknowledgements Thanks to: Primary Advisor, Dr. Yuzhen Ye; Co-advisor, Dr. Sun Kim; Prof. Adrian German, CS department; Kwangmin Choi; Linda Hostetter, for her support throughout; Rachel Lawmaster; Bioinformatics Faculty and Staff, School of Informatics. Thank You.