Structural Genomics of Pathogenic Protozoa Christopher Mehlin Protein Production and Crystallization Workshop 2004


1 Structural Genomics of Pathogenic Protozoa Christopher Mehlin cmehlin@u.washington.edu Protein Production and Crystallization Workshop 2004 WWW.SGPP.ORG

2 The SGPP is focused on protozoa which cause human disease
Malaria – Plasmodium falciparum, P. vivax
Leishmaniasis – Leishmania major + 8 others
African sleeping sickness – Trypanosoma brucei
Chagas’ disease – Trypanosoma cruzi
These diseases afflict ~500 million people per year; roughly half the world’s population is at risk.

3 These targets are challenging!
Eukaryotic organisms
Leishmania
–Only L. major sequence is known (more coming…)
Plasmodium falciparum
–80% AT-rich genome
–Requires cDNA – intron prediction difficult
–Floppy loops, e.g. CDK-2 has 83 asparagines in a row

4 Primers-to-Protein: Normally ~5% Overall Yield
Data from 1318 L. major and 368 P. falciparum targets
L. major: 5.2%
P. falciparum: 4.9%
>85% of our effort is put into cloning, screening, and expressing this 5%

5 Protein Variants Increase the Odds
Multiple species variants
–Especially Leishmania
“Chunking”
–Computational domain prediction
–Random truncation

6 Primers designed for L. major can fish out homologues from other species
Species shown: L. major, L. aethiopica, L. infantum, L. donovani, L. tropica, L. mexicana, L. guyanensis, L. naiffi, L. braziliensis, L. tarentolae, E. scheideri
Homology (figure scale): 97% to 60%
(Figure label: Human pathogens)

7 Primers designed for L. major can fish out homologues from other species
Species shown: L. major, L. aethiopica, L. infantum, L. donovani, L. tropica, L. mexicana, L. guyanensis, L. naiffi, L. braziliensis, L. tarentolae, E. scheideri
Homology (figure scale): 97% to 60%
PCR success using L. major primers (figure scale): 83% to 10%

8 Multiple species targeted with a list of 40 high-value targets (enzymes with known inhibitors)
Proteins obtained per organism: P. falciparum 4, L. major 4
(Chart: organism versus target number, 1 to 7; annotation: HOMOLOGUES)
Two species gave us eight proteins and 7/40 (18%) of the targets.

9 Multiple species targeted with a list of 40 high-value targets (enzymes with known inhibitors)
Proteins obtained per organism: P. falciparum 4, L. major 4, L. infantum 3
(Chart: organism versus target number, 1 to 10; annotations: 95% IDENTICAL, No overlap!)
Small changes in sequence make an enormous difference in the behavior of the protein.

10 Multiple species targeted with a list of 40 high-value targets (enzymes with known inhibitors)
Proteins obtained per organism: P. falciparum 4, L. major 4, L. infantum 3, L. mexicana 3, L. guyanensis 2, L. tarentolae 1, L. braziliensis 2
(Chart: organism versus target number, 1 to 14)
TOTAL: 19 proteins, 14 of 40 (35%) of targets
10 targets would not have been obtained otherwise

11 Multiple species variants help crystallization, too!
Residues 1-60
Lmaj001686 MSRLMPHYSKGKTAFLCVDLQEAFSKRIENFANCVFVANRLARLHELVPENTKYIVTEHY
Ldon001686 MSRLMPHYSKGKTAFLCVDLQEAFSKRIENFANCVFVANRLARLHEVVPENTKYIVTEHY
Residues 61-120
Lmaj001686 PKGLGRIVPGITLPQTAHLIEKTRFSCIVPQVEELLEDVDNAVVFGIEGHACILQTVADL
Ldon001686 PKGLGRIVPEITLPKTAHLIEKTRFSCVVPQVEELLEDVDNAVVFGIEGHACILQTVADL
Residues 121-180
Lmaj001686 LDMNERVFLPKDGLGSQKKTDFKAAMKLMGSWSPNCEITTSESILLQMTKDAMDPDFKKI
Ldon001686 LDMNKRVFLPKDGLGSQKKTDFKAAIKLMSSWGPNCEITTSESILLQMTKDAMDPNFKRI
Residues 181-193
Lmaj001686 SKLLKEEPPIPL.
Ldon001686 SKLLKEEPPIPL.
95% IDENTITY
Lmaj001686AAA: nice crystals, no diffraction
Ldon001686AAA: “huge” crystals, 2.7 Å diffraction
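The identity figure on this slide can be checked directly from the two aligned sequences. The following is a minimal sketch, assuming the sequences are already aligned position by position with no gaps (as they appear on the slide) and that the trailing "." is excluded from the comparison; the function name `percent_identity` is illustrative, not part of any SGPP tooling.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions carrying the same residue (no gaps assumed)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Sequences transcribed from the slide, trailing "." omitted.
LMAJ = (
    "MSRLMPHYSKGKTAFLCVDLQEAFSKRIENFANCVFVANRLARLHELVPENTKYIVTEHY"
    "PKGLGRIVPGITLPQTAHLIEKTRFSCIVPQVEELLEDVDNAVVFGIEGHACILQTVADL"
    "LDMNERVFLPKDGLGSQKKTDFKAAMKLMGSWSPNCEITTSESILLQMTKDAMDPDFKKI"
    "SKLLKEEPPIPL"
)
LDON = (
    "MSRLMPHYSKGKTAFLCVDLQEAFSKRIENFANCVFVANRLARLHEVVPENTKYIVTEHY"
    "PKGLGRIVPEITLPKTAHLIEKTRFSCVVPQVEELLEDVDNAVVFGIEGHACILQTVADL"
    "LDMNKRVFLPKDGLGSQKKTDFKAAIKLMSSWGPNCEITTSESILLQMTKDAMDPNFKRI"
    "SKLLKEEPPIPL"
)

# 1-based positions where the two species differ, with both residues.
mismatches = [
    (pos, a, b)
    for pos, (a, b) in enumerate(zip(LMAJ, LDON), start=1)
    if a != b
]

print(f"{percent_identity(LMAJ, LDON):.1f}% identity over {len(LMAJ)} residues")
for pos, a, b in mismatches:
    print(f"  position {pos}: {a} -> {b}")
```

Listing the substitutions alongside the percentage makes the slide's point concrete: a handful of point differences separates the construct that gave no diffraction from the one that diffracted to 2.7 Å.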

12 The concept of chunking…
Consider a 3-domain protein: standard chunks would be the entire protein, each individual domain, and any contiguous series of domains. A 3-domain protein therefore becomes 6 chunks; in general, N domains give N(N+1)/2 chunks.
(Figure rows: Full length; Adjacent domains; Single domains)
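The chunk count on this slide is just the number of contiguous runs of domains. As a small sketch, assuming domains are represented by placeholder labels (the names `D1`, `D2`, `D3` and the function `contiguous_chunks` are illustrative only):

```python
def contiguous_chunks(domains):
    """Return every contiguous run of domains, from single domains to full length."""
    n = len(domains)
    return [
        tuple(domains[start:end])
        for start in range(n)
        for end in range(start + 1, n + 1)
    ]


chunks = contiguous_chunks(["D1", "D2", "D3"])
for chunk in chunks:
    print("-".join(chunk))

# The count agrees with N(N+1)/2 for N = 3.
n = 3
print(len(chunks), n * (n + 1) // 2)
```

For three domains this yields three single domains, two adjacent pairs, and the full-length protein.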

13 Domain Parsing using GINZU
Step 1: PSI-BLAST against the PDB
Step 2: Use consensus fold recognition methods to find remote PDB matches
Step 3: Search PFAM database for preassigned modular “chunks”
Step 4: Identify new modular “chunk” regions in multiple sequence alignment (MSA)
Step 5: Identify parse points in Rosetta structure predictions
Final Step: Select cut points in linker regions using assigned boundaries and coil predictions
(Figure: target sequence with a confidence track, annotated step by step with PDB, fold recognition, Pfam, MSA, and Rosetta regions, ending in chunk generation)
David Kim, UW

14 Pfal006650AAA Example - tRNA Synthetase
PFAM, PDB, and MSA coverage
Ginzu Parse Results w/ Multiple Sequence Alignment (PSI-BLAST against the non-redundant (NR) sequence database)
Ginzu Domains
1. No assignment but still based on MSA (remaining region)
2. PFAM hit to PF01411 tRNA synthetases class II (A)
3. PDB hit to 1nyqA (Threonyl-tRNA Synthetase)
4. MSA based assignment
(Figure legend: PFAM, PDB, MSA, Remaining Region)
David Kim, UW

15 CHUNKING L. major PROTEINS (GINZU)
71 ORFs
205 chunks (not counting full length)
5 ORFs solubly expressed (7%)
15 chunks solubly expressed (7%)
11 ORFs had 1 soluble chunk
2 ORFs had 2 chunks soluble
2/16 chunks of soluble ORFs soluble (both of the same ORF)
1 chunk of non-crystallizing, soluble ORF crystallized
12/66 inaccessible proteins have had at least one soluble chunk (18%)
17/71 proteins accessible via this technique (24%)
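The counts on this slide can be cross-checked with simple arithmetic. The sketch below uses only the numbers given on the slide; the one interpretive assumption is that the two soluble chunks "of the same ORF" belong to an ORF that was already soluble at full length, so that ORF is not counted among the newly accessible proteins. Variable names are illustrative.

```python
total_orfs = 71
total_chunks = 205            # not counting full-length constructs
soluble_full_length = 5       # ORFs soluble without chunking
orfs_with_one_chunk = 11      # ORFs with exactly one soluble chunk
orfs_with_two_chunks = 2      # ORFs with two soluble chunks

soluble_chunks = orfs_with_one_chunk * 1 + orfs_with_two_chunks * 2
orfs_with_soluble_chunk = orfs_with_one_chunk + orfs_with_two_chunks

# Assumption: one of these ORFs was already soluble as full length
# ("2/16 chunks of soluble ORFs soluble, both of the same ORF").
already_soluble_with_chunks = 1
rescued = orfs_with_soluble_chunk - already_soluble_with_chunks

inaccessible = total_orfs - soluble_full_length
accessible = soluble_full_length + rescued

print(f"soluble chunks: {soluble_chunks}/{total_chunks} "
      f"({100 * soluble_chunks / total_chunks:.0f}%)")
print(f"rescued by chunking: {rescued}/{inaccessible} "
      f"({100 * rescued / inaccessible:.0f}%)")
print(f"accessible overall: {accessible}/{total_orfs} "
      f"({100 * accessible / total_orfs:.0f}%)")
```

Under that reading the slide's figures are mutually consistent: chunking raises the accessible fraction from 5 of 71 ORFs to 17 of 71.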

16 Superchunking: for high-value targets
Step 1: Determine functional domain of protein by comparison to known protein
Step 2: Determine 10 truncation sites on each side of functional domain; make 20 primers
Step 3: Run 10x10=100 PCRs, clone products, screen for soluble expression, crystallizability
(Figure: functional domain flanked by truncation sites)
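The primer and PCR counts in the superchunking steps follow from pairing every N-terminal truncation site with every C-terminal one. Here is a minimal sketch of that bookkeeping; the residue positions are made-up placeholders (the slides do not give the actual sites), and `superchunk_constructs` is a hypothetical helper name.

```python
from itertools import product


def superchunk_constructs(n_term_sites, c_term_sites):
    """Pair each N-terminal start with each C-terminal end (1-based, inclusive)."""
    return [
        (start, end)
        for start, end in product(n_term_sites, c_term_sites)
        if start < end
    ]


# Hypothetical truncation sites: 10 starts upstream of the functional
# domain and 10 ends downstream of it.
n_term_sites = list(range(1, 41, 4))      # 1, 5, ..., 37
c_term_sites = list(range(300, 340, 4))   # 300, 304, ..., 336

constructs = superchunk_constructs(n_term_sites, c_term_sites)
primers_needed = len(n_term_sites) + len(c_term_sites)

print(f"{primers_needed} primers give {len(constructs)} constructs")
```

The design choice is that cost grows with the sum of the site counts (one primer per site) while the number of variants grows with their product, which is why 20 primers are enough for 100 PCRs.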

17 Superchunking Thioredoxin Reductase from P. falciparum
► TR is a 60.7 kDa enzyme with a high degree of domain interaction
► 20 different soluble proteins from 90 cloned constructs
► PCR success 100% -- used template of full-length PCR
Erica Boni

18 Superchunking Thioredoxin Reductase
(Figure label: NATIVE)
Erica Boni

19 Superchunking Thioredoxin Reductase
(Figure labels: 18 off N-terminus & 8 off C-terminus; 16 off C-terminus; 7 off N-terminus)
Erica Boni

20 Conclusions: Relatively small changes in protein sequence can have dramatic effects on the behavior of proteins in expression and crystallization. Multiple species and chunking are two promising methods for obtaining protein variants.

21 Acknowledgements:
University of Washington
–Jamie Andreyka, Erica Boni, Tiffany Feist, Lutfiyah Haji, Colleen Liu, Natascha Mueller
–Fred Buckner, Mike Gelb, Wes Van Voorhis, Kevin Bauer
–David Baker, David Kim, Erkang Fan, Stan Fields Group
–Wim Hol and Hol group
Seattle Biomedical Research Institute
–Liz Worthey, Ellen Sisk, Peter Myler
Hauptman-Woodward Medical Research Institute
–George DeTitta, Joe Luft, Nancy Fehrman, Angela Lauricella et al.
Seattle Crystallization and Structure Determination Units
–Oleksandr Kalyuzhniy, Lori Anderson
–Ethan Merritt, Isolde Le Trong, Mark Robien
Collaborators:
–SSRL Stanford
–ALS Berkeley
NIH/NIGMS/NIAID

