Integration of Fast Data Collection and Automated Probabilistic Assignment for Protein NMR Spectroscopy Arash Bahrami.

1 Integration of Fast Data Collection and Automated Probabilistic Assignment for Protein NMR Spectroscopy Arash Bahrami

2 Protein Structure Determination by NMR
Steps: sample preparation; data collection; peak picking; backbone resonance assignment; sidechain resonance assignment; secondary structure determination; NOE data collection and assignment; structure calculation and refinement.
On average, one structure takes 1-4 months and costs $80k.
Automation in NMR:
1. Individual software packages have been developed for each step, but no integrated tool is available for the whole process.
2. Integration requires interaction between the individual components.
3. A probabilistic framework can provide robust interaction between the components.

3 Individual tools developed in CESG and NMRFAM
PISTACHIO (automated resonance assignment)
PECAN (secondary structure determination)
MANI-LACS (reference correction and outlier detection)
HIFI-NMR (fast and adaptive NMR data collection)
HIFI-C (adaptive determination of NMR couplings)
References:
Hamid R. Eghbalnia, Arash Bahrami, Liya Wang, Amir Assadi, and John L. Markley (2005) J. Biomol. NMR, 32(3):
Hamid R. Eghbalnia, Liya Wang, Arash Bahrami, Amir Assadi, and John L. Markley (2005) J. Biomol. NMR, 32(1):
Liya Wang, Hamid R. Eghbalnia, Arash Bahrami, and John L. Markley (2005) J. Biomol. NMR, 32(1):
Hamid R. Eghbalnia, Arash Bahrami, Marco Tonelli, Klaus Hallenga, and John L. Markley (2005) J. Am. Chem. Soc., 127(36) –
Gabriel Cornilescu, Arash Bahrami, Marco Tonelli, John L. Markley, and Hamid R. Eghbalnia (2007) J. Biomol. NMR, 38(4):

4 PISTACHIO
PISTACHIO is a probabilistic method for backbone and sidechain assignment. The input to PISTACHIO can be any subset of the following NMR experiments: HSQC, HNCO, CBCA(CO)NH, HN(CA)CB, C(CO)NH, HBHA(CO)NH, HN(CO)CA, HN(CA)CO, HN(CO)(CA)CB, H(CCO)NH, HCCH-TOCSY, HNCACB, HN(CO)CACB, HNCA.
Figure: overall view of the assignment probabilities.
Figure: native probabilistic PISTACHIO output (columns: Residue_Name, then repeated P(H,N), H, N, CO, CA, CB candidates, then P(no_assignment); example residues MET, ASN, THR, VAL, CYS).
Figure: the same assignments in NMR-STAR format.

5 PECAN
PECAN optimizes a combination of information sources to yield energetic descriptions of secondary structure and constructs a probabilistic description wherein each residue is assigned a probability of belonging to a designated state (e.g. helix, sheet, etc.). PECAN is available at:
Figure labels: Helix, Extended.

6 LACS
MANI-LACS (Linear Analysis of Chemical Shifts for reference correction and outlier detection) can detect potential outliers using linear analysis of chemical shifts. An outlier may be the result of a misassignment of chemical shifts. MANI-LACS reports probabilities for the presence of outliers. MANI-LACS is available at:
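
The idea of flagging outliers from a linear analysis can be illustrated with a generic sketch. This is not the MANI-LACS procedure: it assumes a plain least-squares line and a median-absolute-deviation cutoff, it returns flags rather than probabilities, and the names `fit_line`, `flag_outliers`, and the threshold `k` are hypothetical.

```python
from statistics import median


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def flag_outliers(xs, ys, k=3.0):
    """Return indices whose residual lies more than k robust sigmas from the median residual.

    Assumption: the robust sigma is 1.4826 * MAD, a common convention,
    not something taken from the MANI-LACS publication.
    """
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    centre = median(residuals)
    deviations = [abs(r - centre) for r in residuals]
    scale = 1.4826 * median(deviations)
    if scale == 0:
        return []  # no spread to compare against, so nothing is flagged
    return [i for i, d in enumerate(deviations) if d > k * scale]


# Made-up illustrative data: a straight line with one shifted point.
xs = list(range(10))
ys = [2 * x + 1 for x in xs]
ys[5] += 10
suspects = flag_outliers(xs, ys)
```

A flagged index would then be a candidate for a misassigned shift and worth checking by hand.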

7 HIFI-NMR: High-Resolution Iterative Frequency Identification for NMR
Tilted-plane reduced-dimensionality data collection that employs on-the-fly peak identification, spectral modeling, and selection of the next data plane to be collected.
Figure: 2D planes of a 3D CBCA(CO)NH experiment collected on an 800 MHz Varian Inova spectrometer.

8 Simplified Description of the HIFI-NMR Approach
1. Collect the orthogonal planes (0° and 90°) and start from the predicted chemical shift distribution.
2. Assign a probability, p, of a peak being in a given voxel (shown as a probability color map).
3. Find a tilt angle that maximizes a dispersion function f(p). The dispersion function measures the dispersion of the putative peaks on the selected tilted plane.
4. Collect the tilted plane at X°.
5. Has the last tilted plane added new information? If yes, return to step 3 and collect another tilted plane; if no, report the peak list.
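
The tilt-angle selection step can be sketched in a few lines. The slide does not define the dispersion function, so the score used here (each putative peak's probability times the distance to its nearest neighbour after projection) is an assumption chosen for illustration, as are the names `project`, `dispersion`, and `best_tilt_angle` and the 5° angle grid. Peak positions are taken as (x, y) coordinates in the two indirect dimensions, already in comparable units.

```python
import math


def project(peaks, angle_deg):
    """Project (x, y) indirect-dimension positions onto a plane tilted by angle_deg."""
    t = math.radians(angle_deg)
    return [x * math.cos(t) + y * math.sin(t) for x, y in peaks]


def dispersion(peaks, probs, angle_deg):
    """Hypothetical dispersion score (needs at least two peaks).

    Each peak contributes its probability times the distance to the
    nearest other projected peak, so overlapped peaks score low.
    """
    pos = project(peaks, angle_deg)
    total = 0.0
    for i, xi in enumerate(pos):
        nearest = min(abs(xi - xj) for j, xj in enumerate(pos) if j != i)
        total += probs[i] * nearest
    return total


def best_tilt_angle(peaks, probs, candidates=range(0, 91, 5)):
    """Return the candidate angle (degrees) with the highest dispersion score."""
    return max(candidates, key=lambda a: dispersion(peaks, probs, a))


# Made-up putative peaks that differ only along x are best resolved at 0 degrees.
angle = best_tilt_angle([(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)], [1.0, 1.0, 1.0])
```

In an adaptive scheme this selection would be repeated after every collected plane, with the probabilities updated from the new data.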

9 HIFI application to automated backbone assignments
Protein | Size | HIFI data collection time | PINE assignment time | Assignment accuracy
WT Brazzein | 53 a.a. | 12 h | 5 m | 98%
Ubiquitin | 76 a.a. | 14 h | 5 m | 98%
Flavodoxin | 176 a.a. | 48 h | 2 h | 85%

10 HIFI–C: A Fast and Robust Method for Determining NMR Couplings from Adaptive 3D to 2D Projections Correlation and RMSD comparison of couplings collected by HIFI-C and 3D. Agreement between the two was within experimental error. (A) GB3 protein (R = 99.8%, rmsd = 0.03 Hz). The total data collection times were 1.7 h for HIFI-C and 7.9 h for 3D. (B) PRP24-12 protein (R = 94.0%, rmsd = 0.25 Hz). The total data collection times were 14.6 h for HIFI-C and 44.1 h for 3D.
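
The two agreement measures quoted above, the correlation coefficient R and the RMSD, can be computed as in the following sketch. The function names and the sample values are hypothetical; the numbers are made up for illustration and are not the GB3 or PRP24-12 data.

```python
import math


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(a) != len(b):
        raise ValueError("lists must have the same length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def rmsd(a, b):
    """Root-mean-square deviation between paired values (same units as the input)."""
    if len(a) != len(b):
        raise ValueError("lists must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Made-up couplings in Hz: one list per collection method, paired by residue.
fast_method = [92.1, 93.4, 91.8, 94.0]
reference_3d = [92.0, 93.5, 91.9, 94.1]
r = pearson_r(fast_method, reference_3d)
deviation_hz = rmsd(fast_method, reference_3d)
```

R close to 1 together with an RMSD below the experimental error is what the slide summarizes as agreement within experimental error.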

11 Back to Automation Steps in NMR Proteomics
HIFI-NMR, PISTACHIO, PECAN, MANI-LACS, HIFI-C

12 Redesign the Individual Tools to Provide Robust Probabilistic Interaction: PINE
Diagram labels: MANI-LACS, PISTACHIO, PECAN, PINE.

13 General Overview of Probabilistic Network Defined by PINE

14 Amino Acid Typing Network; Spin System Generation Network

15 Table 1. PINE performance results and comparison with PISTACHIO for proteins whose BMRB assignments are available.
Columns: Protein designator | Number of residues | PINE: CPU time (h), Assignment accuracy*, Secondary structure accuracy | PISTACHIO: CPU time (h), Assignment accuracy* | Experiments represented in the input peak lists‡
At2g %95%1 ** At1g %94%0.295%* At2g %92%0.198% AAH %97%0.290%*** At5g %90%588%*** At3g %90%6 ****** At3g %88%587%****** At5g %83%6 70%*** At3g16450† %NA773%******* BMRB %90%195%**
* The correct assignments are those of the final structure and assignments deposited in the PDB and BMRB.
† Stereo-array isotope labeled (SAIL) protein; isotope shifts due to labeling were not accounted for.
‡ Each data set included an HSQC or HNCO experiment; other experiments are indicated by numbers: 1 = CBCA(CO)NH or HN(CO)CACB; 2 = HNCACB; 3 = HNCA; 4 = HN(CO)CA or CA(CO)NH; 5 = HN(CA)CO; 6 = H(CCO)NH or N15 TOCSY; 7 = C(CO)NH; 8 = HBHA(CO)NH.

16 PINE Web Server

17 PINE Server Statistics Total Number of jobs submitted since July 2006: 1175 jobs

18 Iterative HCCH-TOCSY assignment
Experiments: HBHA(CO)NH, C(CO)NH, H(CCO)NH, HCCH-TOCSY

19 PINE, HIFI and Time Saving in NMR Proteomics
HIFI
Time saving: 12 hours to 2 days of data collection, versus 1 to 2 weeks with traditional methods.
Accuracy: 95%-100% of peaks recovered with high probability, depending on the size and complexity of the protein.
Main cause of possible inaccuracy: some peaks may have very low intensities (at the noise level). They will have lower probabilities in the final peak list.
What may need to be done manually: manual analysis may be needed to derive the remaining peaks from the lower-probability list.
PINE
Time saving: full assignment in 5 minutes to 2 hours, versus 1 week to 1 month of manual assignment.
Accuracy: 85%-100% correct assignments, depending on the size and complexity of the protein.
Main cause of possible inaccuracy: some of the real peaks are missing from the peak lists.
What may need to be done manually: the remaining peaks can easily be assigned manually by scanning the spectra.

20 Ongoing project: Integration of HIFI and PINE
HIFI-NMR: fast data collection and peak identification
PINE: referencing and outlier check (MANI-LACS), automated assignment (PISTACHIO), secondary structure determination (PECAN)

21 Probabilistic Analysis of Spectra in HIFI
(A) HNCA (HC plane), 512 zero filling, 0.15 delay in sine window function
(B) HNCA (HC plane), 1024 zero filling, 0.45 delay in sine window function
(C) Difference between spectra (A) and (B)
(D) Probabilistic peak list (columns: X, Y, Probability)
Probabilistic peak lists are generated for every plane based on different parameter settings and peak volumes.
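
One simple way to turn peak lists picked under different processing settings into a probabilistic peak list is sketched below. It is an assumption-laden illustration rather than the HIFI implementation: the probability is taken to be the fraction of settings in which a peak is found, peak volume is ignored, and the function name and the tolerances `tol_x` and `tol_y` are hypothetical.

```python
def probabilistic_peak_list(peak_lists, tol_x=0.05, tol_y=0.5):
    """Merge peak lists from several processing settings into (x, y, probability) rows.

    peak_lists: one list of (x, y) positions per parameter setting.
    A peak from a later setting is matched to an existing entry when both
    coordinates agree within the tolerances; the probability is the
    fraction of settings in which the peak was found.
    """
    merged = []  # each entry: [x, y, set of setting indices that found it]
    for k, peaks in enumerate(peak_lists):
        for x, y in peaks:
            for entry in merged:
                if abs(entry[0] - x) <= tol_x and abs(entry[1] - y) <= tol_y:
                    entry[2].add(k)
                    break
            else:
                merged.append([x, y, {k}])
    n = len(peak_lists)
    return [(x, y, len(found) / n) for x, y, found in merged]


# Made-up positions: the second peak is found under only one of two settings.
rows = probabilistic_peak_list([
    [(8.10, 120.0), (7.50, 115.0)],
    [(8.11, 120.1)],
])
```

A weak peak that appears under only some settings then ends up with a low probability instead of being discarded.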

22 On-the-Fly Spin System Generation in HIFI

29
1. Collect the N15-HSQC; the predicted chemical shift distribution is the starting input.
2. Collect the most sensitive orthogonal plane (0°).
3. Spectra analysis: generate the probabilistic peak list and derive the initial probabilistic spin systems.
4. Find the optimum experiment and tilt angle. The optimum is the plane that maximizes the information about the ambiguous or missing positions in the spin systems, given the latest state of the chemical shift assignment.
5. Collect the optimal tilted or orthogonal plane (X°).
6. Spectra analysis: generate the probabilistic peak list and update the probabilistic spin systems.
7. Is the spin system network quality good enough for the assignment process? If yes, run PINE to derive the latest assignment and secondary structure.
8. Are the assignment and secondary structure complete? If yes, report the final peak lists, chemical shift assignments, and secondary structure; if no, return to step 4.

30 HIFI-NMR: fast data collection and peak identification
PINE: referencing and outlier check (MANI-LACS), automated assignment (PISTACHIO), secondary structure determination (PECAN)
NOESY assignment

31 Acknowledgements
John Markley, Hamid Eghbalnia, Marco Tonelli
All CESG members providing data: Claudia Cornilescu, Shanteri Singh, Jikui Song, Brian Volkman, Francis Peterson, Ziqi Dai
Gabriel Cornilescu, Klaus Hallenga, Milo Westler, Liya Wang, Eldon Ulrich

