Automating Steps in Protein Structure Determination by NMR CS 296.4 April 13, 2009.


1 Automating Steps in Protein Structure Determination by NMR CS 296.4 April 13, 2009

2 Outline
Background
Steps in NMR protein structure determination
The ACE cycle (Assign-Calculate-Evaluate)
The assignment problem
Algorithms for automated NOE assignment
Semi-automated methods
More-automated methods
Conclusions

3 The Steps in Protein Structure Determination by NMR
1. Sample preparation
2. Data collection
3. Data evaluation
4. Structure calculation
5. Structure refinement
6. Structure deposition

4 The Steps in Protein Structure Determination by NMR
1. Sample preparation
   (a) protein selection
   (b) gene engineering
   (c) protein expression
   (d) protein purification
   (e) buffer optimization
   (f) isotope labeling
2. Data collection
3. Data evaluation
4. Structure calculation
5. Structure refinement
6. Structure deposition (and maybe write a paper and graduate)

5 The Steps in Protein Structure Determination by NMR
1. Sample preparation
   (a) protein selection
   (b) gene engineering
   (c) protein expression
   (d) protein purification
   (e) buffer optimization
   (f) isotope labeling
2. Data collection
   (a) HSQC
   (b) amide H/D exchange
   (c) triple-resonance
3. Data evaluation
4. Structure calculation
5. Structure refinement

6 The Steps in Protein Structure Determination by NMR
1. Sample preparation
   (a) protein selection
   (b) gene engineering
   (c) protein expression
   (d) protein purification
   (e) buffer optimization
   (f) isotope labeling
2. Data collection
   (a) HSQC
   (b) amide H/D exchange
   (c) triple-resonance
3. Data evaluation
   (a) spectrum calculation
   (b) peak picking

7 Automatable Steps in Protein Structure Determination by NMR
1. Sample preparation
2. Data collection
3. Data evaluation
4. Structure calculation
5. Structure refinement
6. Structure deposition

8 Fig. 2 (2003) Progress in NMR Spectroscopy, 43, 105, Guntert. The Assign Calculate Evaluate cycle in automated NOE assignment and structure calculation.

9 Automating NOE Assignments and THE Assignment Problem

10 Automating NOE Assignments and THE Assignment Problem
There are MANY assignment tasks
1. Resonance Assignment
2. NOE Assignment

11 Automating NOE Assignments and THE Assignment Problem
There are MANY assignment tasks
1. Resonance Assignment (interpreting data)
2. NOE Assignment (interpreting data)

12 Automating NOE Assignments and THE Assignment Problem
There are MANY assignment tasks
1. Resonance Assignment
2. NOE Assignment
and one major assignment problem: ambiguous assignments, due to the data collection problems of
1. Completeness
2. Uniqueness

13 Automating NOE Assignments and THE Assignment Problem
There are MANY assignment tasks
1. Resonance Assignment
2. NOE Assignment
and one major assignment problem: ambiguous assignments, due to the data collection problems of
1. Completeness (missing data points)
2. Uniqueness (unresolvable data points)

14 from Fig. 3 (2003) Progress in NMR Spectroscopy, 43, 105, Guntert. Unambiguously assigning a NOESY cross peak

15 Automated NMR Protein structure calculation
Peter Guntert (2003) Progress in NMR Spectroscopy, 43, 105-125
Algorithms for automated NOESY assignment
Semi-automated methods
1. ASsign NOEs (1993)
2. Structure Assisted NOE Evaluation (2001)

16 Automated NMR Protein structure calculation
Peter Guntert (2003) Progress in NMR Spectroscopy, 43, 105-125
Algorithms for automated NOESY assignment
Semi-automated methods
1. ASsign NOEs (1993)
2. Structure Assisted NOE Evaluation (2001)
More-automated methods
1. NOAH (1995)
2. Ambiguous Restraints for Iterative Assignment (1997)
3. AutoStructure (1999)
4. KNOWledge-based NOE assignments (2002)
5. CANDID (2002)

17 ASNO (1993) Guntert, Berndt, & Wuthrich
Input “data”
1. Protein’s amino acid sequence
2. Proton resonance assignments
3. NOESY cross peak list (pairs of chemical shift coordinates)
4. Set of estimated structures
User specifies
1. Chemical shift tolerance = max allowed chemical shift error
2. d_max = max interproton distance causing NOE
3. n_min = min # structures with d < d_max

18 ASNO (1993) Guntert, Berndt, & Wuthrich
Input “data”
1. Protein’s amino acid sequence
2. Proton resonance assignments
3. NOESY cross peak list (pairs of chemical shift coordinates)
4. Set of estimated structures
User specifies
1. Chemical shift tolerance = max allowed chemical shift error
2. d_max = max interproton distance causing NOE
3. n_min = min # structures with d < d_max
Algorithm steps
1. For each cross peak: find all possible assignments (¹H_j, ¹H_k)
2. For each (¹H_j, ¹H_k): n = # of structures with d < d_max
3. Prune all (¹H_j, ¹H_k) with n < n_min
User intervention
1. Manually check and refine NOE assignments (¹H_j, ¹H_k)
2. Refine set of structures and rerun algorithm
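The three algorithm steps above can be sketched in a few lines of Python. This is a minimal illustration, not the ASNO program itself: the function names, the dictionary layout for shifts and coordinates, and the toy numbers are all assumptions made for the example, and a single tolerance `tol` is assumed for both spectral dimensions.

```python
from itertools import product
from math import dist

def candidate_assignments(peak, shifts, tol):
    """Step 1: proton pairs whose assigned shifts match both peak coordinates within tol."""
    w1, w2 = peak
    first = [h for h, s in shifts.items() if abs(s - w1) <= tol]
    second = [h for h, s in shifts.items() if abs(s - w2) <= tol]
    return [(a, b) for a, b in product(first, second) if a != b]

def asno_filter(peak, shifts, structures, tol, d_max, n_min):
    """Steps 2 and 3: keep candidates closer than d_max in at least n_min structures."""
    kept = []
    for a, b in candidate_assignments(peak, shifts, tol):
        n = sum(1 for coords in structures if dist(coords[a], coords[b]) < d_max)
        if n >= n_min:
            kept.append((a, b))
    return kept

# Hypothetical toy data: proton name -> chemical shift (ppm)
shifts = {"HA_5": 4.20, "HB_9": 1.50, "HG_30": 1.51}
# Two estimated structures: proton name -> (x, y, z) in Angstrom
structures = [
    {"HA_5": (0.0, 0.0, 0.0), "HB_9": (3.0, 0.0, 0.0), "HG_30": (12.0, 0.0, 0.0)},
    {"HA_5": (0.0, 0.0, 0.0), "HB_9": (3.5, 0.0, 0.0), "HG_30": (11.0, 0.0, 0.0)},
]
peak = (4.21, 1.50)

print(candidate_assignments(peak, shifts, tol=0.02))
print(asno_filter(peak, shifts, structures, tol=0.02, d_max=5.0, n_min=2))
```

In the toy data the peak has two chemical-shift-compatible assignments, and the structure-based count removes the one whose protons are far apart in every structure.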

19 Fig. 1 (1993) J Biomol NMR, 3, 601, Guntert, Berndt, & Wuthrich. demo: Dendrotoxin K, 7kDa, 57AA, bbRMSD = 0.32Ang

20 SANE (2001) Duggan, Legge, Dyson, & Wright
Input “data”
1. Protein’s amino acid sequence
2. Proton resonance assignments
3. NOESY cross peak list (pairs of chemical shift coordinates)
User specifies Filters
1. Distance (set of estimated structures)
2. Chemical shift (max allowed error)
3. Secondary structure (unlikely NOE assignments)
4. Assignment (expected NOE assignments)
5. NOE contribution (same as in ARIA method)

21 SANE (2001) Duggan, Legge, Dyson, & Wright
Input “data”
1. Protein’s amino acid sequence
2. Proton resonance assignments
3. NOESY cross peak list (pairs of chemical shift coordinates)
User specifies Filters
1. Distance (set of estimated structures)
2. Chemical shift (max allowed error)
3. Secondary structure (unlikely NOE assignments)
4. Assignment (expected NOE assignments)
5. NOE contribution (same as in ARIA method)
Algorithm steps
1. For each cross peak: find all possible assignments (¹H_j, ¹H_k)
2. Apply five filters to prune list of (¹H_j, ¹H_k)
3. Write unique or ambiguous distance restraints, or violations
User intervention
1. Violation analysis
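The filter-then-classify flow of the algorithm steps above might look like the following sketch. It is not the SANE code: the filter constructors, the data layout, and the rule that a peak with no surviving candidate is reported as a violation are assumptions made for illustration, and only two of the five filters are shown.

```python
def make_distance_filter(mean_dist, d_max):
    """Hypothetical distance filter: keep pairs whose mean distance is at most d_max."""
    return lambda pair: mean_dist.get(pair, float("inf")) <= d_max

def make_exclusion_filter(unlikely):
    """Hypothetical secondary-structure filter: drop pairs flagged as unlikely."""
    return lambda pair: pair not in unlikely

def classify_peak(candidates, filters):
    """Apply every filter, then label the peak by how many candidates survive."""
    survivors = [c for c in candidates if all(f(c) for f in filters)]
    if not survivors:
        return "violation", survivors   # assumption: nothing fits, flag for the user
    if len(survivors) == 1:
        return "unique", survivors
    return "ambiguous", survivors

# Toy data (all values invented for the example)
candidates = [("HA_5", "HB_9"), ("HA_5", "HG_30"), ("HA_5", "HD_41")]
mean_dist = {("HA_5", "HB_9"): 3.2, ("HA_5", "HG_30"): 4.8, ("HA_5", "HD_41"): 14.0}
unlikely = {("HA_5", "HG_30")}

distance_only = [make_distance_filter(mean_dist, d_max=5.0)]
both = distance_only + [make_exclusion_filter(unlikely)]

print(classify_peak(candidates, distance_only))
print(classify_peak(candidates, both))
```

Writing each filter as an independent predicate keeps the user-specified filters easy to switch on and off, which matches the semi-automated spirit of the method.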

22 Fig. 1 (2001) J Biomol NMR, 19, 321, Duggan, et al. demo: LFA-1 I-domain, 21.3kDa, 183AA, bbRMSD = 0.29Ang

23 NOAH (1995) Mumenthaler & Braun
Input “data”
1. Protein’s amino acid sequence
2. Proton resonance assignments
3. NOESY cross peak list (pairs of chemical shift coordinates)
4. Scalar coupling constants (³J(HN,Hα))
Algorithm calculates
1. Distance constraints from NOE assignments
2. Angle constraints from scalar couplings

24 NOAH (1995) Mumenthaler & Braun
Input “data”
1. Protein’s amino acid sequence
2. Proton resonance assignments
3. NOESY cross peak list (pairs of chemical shift coordinates)
4. Scalar coupling constants (³J(HN,Hα))
Algorithm calculates
1. Distance constraints from NOE assignments
2. Angle constraints from scalar couplings
Algorithm uses
1. Structure-based filter (recognizes correct constraints)
2. Chemical shift limit (max allowed error)
3. Error-tolerant target function in DIAMOD (1994) (minimizes effect of incorrect distance constraints from incorrect NOE assignments)

25 Fig. 1 (1995) J Mol Biol, 254, 465, Mumenthaler & Braun demo: 3 proteins ranging from 57 to 74 residues

26 (1995) J Mol Biol, 254, 465, Mumenthaler & Braun. Test proteins: DEN = 57, TEN = 74, REP = 69 residues

27 ARIA (1997) Nilges, et al.
Input “data”
1. Protein’s amino acid sequence
2. Proton resonance assignments
3. NOESY cross peak list (pairs of chemical shift coordinates)
4. Assignment cutoff, p, decreases for each cycle
5. (opt) preliminary structures, manual assignments
6. (opt) RDCs, scalar couplings, dihedral angles, S-S or H-bonds
Algorithm calculates in each cycle
1. Unique and partial NOE assignments
2. Unique and ambiguous distance restraints
3. Merges distance restraints with other input data
4. Bundle of refined structures (typically 20)

28 ARIA (1997) Nilges, et al.
Ambiguous restraints
An NOE cross peak with more than one possible assignment is considered as a weighted composite of all of them. Ambiguous distance restraints are introduced to incorporate the distance d_k of each ambiguous NOE assignment.
To reduce the number of assignment possibilities, each relative contribution C_k is calculated from d_k and the average distance for all possible assignments, using the lowest-energy n of 20 conformers from the previous cycle. The largest C_k that add up to the cutoff value, p, for that cycle are kept; the rest are discarded.
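The pruning rule described above can be illustrated with a short sketch. It assumes the r^-6 weighting usually associated with ambiguous distance restraints, so that C_k is proportional to d_k^-6 and the combined restraint distance is (sum of d_k^-6)^(-1/6); the function names and the toy distances are made up for the example, and the slide does not spell out the exact formula.

```python
def contributions(distances):
    """Relative contribution C_k of each assignment, assuming C_k ~ d_k^-6."""
    weights = {k: d ** -6 for k, d in distances.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

def prune_assignments(distances, p):
    """Keep the largest contributions until their sum reaches the cutoff p."""
    contrib = contributions(distances)
    kept, running = [], 0.0
    for k, c in sorted(contrib.items(), key=lambda kv: kv[1], reverse=True):
        kept.append(k)
        running += c
        if running >= p:
            break
    return kept

def effective_distance(distances):
    """r^-6 summed distance for an ambiguous restraint over all given assignments."""
    return sum(d ** -6 for d in distances.values()) ** (-1 / 6)

# Toy example: average distances (Angstrom) for three possible assignments of one peak
distances = {"A": 3.0, "B": 4.0, "C": 8.0}

print(contributions(distances))
print(prune_assignments(distances, p=0.95))   # loose cutoff keeps more assignments
print(prune_assignments(distances, p=0.80))   # tighter cutoff keeps fewer
print(effective_distance(distances))
```

Because of the steep r^-6 dependence, the shortest distance dominates: lowering p from cycle to cycle quickly narrows an ambiguous peak down to its closest candidate pairs.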

29 Fig. 1 (1997) J Mol Biol, 269, 408, Nilges, et al. demo: β-spectrin PH domain, 106 residues

30 Table 1 (1997) J Mol Biol, 269, 408, Nilges, et al. β-spectrin PH domain, 106 residues. MAN: data derived from manual assignments. The 80ms and 30ms data differ only in mixing times.

31 AutoStructure (1999) Moseley & Montelione
Input “data”
1. Protein’s amino acid sequence
2. Proton resonance assignments
3. NOESY cross peak list (pairs of chemical shift coordinates)
4. Scalar couplings
5. Slow amide H/D exchange data
6. Preliminary structure
7. Preliminary H-bonded pairs
Algorithm calculates
1. Distance restraints
2. Dihedral angle restraints
3. H-bonding pairs
4. Refined structures

32 Fig. 1 (1999) Curr. Opin. Struct. Biol., 9, 635, Moseley & Montelione (& Y.J. Huang PhD thesis). Basic fibroblast growth factor (127 residues): (a) 10 NMR-derived structures; bbRMSD = 0.7 Ang. between (b) manual and AutoStructure-derived structures

33 KNOWNOE (2002) Gronwald, et al.
Input “data”
1. Protein’s amino acid sequence
2. Proton resonance assignments
3. NOESY cross peak list (pairs of chemical shift coordinates)
4. NOESY cross peak volume probability distribution
5. Preliminary structure
User specifies
1. Chemical shift tolerance = max allowed chemical shift error
2. Initial value of d_max = max interproton distance
3. Number, N, of current best structures

34 KNOWNOE (2002) Gronwald, et al.
Input “data”
1. Protein’s amino acid sequence
2. Proton resonance assignments
3. NOESY cross peak list (pairs of chemical shift coordinates)
4. NOESY cross peak volume probability distribution
5. Preliminary structure
User specifies
1. Chemical shift tolerance = max allowed chemical shift error
2. Initial value of d_max = max interproton distance
3. Number, N, of current best structures
Algorithm, working together with CNS, iteratively will
1. build A-list of uniquely assigned NOE cross peaks
2. calculate P(A_k,a | V_o) for all other peaks
3. add to A-list all peaks with P(A_k,a | V_o) > cutoff (0.8-0.9)
4. use current A-list to calculate N structures

35 KNOWNOE (2002) Gronwald, et al.
The problem of ambiguous assignments is addressed with a Bayesian algorithm based on NOE cross peak volume probability distributions derived from 326 spectra.
P(A_k,a | V_o) = probability that more than fraction a of cross peak volume V_o is due to assignment k.
If P(A_k,a | V_o) > cutoff value (typically 0.8 to 0.9), then consider that peak assigned to k for the next cycle.
These authors state that their algorithm is “Based on the observation that cross peak volume and correct cross peak assignment are not independent of each other”.
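The cutoff rule above can be sketched as a single A-list update step. The posterior values P(A_k,a | V_o) are taken as given inputs here, since the slide does not show how they are computed from the volume distributions; the function name, the dictionary layout, and the numbers are hypothetical.

```python
def update_a_list(a_list, posteriors, cutoff=0.9):
    """Add every still-ambiguous peak whose best assignment exceeds the cutoff.

    a_list:     {peak_id: assignment} for peaks already considered assigned
    posteriors: {peak_id: {assignment: P(A_k,a | V_o)}} for the other peaks
    """
    updated = dict(a_list)
    for peak, probs in posteriors.items():
        if peak in updated:
            continue  # already assigned in an earlier cycle
        best, p_best = max(probs.items(), key=lambda kv: kv[1])
        if p_best > cutoff:
            updated[peak] = best
    return updated

# Toy data: one uniquely assigned peak and two ambiguous ones (invented values)
a_list = {"peak1": ("HA_5", "HB_9")}
posteriors = {
    "peak2": {("HN_7", "HA_6"): 0.93, ("HN_7", "HA_40"): 0.05},
    "peak3": {("HB_12", "HG_30"): 0.60, ("HB_12", "HG_31"): 0.40},
}

new_list = update_a_list(a_list, posteriors, cutoff=0.9)
print(new_list)
```

In the toy data only `peak2` clears the cutoff; `peak3` stays ambiguous and would be re-evaluated after the next round of structure calculation.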

36 Figures 3 & 4 (2002) J. Biomol. NMR, 23, 271, Gronwald, et al. Probability distributions of distance (left) and volume (right)

37 CANDID (2002) Hermann, Guntert & Wuthrich
Input “data”
1. Protein’s amino acid sequence
2. Proton resonance assignments
3. NOESY cross peak list (pairs of chemical shift coordinates)
4. Previously assigned NOE distance constraints
5. (opt) other conformational constraints
User specifies
1. Chemical shift tolerance = max allowed chemical shift error
2. Cycle-dependent parameters (thresholds, cutoffs, etc.)

38 from (2002) J. Mol. Biol., 319, 209, Hermann, Guntert, & Wuthrich.

39 CANDID (2002) Hermann, Guntert & Wuthrich
Input “data”
1. Protein’s amino acid sequence
2. Proton resonance assignments
3. NOESY cross peak list (pairs of chemical shift coordinates)
4. Previously assigned NOE distance constraints
5. (opt) other conformational constraints
User specifies
1. Chemical shift tolerance = max allowed chemical shift error
2. Cycle-dependent parameters (thresholds, cutoffs, etc.)
Algorithm uses
1. Structure-based filters (like NOAH)
2. Ambiguous distance constraints (like ARIA)
3. Network anchoring (new)
4. Constraint combination (new)

40 Fig. 1 (2002) J. Mol. Biol., 319, 209, Hermann, Guntert, & Wuthrich.

41 CANDID (2002) Hermann, Guntert & Wuthrich
Ways to handle problems caused by having no preliminary structure in the first cycle
1. Network anchoring
   “… evaluates the self-consistency of NOE assignments independent of knowledge of the 3D protein structure.”
   “… a sensitive approach for detecting erroneous ‘lonely’ constraints …”
2. Constraint combination
   “… an extension of the concept of ambiguous NOE assignments.”
   “… reduces the impact of unidentified artifact constraints in the input for the first structure calculation.”
Result: “The correct fold is obtained in cycle 1 of a de novo structure calculation.”
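As a rough illustration of the constraint combination idea, the sketch below merges pairs of constraints into single ambiguous constraints whose assignment set is the union of the two originals. This is a simplified reading of the concept, not the CANDID implementation: pairing consecutive list entries, taking the larger upper bound for the merged constraint, and passing an unpaired last constraint through unchanged are all assumptions made for the example.

```python
def combine_two_to_one(constraints):
    """Merge consecutive pairs of (assignment_set, upper_bound) constraints.

    Each merged constraint is satisfied if ANY assignment from either original
    is within the bound, so one artifact in a pair no longer distorts the fold.
    """
    combined = []
    for i in range(0, len(constraints) - 1, 2):
        (a1, b1), (a2, b2) = constraints[i], constraints[i + 1]
        combined.append((a1 | a2, max(b1, b2)))  # assumption: keep the looser bound
    if len(constraints) % 2:
        combined.append(constraints[-1])  # odd one out is passed through
    return combined

# Toy long-range constraints: (set of candidate proton pairs, upper bound in Angstrom)
constraints = [
    ({("HA_5", "HB_40")}, 4.0),
    ({("HN_12", "HG_55")}, 5.5),
    ({("HB_20", "HD_61"), ("HB_20", "HD_62")}, 5.0),
]

for assignments, bound in combine_two_to_one(constraints):
    print(sorted(assignments), bound)
```

The merged constraint carries less information than the two originals, which is the price paid for tolerating an undetected artifact in the first cycle.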

42 from (2002) J. Mol. Biol., 319, 209, Hermann, Guntert, & Wuthrich.

43 Questions ? Conclusions
