Nmrbox.org TRD 2 Update: An annotation scheme to foster reproducible NMR data analysis Matt Fenwick, Eldon Ulrich, Michael Gryk.


1 nmrbox.org TRD 2 Update: An annotation scheme to foster reproducible NMR data analysis Matt Fenwick, Eldon Ulrich, Michael Gryk

2 nmrbox.org Overview of NMR spectral analysis
- peak-picking: distinguishing signal (S) from noise (N), true positives from false positives
- resonance assignment
- NOESY peak assignment
semi-automated
- software tools
- human intervention required
human uses a deductive process of reasoning
- small set of rules/expectations (library)
- deductions may be logically dependent on each other
[Figure with labels L10 and A5]

3 nmrbox.org Problem: Missing Data -> Irreproducible
Much intermediate data is not saved / deposited
- step order
- logical dependencies
- deductive reasoning
- peculiarities found and their resolutions (unexpected, missing, extra peaks)
final data
- resonances, spin systems
- extraneous data -- contaminants, noise, artifacts, anomalies...

4 nmrbox.org Missing Data: Spin Systems & Resonances
NMR experiments are designed to exploit networks of coupled spins (spin systems).
The assignment process has two steps: (1) assign resonances to spin systems, (2) assign spin systems to residues.
Resonances and spin systems are not deposited.
Images are from Protein NMR: A Practical Guide (http://www.protein-nmr.org.uk/)

5 nmrbox.org Solution
1. capture process of reasoning
- version control: capture intermediate states
- model of commonly used deductive reasons
- annotate changeset with deductive reasons
2. capture complete final data set
- model for identifying problems
- model for extraneous data
- deposit full results

6 nmrbox.org 1. version control -- snapshots, commit message
- snapshots of intermediate states: enable backtracking and inspection of past states
- commit messages describe the difference between consecutive snapshots: summary, purpose, justification, questions, uncertainties
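The snapshot idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Snapshot` and `History` names, and the choice of a peak-ID to spin-system-ID dictionary as the saved state, are hypothetical and not taken from the presentation or from any existing tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Snapshot:
    ident: int
    parent: Optional[int]   # ident of the previous snapshot, None for the first
    message: str            # summary, purpose, justification, questions, ...
    state: dict             # assumed state: peak ID -> spin-system ID


class History:
    """Linear history of snapshots (hypothetical helper, not a real tool)."""

    def __init__(self):
        self._snapshots = []

    def commit(self, state, message):
        parent = self._snapshots[-1].ident if self._snapshots else None
        snap = Snapshot(len(self._snapshots), parent, message, dict(state))
        self._snapshots.append(snap)
        return snap

    def checkout(self, ident):
        # Backtracking: return a copy of a past state for inspection.
        return dict(self._snapshots[ident].state)

    def diff(self, ident):
        # Difference between a snapshot and its parent.
        snap = self._snapshots[ident]
        before = {} if snap.parent is None else self._snapshots[snap.parent].state
        after = snap.state
        return {
            "added": {k: after[k] for k in after.keys() - before.keys()},
            "removed": {k: before[k] for k in before.keys() - after.keys()},
            "changed": {k: (before[k], after[k])
                        for k in after.keys() & before.keys()
                        if before[k] != after[k]},
        }
```

For example, committing `{124: 52}` and then `{124: 52, 125: 52}` gives a `diff(1)` whose `"added"` entry is `{125: 52}`, which is the kind of change a commit message would then explain.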

7 nmrbox.org 1. model of NMR deductive reasoning
- start with CCPN data model
- augment with library of common deductive reasons
- use deductive reasons to annotate commits
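One way to picture "use deductive reasons to annotate commits" is a small controlled vocabulary that each commit annotation is checked against. The sketch below is an assumption-laden illustration: the two reason names appear in the mock-up on slide 12, but their descriptions, the `AnnotatedCommit` class, and the `annotate` function are hypothetical, and the CCPN data model itself is not represented.

```python
from dataclasses import dataclass

# Reason names are from the slide-12 mock-up; the descriptions are assumptions.
REASON_LIBRARY = {
    "BMRB statistics": "shift is consistent with database statistics",
    "chemical shift grouping": "peaks share chemical shifts in common dimensions",
}


@dataclass
class AnnotatedCommit:
    summary: str
    reasons: list


def annotate(summary, reasons, library=REASON_LIBRARY):
    """Attach reasons to a commit, rejecting names missing from the library."""
    unknown = [r for r in reasons if r not in library]
    if unknown:
        raise ValueError(f"reasons not in library: {unknown}")
    return AnnotatedCommit(summary, list(reasons))


# Extending the vocabulary means adding an entry; existing annotations stay valid.
extended = dict(REASON_LIBRARY)
extended["hypothetical new reason"] = "placeholder description"
```

Checking against a library keeps annotations machine-readable while leaving the free-text summary available for anything the library does not yet cover.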

8 nmrbox.org 2. model: identify problems (distinguishing signal from noise; true positives, false positives, false negatives)
- facilitates re-interpretation, if additional data is collected, by pointing out trouble spots
- examples: missing CB peaks of Gln sidechain; unassigned signal peak
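The two trouble spots named on this slide (an expected peak that is missing, and a signal peak that is unassigned) can be found mechanically once peaks carry a classification. The following is a minimal sketch, assuming a hypothetical `Peak` record and a deliberately simplified expectation that every residue type except glycine shows CA and CB peaks; a real model would need per-experiment expectations.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class PeakKind(Enum):
    SIGNAL = "signal"
    NOISE = "noise"
    ARTIFACT = "artifact"


@dataclass
class Peak:
    ident: int
    kind: PeakKind
    spin_system: Optional[int] = None   # None means unassigned
    atom: Optional[str] = None          # e.g. "CA", "CB"


def expected_atoms(residue_type):
    # Simplifying assumption: glycine has no CB; every other type has CA and CB.
    return {"CA"} if residue_type == "Gly" else {"CA", "CB"}


def trouble_spots(peaks, residue_types):
    """residue_types maps spin-system ID -> residue type (three-letter code)."""
    unassigned = [p.ident for p in peaks
                  if p.kind is PeakKind.SIGNAL and p.spin_system is None]
    missing = {}
    for ss, residue in residue_types.items():
        seen = {p.atom for p in peaks
                if p.spin_system == ss and p.kind is PeakKind.SIGNAL}
        gap = expected_atoms(residue) - seen
        if gap:
            missing[ss] = sorted(gap)
    return {"unassigned_signal": unassigned, "missing_expected": missing}
```

A report like this is what would point a later analyst at the places worth revisiting when additional data is collected.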

9 nmrbox.org 2. extraneous data, full results
- collaborate with BMRB: deposit full data sets
- extend NMR-STAR data dictionary
- extend Sparky assignment program
- data to capture: noise & artifact peaks, unassigned spin systems, contaminants, anomalies, ...

10 nmrbox.org Review: Solution
1. process of reasoning
- version control: capture intermediate states
- model of commonly used deductive reasons
- annotate changeset with deductive reasons
2. final data
- model for identifying problems
- model for extraneous data
- deposit full results

11 nmrbox.org Challenges?
- human/computer optimization
  - simple enough for users to apply properly, vs. detailed enough that a program can understand the complete context of an annotation
  - separate layers: use more/less detail as needed
  - (future) tools can increase level of detail without bogging humans down
- future compatibility
  - library of annotations provides "guidance"; extensions can be trivially added by augmenting the library
  - if there's a problem with the library of annotations, it can be fixed by extending (providing a new, similar annotation)
- tooling
  - Sparky

12 nmrbox.org Annotation Mock up (STAR-like format)
data_example
save_assign
  loop_ # tags
    _Tag.ID
    _Tag.Parent_ID
    ......
    24 23
  stop_
  loop_ # reasons used
    _Tag_Reason.ID
    _Tag_Reason.Tag_ID
    _Tag_Reason.Name
    ......
    73 24 "BMRB statistics"
    74 24 "chemical shift grouping"
  stop_
  loop_ # spin-system/amino-acid-type assignment
    _SSAA_Assn.ID
    _SSAA_Assn.SS_ID
    _SSAA_Assn.AA_ID
    ......
    101 52 Alanine
  stop_
  loop_ # peak/spin-system assignment
    _Peak_SS_Assn.ID
    _Peak_SS_Assn.SS_ID
    _Peak_SS_Assn.Peak_ID
    _Peak_SS_Assn.Peak_Spectrum
    ......
    175 52 124 HNCACB
    176 52 125 HNCACB
    177 52 126 HNCACB
    178 52 127 HNCACB
  stop_
save_
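To show how a program might consume such a mock-up, here is a rough Python sketch that reads `loop_ ... stop_` blocks into lists of rows. It is not an NMR-STAR parser: it leans on the standard-library `shlex` module for quoting and `#` comments, ignores `data_` and `save_` framing, and the `parse_loops` name is hypothetical. The sample text is a shortened copy of the mock-up with the elided rows left out and the loop's tag category written as `_Tag_Reason` throughout.

```python
import shlex

SAMPLE = '''
data_example
save_assign
  loop_  # reasons used
    _Tag_Reason.ID
    _Tag_Reason.Tag_ID
    _Tag_Reason.Name
    73 24 "BMRB statistics"
    74 24 "chemical shift grouping"
  stop_
  loop_  # peak/spin-system assignment
    _Peak_SS_Assn.ID
    _Peak_SS_Assn.SS_ID
    _Peak_SS_Assn.Peak_ID
    _Peak_SS_Assn.Peak_Spectrum
    175 52 124 HNCACB
    176 52 125 HNCACB
  stop_
save_
'''


def parse_loops(text):
    """Return one list of row dicts per loop_ block (all values kept as strings)."""
    tokens = shlex.split(text, comments=True)  # strips quotes and '#' comments
    loops = []
    i = 0
    while i < len(tokens):
        if tokens[i] != "loop_":
            i += 1
            continue
        i += 1
        tags = []
        while i < len(tokens) and tokens[i].startswith("_"):
            tags.append(tokens[i])
            i += 1
        values = []
        while i < len(tokens) and tokens[i] != "stop_":
            values.append(tokens[i])
            i += 1
        i += 1  # step past stop_
        if not tags or len(values) % len(tags) != 0:
            raise ValueError("loop values do not fill whole rows")
        loops.append([dict(zip(tags, values[j:j + len(tags)]))
                      for j in range(0, len(values), len(tags))])
    return loops


loops = parse_loops(SAMPLE)
```

With rows available as dictionaries, the IDs can be followed across loops, for instance from a `_Tag_Reason.Tag_ID` value to the tag it annotates.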

13 nmrbox.org Impact
- reproducibility
- error detection
- error correction
- collaboration
- sharing
- learning
- analysis quality
- amenability to future analysis

14 nmrbox.org Appendix: NMR phenomena: grouping resonances based on chemical shift

15 nmrbox.org Appendix: extraneous data: processing artifacts, spurious peaks

16 nmrbox.org Appendix: Library examples
- Asn sidechain
- Ala backbone
- sequential spin systems

