The CCPN Project Tim Stevens and Wayne Boucher October 2005.

1 The CCPN Project Tim Stevens and Wayne Boucher October 2005

2 CCPN at Göteborg: Day 1 ■ Introduction to CCPN ■ The CcpNmr applications ■ Analysis basics ■ Future developments ■ Analysis advanced

3 CCPN at Göteborg: Day 2 ■ An overview of the data model ■ API Tutorial ■ Analysis Macros ■ Widgets and Popups

4 CCPN Overview

5 The CCPN Project ■Collaborative Computing Project for NMR ●Started in 1999 ●Collaborators in several countries ●Developers at University of Cambridge and EBI ■Unifying platform for NMR software ●Similar to CCP4 (X-ray) ■Main goals: ●Data standards and data exchange ●Software development and distribution ●Meetings to determine and disseminate best practice ●Open source access

6 People ■Cambridge ●Ernest Laue ●Rasmus Fogh ●Dan O’Donovan ■EBI, Hinxton ●Kim Henrick ●John Ionides ●Wim Vranken ●Anne Pajon

7 History ■Workshops: ●EBI (2000, 2001) ●Washington (2000) ■Funding: ●BBSRC (2000-2003, 2003-2006) ●NMRQUAL (2001-2004) ●TEMBLOR (2002-2005) ●NMR-EXTEND (2005-2008)

8 NMR Software ■Problem - Heterogeneous development ●Lots of proprietary data formats ●Lots of stand-alone programs ●Data is ‘lost’ along the way ●Dedicated converters needed ●Not acceptable for structural genomics projects ■Solution - Unity ●Data standards ■Ease of transfer between programs ■Completeness, integrity, deposition, data mining ●Libraries

9 Data Format vs. Data Model ■Data format - How data is stored ●STAR ●XML ●SQL ●Tab-separated ascii ■Data model - What data means ●RCSB (PDB) mmCIF ●XML DTD or schemas ●SQL schema
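The distinction on this slide can be sketched in a few lines of Python. This is a minimal illustration, not CCPN code: the `Shift` class, its field names and both serialisers are invented for the example. One model class (what the data means) is written out in two storage formats (how the data is stored), and read back from one of them.

```python
import csv
import io
import xml.etree.ElementTree as ET
from dataclasses import asdict, dataclass


@dataclass
class Shift:
    """Hypothetical model class: says what the data means."""
    atom_name: str
    value_ppm: float


def to_xml(shift):
    """One storage format: an XML element with one attribute per field."""
    attributes = {key: str(value) for key, value in asdict(shift).items()}
    return ET.tostring(ET.Element("Shift", attributes), encoding="unicode")


def to_tsv(shift):
    """Another storage format: a single tab-separated row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(asdict(shift).values())
    return buffer.getvalue()


def from_xml(text):
    """Read the XML form back into the same model class."""
    attributes = ET.fromstring(text).attrib
    return Shift(attributes["atom_name"], float(attributes["value_ppm"]))


shift = Shift("HA", 4.25)
xml_text = to_xml(shift)
tsv_text = to_tsv(shift)
```

The model class stays the same whichever serialiser is used, which is the sense in which a data model is format independent.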

10 CCPN Approach ■Data model rather than data format ●Format independent ●Language independent ●Scientifically descriptive (NMR) ■Library (API): in memory manipulation ●Create, update, delete & query objects ●One for each language ●Error checking ■I/O modules: load/store data from/to disk ●One for each (storage format, language) ●Bookkeeping

11 Application View [Diagram labels: User; GUI; Application 1, Application 2, Application 3; API; In Memory Representation (Python, Java, C++, Perl); I/O; Data Store (XML, SQL)]

12 Model-Driven Architecture ■UML: Unified Modelling Language ●Abstract representation of semantics ●Pictorial ■Mapping from UML: to anything ●Multi-language ●Multi-format ●Architecture neutral (e.g. distributed or not) ■Power: good and bad ■CCPN uses Object Domain as its UML tool ●Python as scripting language
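The idea of mapping an abstract model to code can be illustrated with a toy generator. This is only a sketch under stated assumptions: the `MODEL` dictionary stands in for a UML model, and `generate_class` is an invented helper, not part of the CCPN generation machinery. It emits Python source with a type-checked getter and setter per attribute and then loads that source.

```python
# Stand-in for a UML model: class name -> attribute name -> type (hypothetical)
MODEL = {"Peak": {"height": float, "annotation": str}}


def generate_class(name, attributes):
    """Return Python source for a class with a getter and setter per attribute."""
    lines = [f"class {name}:", "    def __init__(self):"]
    for attr in attributes:
        lines.append(f"        self._{attr} = None")
    for attr, attr_type in attributes.items():
        cap = attr[0].upper() + attr[1:]
        type_name = attr_type.__name__
        lines += [
            f"    def get{cap}(self):",
            f"        return self._{attr}",
            f"    def set{cap}(self, value):",
            f"        if not isinstance(value, {type_name}):",
            f"            raise TypeError('{attr} must be {type_name}')",
            f"        self._{attr} = value",
        ]
    return "\n".join(lines)


# Generate the source, then execute it to obtain a usable class
source = generate_class("Peak", MODEL["Peak"])
namespace = {}
exec(source, namespace)
Peak = namespace["Peak"]

peak = Peak()
peak.setHeight(1.5)
```

Changing the model dictionary and regenerating is all that is needed to add an attribute, which is the appeal of generating the API rather than writing it by hand.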

13 MEMOPS framework [Diagram labels: Domain Experts; UML Model (Package 1, Package 2, Package 3); Autogeneration; Handcoded (1%); APIs (Python, Java, C, Perl); Storage (SQL, XML); Documentation; Program Developers; Application; User; Deposition]

14 Data Model Packages Molecule Sequence NMR Citations Nuclei and Isotopes Experimental Protocols Organisms, Taxonomy CcpNmr Programs Compound Source Structure Targets Crystallisation Compound Preparation Project Tracking X-ray Crystallography Structure and Coordinates Residue Template Molecular System Reference Molecule Laboratory Samples

15 UML Example

16 CCPN API ■Classes for developers ●Mainly getters and setters ●More than just code stubs ●Constraints (e.g. cardinality) enforced ●Links the hard part ■Mostly (> 99%) auto generated from UML ●Some helper functions and constraints hand coded ■Currently around 360k lines in Python and 650k lines in Java
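Why links are "the hard part" can be seen in a small sketch: a two-way link has to be kept consistent at both ends, and its cardinality has to be checked on every change. The classes below are hypothetical stand-ins written for this example, not the generated CCPN classes.

```python
class Spectrum:
    """Hypothetical parent class holding the 'many' end of the link."""

    def __init__(self, name):
        self.name = name
        self._peakLists = []

    def getPeakLists(self):
        # Return a copy so callers cannot bypass the link bookkeeping
        return tuple(self._peakLists)


class PeakList:
    """Hypothetical child class; cardinality 1..1 towards Spectrum."""

    def __init__(self, spectrum):
        self._spectrum = None
        self.setSpectrum(spectrum)

    def getSpectrum(self):
        return self._spectrum

    def setSpectrum(self, spectrum):
        # Enforce the cardinality constraint: exactly one parent, never None
        if not isinstance(spectrum, Spectrum):
            raise ValueError("a PeakList must belong to exactly one Spectrum")
        if self._spectrum is not None:
            self._spectrum._peakLists.remove(self)  # unlink from the old parent
        spectrum._peakLists.append(self)            # link to the new parent
        self._spectrum = spectrum


hsqc = Spectrum("HSQC")
noesy = Spectrum("NOESY")
peakList = PeakList(hsqc)
peakList.setSpectrum(noesy)  # both ends of the link are updated together
```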

17 Developer Benefits ■Specified data model and API ■No I/O code ■Concentrate on science, not bookkeeping ■Extendible ●Application data can be assigned to any object ●UML model can be extended (packages) ■Notification system ●Register interest when specified attribute changes (class, not object, level) ■Undo/Redo (in future)
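A class-level notification system of the kind described on this slide might look like the following sketch. The names `registerNotify`, `Resonance` and `setShift` are assumptions made for the illustration and are not taken from the CCPN API; the point is that interest is registered once per (class, attribute) pair and then applies to every object of that class.

```python
_notifiers = {}  # (class, attribute name) -> list of callbacks


def registerNotify(callback, cls, attrName):
    """Register interest in an attribute for every object of a class."""
    _notifiers.setdefault((cls, attrName), []).append(callback)


class Resonance:
    """Hypothetical data class whose setter fires the notifiers."""

    def __init__(self, name):
        self.name = name
        self._shift = None

    def getShift(self):
        return self._shift

    def setShift(self, value):
        self._shift = value
        # Look up callbacks by class, not by individual object
        for callback in _notifiers.get((type(self), "shift"), ()):
            callback(self)


changed = []
registerNotify(lambda resonance: changed.append(resonance.name), Resonance, "shift")

first, second = Resonance("r1"), Resonance("r2")
first.setShift(8.2)
second.setShift(4.7)  # no per-object registration was needed
```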

18 Current Status of API ■Stable and released: ●Python and XML code generation ●NMR, molecule description and structure data model ■In testing stages: ●Java and SQL database code generation ●Protein production data model ■Preliminary: ●X-ray crystallography data model

19 CcpNmr Applications

20 Structural Biology Pipeline [Diagram: NMR machine -> Data processing -> Spectrum analysis -> Structure calculation -> Databases]

21 NMR Applications [Diagram labels: CCPN Data Model; CcpNmr Processing; CcpNmr FormatConverter; CcpNmr Analysis; ARIA 2.0; Validation software; Reference data; NMRStar 3.0; Other formats (NmrView, XEasy, …)]

22 Main CcpNmr Applications ■Format Converter ●Conversion to and from legacy formats ■Analysis ●Graphical analysis (e.g. assignment) program ■Processing (coming soon) ●Azara “process” wrapped in data model

23 CcpNmr Format Converter ■Import/export of data formats to the Data Model ●For harvesting/deposition purposes ●Allow people to use or try out the data model ●Interaction with existing programs ■Fully or partially handles: ●Ansig, Auremol, Autoassign, Azara, Bruker, Charmm, CNS/XPLOR/ARIA, Concoord, Diana/Dyana/Cyana, Discover, Fasta, Felix, Module, .mol, Molmol, Monte, NmrDraw, NMRPipe, NMR-STAR (v2.1.1, v3.0), NmrView, Pdb, Pipp, Pistachio, Pronto, Sparky, Talos, Varian, XEasy ●Sequences, chemical compounds, coordinates, NMR measurements, constraints and peak lists, processing and acquisition parameters.

24 Format Converter - The NMR Translator [Diagram labels: CCPN Data Model; Format specific readers; Data model entry; Format specific writers; Generic peak converter; Generic chemical shift converter; Generic acquisition parameters converter; data types: Peaks, Chemical shifts, Acquisition parameters, Processing parameters; formats: XEasy, NmrView, Bruker, Varian, NMRPipe, Azara, ...]
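The translator design, with format-specific readers and writers on either side of a shared data model, can be sketched as below. The two formats ("columns" and "pairs") are invented for this example and do not correspond to any of the real formats listed above; `ChemicalShift` is likewise a hypothetical stand-in for a data model object.

```python
from dataclasses import dataclass


@dataclass
class ChemicalShift:
    """Hypothetical stand-in for a data model object."""
    atom: str
    ppm: float


def read_columns(text):
    """Reader for the invented 'columns' format: one 'atom ppm' per line."""
    shifts = []
    for line in text.splitlines():
        if line.strip():
            atom, ppm = line.split()
            shifts.append(ChemicalShift(atom, float(ppm)))
    return shifts


def read_pairs(text):
    """Reader for the invented 'pairs' format: 'atom=ppm;atom=ppm'."""
    shifts = []
    for item in text.split(";"):
        if item.strip():
            atom, ppm = item.split("=")
            shifts.append(ChemicalShift(atom.strip(), float(ppm)))
    return shifts


def write_columns(shifts):
    return "".join(f"{s.atom} {s.ppm:.3f}\n" for s in shifts)


def write_pairs(shifts):
    return ";".join(f"{s.atom}={s.ppm:.3f}" for s in shifts)


READERS = {"columns": read_columns, "pairs": read_pairs}
WRITERS = {"columns": write_columns, "pairs": write_pairs}


def convert(text, source, target):
    """Convert between formats by going through the model objects."""
    shifts = READERS[source](text)   # format-specific reader -> model objects
    return WRITERS[target](shifts)   # model objects -> format-specific writer


converted = convert("HA 4.25\nHN 8.2\n", "columns", "pairs")
```

With N formats this needs N readers and N writers rather than a dedicated converter for every pair of formats.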

25 Format Converter Design ■Wim Vranken (EBI) ■Set of Python scripts ■Accessed via: ●Tkinter (Tcl/Tk) ●custom Python scripts ■http://www.ebi.ac.uk/msd-srv/docs/NMR/NMRtoolkit/main.html

26 CcpNmr Analysis ■Requirements ●Cross platform ●Scalable ●Extensible ●Open and easy scripting language ●Modern graphical user interface ●Uses CCPN data model and API ■Software ●Python, Tcl/Tk, C, OpenGL ●(Java, X, Motif) ■OS ●Linux, Sun, SGI, OSX (Windows)

27 Spectrum Windows ■N-dim. windows ■Multiple spectra ■Automatic mapping ■Contours on fly ■Aliasing ■Strips & cells ■Mouse and key ■Blocked data ●Azara ●Felix ●NMRPipe ●UCSF

28 Graphical Interface ■Menus and popup dialogues ●CcpNmr widgets ■Main objects ●Spectra ●Windows ●Peaks ●Resonances ●Molecules ●Structures

29 Assignment ■Peak finding and fitting ■Rich assignment model ■Mainly mouse-driven ■Can assign to atoms ■Ambiguous contributions ■Existing structure ■Short resonance list ■Multiple peaks easily ■Navigation

30 The CLOUDS Protocol ■Automated assignment & structure determination ●Miguel Llinas, Alex Grishaev, et al. ●Spatial distribution of anonymous resonances generated with NOEs ■Integrated within CCPN ●An Analysis module ●Data Model glues modules ●Functional platform ●Distribution network [Flow diagram labels: Spectra; Spin Systems; NOE intensities; Distance Constraints; Proton Clouds; Chain Assignment; Protein Structure; steps: Pick Peaks & Normalise; Pick Peaks, Link Shifts & Combine; Relaxation Matrix Optimisation; Hydrogen Atom Molecular Dynamics; Chain Fitting & Molecular Replacement; Full Structure Calculation]

31 The CLOUDS Protocol [Images: a family of Clouds; a fitted protein backbone]

32 Other Features ■Works with FormatConverter ■Chemical compounds database ■NMR reference information ■Hard copy ●PostScript ●PDF ■Table export ■Rate analysis ■Macros ■Structures

33 CcpNmr Analysis Tutorial Part I

34 CCPN Future

35 Extend-NMR ■EU STREP application funded to fully integrate software from: ●Bruker (TOPSPIN, acquisition) ●Billeter, Orekhov (Garant, Munin, MDD) ●Kalbitzer (Auremol) ●Llinas (CLOUDS) ●Nilges (Inferential Structure Determination) ●Bonvin (Haddock, RECOORD) ●Vriend, Vuister (Queen, What-Check) ●Henrick, Vranken (NMR database) ■Focus on complexes and development of better software methodology

36 LIMS Collaborations ■PIMS project collaboration ●Protein production LIMS (with EBI, Sport Consortia, OPPF and Poupon) ■EU STREP application (SFGLIMS) to work with: ●Poupon (Protein Production) ●Perrakis (Biophysical methods, crystallisation) ●Bricogne (X-ray data collection and structure generation) ●Prilusky, Sussman (Bioinformatics, data mining)

37 Data Model Extensions ■EXTEND-NMR ●New NMR applications ■Solid state NMR ■PIMS ●LIMS for protein production ■SFGLIMS ●LIMS for NMR and X-ray structure determination ■X-ray ■Chemoinformatics ■(Metabolomics?)

38 Code Generation Plans ■C++/C/FORTRAN code ●Needed for Extend-NMR and for CcpNmr Processing ●Needed for interface to CYANA, NMRPIPE, AUTOPSY, etc. ■Java/Database code ●Extend for LIMS, high-throughput projects, NMRVIEW ■Basic Machinery ●Upgrades for long term extensibility/maintainability and performance

39 API Languages and Formats [Diagram: grid of languages (Python, Java, C++, Perl) against formats (XML, SQL). Entries: Analysis, FormatConverter, Bruker TopSpin, NMRVIEW, Azara, Extend-NMR, NMRPIPE, AUTOPSY, (Varian), (CYANA), (Bioinformatics), MSD NMR database, PIMS, SFGLIMS, (SFGLIMS), (bioinformatics). For all languages: Metamodel, Documentation. For all formats: Schemas, I/O mappings]

40 New Core API technology ■Reduce burden of adding new languages, formats ●Languages (Python, Java, C++, Perl) ●Storage formats (XML, SQL) [Diagram labels: Language & Format independent; Format dependent only; Language dependent only; Language & Format dependent; Code required for new language; Code required for new format; Most of the logic]

41 Core API technology, cont. ■Remodelling of implementation details ●Storages, collection types, root objects, etc. ■Complex data types ●e.g. rotation matrix ■Client/Server architecture ●For PIMS and SFGLIMS

42 Analysis Development ■Beyond CLOUDS ●Large proteins, homologues ■Processing linked in ■Couplings (RDCs, TROSY), dihedral constraints ■Titrations (Ka, Kd) ■Chain states (alternate conformations) ■Solid State NMR ■Organic chemistry NMR (1D) ■Publication-ready diagrams and tables ■Windows version

43 Developments in Extend-NMR ■Integrated Bayesian, maximum entropy, … methods for data-processing, analysis and structure calculation ■‘Molecular replacement’ for NMR ■Further RECOORD development ■Databank for Experimental NMR spectra (DEN) ■MSD database analysis

44 Licenses ■GPL ●Data model ●Scripts which produce APIs ■LGPL ●Generic libraries ●Widget libraries ●Format Converter ■CCPN ●Analysis

45 Resources, 1 ■SourceForge: ●CVS repository for code ●API and FormatConverter releases ●http://sourceforge.net/projects/ccpn ■CCPN: ●Meetings, workshops ●API, FormatConverter and Analysis releases ●http://www.ccpn.ac.uk

46 Resources, 2 ■EBI: ●Format Converter ●Databases (MSD group) ●http://www.ebi.ac.uk/msd-srv/docs/NMR/NMRtoolkit/main.html ■JISCMAIL: ●Email list ●http://www.jiscmail.ac.uk/lists/ccpnmr.html ●(http://www.jiscmail.ac.uk/lists/nmrgen.html)

47 CcpNmr Analysis Tutorial Part II

48 CCPN at Göteborg: Day 2 ■ An overview of the data model ■ API Tutorial ■ Analysis Macros ■ Widgets and Popups

49 Major Data Model Packages

50 CCPN Packages ■Groupings of related data ●e.g. NMR, X-ray, Molecular description ■Connections between packages ●e.g. NMR loads Nucleus (isotope) information ■Allows lazy loading ●Only load relevant data ●Only load when a link is queried ■Save only modified ■Reference packages ●Chemical compound, Reference chemical shifts [Diagram labels: Nmr; ChemComp; Molecule; Nucleus; Coordinates; MolSystem; People; Sample]
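Lazy loading and "save only modified" can be sketched as follows. `LazyPackage`, `fakeLoader` and `saveModified` are hypothetical names for this illustration, and the loader only records which packages would have been read from disk.

```python
class LazyPackage:
    """Hypothetical package wrapper that loads its data on first access."""

    def __init__(self, name, loader):
        self.name = name
        self._loader = loader
        self._data = None
        self.isModified = False

    @property
    def data(self):
        if self._data is None:  # load only when first queried
            self._data = self._loader(self.name)
        return self._data

    def set(self, key, value):
        self.data[key] = value
        self.isModified = True


loaded = []  # records which packages were actually read


def fakeLoader(name):
    """Stand-in for reading a package file from disk."""
    loaded.append(name)
    return {}


def saveModified(packages):
    """Return the names that would be written: modified packages only."""
    saved = [package.name for package in packages if package.isModified]
    for package in packages:
        package.isModified = False
    return saved


packages = [LazyPackage(name, fakeLoader) for name in ("Nmr", "Nucleus", "ChemComp")]
nmr, nucleus, chemComp = packages
nmr.set("experimentCount", 1)  # loads and modifies Nmr only
nucleus.data                   # a queried link triggers loading Nucleus
saved = saveModified(packages)
```

ChemComp is never touched, so it is neither loaded nor saved.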

51 ChemElement

52 ChemElement - Details

53 Coordinates

54 Analysis

55 Implementation

56 Molecules and MolSystems ■Molecules ●Templates for specifying molecular connectivity. ●Sequences, chemical components, protonation state etc. ●A kind of reference, e.g. “Lysozyme” ■MolSystems ●Contain chains, which contain residues, which contain atoms. ●The objects you assign to. ●Built using molecule templates, e.g. a homo-oligomer is built using the same template to make different chains. ■Stored in different packages ●Molecule.xml, MolSystem.xml
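The template relationship between Molecules and MolSystems can be sketched with plain dataclasses. These classes and the `newChain` method are illustrative assumptions, not the CCPN classes, and the three-residue sequence is a made-up fragment. A homo-dimer is built by using one molecule template twice.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Molecule:
    """Template: names a sequence but holds no assignable atoms."""
    name: str
    sequence: tuple  # residue type codes


@dataclass
class Residue:
    seqCode: int
    resType: str


@dataclass
class Chain:
    code: str
    molecule: Molecule
    residues: list = field(default_factory=list)


@dataclass
class MolSystem:
    """Holds the chains; these are the objects one would assign to."""
    name: str
    chains: list = field(default_factory=list)

    def newChain(self, code, molecule):
        # Each chain gets its own residues, built from the shared template
        residues = [Residue(i + 1, resType)
                    for i, resType in enumerate(molecule.sequence)]
        chain = Chain(code, molecule, residues)
        self.chains.append(chain)
        return chain


template = Molecule("Lysozyme", ("Lys", "Val", "Phe"))  # illustrative fragment
dimer = MolSystem("homo-dimer")
chainA = dimer.newChain("A", template)
chainB = dimer.newChain("B", template)
```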

57 MolSystem

58 Molecule

59 ChemComp

60 Experiment, Spectrum & Shift List Objects ■Experiment ●The set-up under particular conditions at a particular time, not a class of experiment. ■Spectrum ●Known as Data Source in the data model. A pointer to a chunk of data that results from an experiment. Several spectra may result from the same experiment if they are processed differently. ■Peak List ●A set of crosspeaks that have been picked for a spectrum. A spectrum can have several peak lists. The user can separate peaks into classes, e.g. picked in different ways. ■Shift List ●A set of chemical shifts, which are derived from peaks and may be linked to atoms. Valid for a set of experiments with similar conditions that give similar chemical shifts. Using different shift lists doesn’t change assignments, but it does change which peaks are used in the calculation of a shift value.
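The last point, that the choice of shift list changes which peaks contribute to a shift value but not the assignments, can be illustrated numerically. The sketch assumes a shift value is the unweighted mean of the assigned peak positions in the shift list's experiments; the averaging actually used by CCPN may differ, and the experiment names and positions are invented.

```python
from statistics import mean

# (experiment name, resonance name, peak position in ppm); invented values
assignments = [
    ("HSQC_pH5", "r1", 8.20),
    ("NOESY_pH5", "r1", 8.24),
    ("HSQC_pH7", "r1", 8.50),
]


def shiftValue(resonance, experiments, assignments):
    """Mean position of a resonance over one shift list's experiments."""
    positions = [ppm for experiment, name, ppm in assignments
                 if name == resonance and experiment in experiments]
    return mean(positions) if positions else None


# Two shift lists, each valid for experiments run under similar conditions
shiftListPh5 = {"HSQC_pH5", "NOESY_pH5"}
shiftListPh7 = {"HSQC_pH7"}

low = shiftValue("r1", shiftListPh5, assignments)
high = shiftValue("r1", shiftListPh7, assignments)
```

The resonance "r1" keeps the same peak assignments throughout; only the set of peaks entering each average changes.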

61 Nmr

62 Nmr.Peak

63 Resonances and Assignment [Diagram labels: Resonance; Constraint (Distance, Dihedral); Measurement (Chemical Shift, Relaxation, Coupling); Molecule (Atoms, Residues, Chains); Structure (Co-ordinates); Annotation (Spin System, Connectivity, Residue Type); Experiment (Spectra, Conditions); Peak (Dimensions)] ■Resonances ●The centre of the NMR data model ■Connect to peaks ●Different peaks may be caused by the same thing. ■Connect to atoms ●A connection to NMR equivalent atoms. Need not be set if anonymous. ■Have chemical shifts ●May have different shifts under different conditions.

64 Nmr.Resonance

65 NmrConstraints

66 Python API coding tutorial

67 Development in the CCPN framework ■CcpNmr Macros ●Small home-use Python functions ■Additions to function library ●Functions incorporated in software release ●Community sharing ■Embedded options ●Extension to CcpNmr application ■Stand-alone applications ●Built on CCPN libraries and API ■CcpNmr Clouds has examples of all of these

68 The Python interface to the CCPN Data Model

■Find the number of assigned peaks in a spectrum

count = 0
for peakList in spectrum.peakLists:
    for peak in peakList.peaks:
        for peakDim in peak.peakDims:
            if peakDim.peakDimContribs:
                # one assigned dimension is enough to count the peak
                count += 1
                break

■Find all H-C partners in a residue

pairs = []
for atom in residue.atoms:
    chemAtom = atom.chemAtom
    if chemAtom.elementSymbol == 'C':
        for bond in chemAtom.chemBonds:
            # the bond links two chemAtoms; keep the one that is not the carbon
            chemAtoms = list(bond.chemAtoms)
            chemAtoms.remove(chemAtom)
            chemAtom2 = chemAtoms[0]
            if chemAtom2.elementSymbol == 'H':
                pairs.append([atom, residue.findFirstAtom(chemAtom=chemAtom2)])
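The peak-counting loop can be tried without a CCPN installation by substituting stand-in objects. In this sketch `SimpleNamespace` objects mimic only the attributes the loop reads (`peakLists`, `peaks`, `peakDims`, `peakDimContribs`); they are not CCPN objects, and the helper names `dim`, `peak` and `countAssignedPeaks` are made up for the example.

```python
from types import SimpleNamespace


def dim(*contribs):
    """Stand-in peak dimension holding zero or more assignment contributions."""
    return SimpleNamespace(peakDimContribs=list(contribs))


def peak(*dims):
    """Stand-in peak made of the given dimensions."""
    return SimpleNamespace(peakDims=list(dims))


def countAssignedPeaks(spectrum):
    """Count peaks with at least one assigned dimension."""
    count = 0
    for peakList in spectrum.peakLists:
        for pk in peakList.peaks:
            for peakDim in pk.peakDims:
                if peakDim.peakDimContribs:
                    count += 1
                    break  # do not count the same peak twice
    return count


peaks = [
    peak(dim("r1"), dim()),      # assigned in one dimension
    peak(dim(), dim()),          # unassigned
    peak(dim("r2"), dim("r3")),  # assigned in both dimensions
]
spectrum = SimpleNamespace(peakLists=[SimpleNamespace(peaks=peaks)])
assigned = countAssignedPeaks(spectrum)
```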

69 CcpNmr Analysis Macros ■Python scripts/functions ■Accessible from Analysis and embeddable ■Argument server ●An interface to the Analysis program ●Access to objects ■Selected peaks ■Cursor position ■Spectra ■Windows ■Etc… ■High-level function library ●Windows, Assignment, Molecules, Constraints ●Documented

70 Macro 1 - Simple stuff ■Python language ■Function anatomy ■Import library functions ■ArgumentServer ■Simple program

def addMarksToPeaks(argServer, peaks=None):
    """Descrn: Adds position line markers to the selected peaks.
       Inputs: ArgumentServer, List of Nmr.Peaks
       Output: None
    """
    from ccpnmr.analysis.MarkBasic import createPeakMark

    if not peaks:
        peaks = argServer.getCurrentPeaks()

    # no peaks - nothing happens
    for peak in peaks:
        createPeakMark(peak, remove=0)

71 Macro 2 - Ask the user

def calcAveragePeakListIntensity(argServer, peakList=None, intensityType='height'):
    """Descrn: Find the average height of peaks in a peak list.
       Inputs: ArgumentServer, Nmr.PeakList
       Output: Float
    """
    from ccpnmr.analysis.ConstraintBasic import getMeanPeakIntensity

    if not peakList:
        peakList = argServer.getPeakList()
    if not peakList:
        argServer.showWarning('No peak list selected')
        return

    answer = argServer.askYesNo('Use peak volumes? Height will be used otherwise.')
    if answer:  # is true
        intensityType = 'volume'

    spec = peakList.dataSource
    expt = spec.experiment
    intensity = getMeanPeakIntensity(peakList.peaks, intensityType=intensityType)
    data = (intensityType, expt.name, spec.name, peakList.serial, intensity)
    argServer.showInfo('Mean peak %s for %s %s peak list %d is %e' % data)
    return intensity

72 Macro 3 - Popup loader

def openMyPopup(argServer):
    """Descrn: Opens an example popup.
       Inputs: ArgumentServer
       Output: None
    """
    peakList = argServer.getPeakList()
    popup = MyPopup(argServer.parent, peakList)

from memops.gui.BasePopup import BasePopup
from memops.gui.ButtonList import ButtonList
from memops.gui.ScrolledGraph import ScrolledGraph
from ccpnmr.analysis.PeakBasic import getPeakHeight, getPeakVolume

73 Macro 3 - The popup

class MyPopup(BasePopup):

    def __init__(self, parent, peakList, *args, **kw):
        self.peakList = peakList
        self.colours = ['red', 'green']
        self.dataSets = []
        BasePopup.__init__(self, parent=parent, title='Test Popup', **kw)

    def body(self, guiParent):
        row = 0
        self.graph = ScrolledGraph(guiParent)
        self.graph.grid(row=row, column=0, sticky='NSEW')
        row += 1
        texts = ['Draw graph', 'Goodbye']
        commands = [self.draw, self.destroy]
        buttons = ButtonList(guiParent, texts=texts, commands=commands)
        buttons.grid(row=row, column=0, sticky='NSEW')

    def draw(self):
        self.dataSets = self.getData()
        self.graph.update(self.dataSets, self.colours)

    def getData(self):
        # sort peaks by volume, treating a missing volume as 0.0
        peakData = [(getPeakVolume(peak) or 0.0, peak) for peak in self.peakList.peaks]
        peakData.sort()
        heights = []
        volumes = []
        i = 0
        for volume, peak in peakData:
            heights.append([i, getPeakHeight(peak) or 0.0])
            volumes.append([i, volume])
            i += 1
        # assumed: the slide text stops before the return; one data set per colour
        return [heights, volumes]

74 CcpNmr Graphical Widgets ■ A library for any developer to use [Widgets shown: ColorList, PulldownMenu, ScrolledMatrix, LabelFrame, CheckButton, Button, Label, Entry, ButtonList]

75 CcpNmr Mega Widgets ■ Build them into your own code! ● ScrolledMatrix ● ScrolledGraph ● StructureFrame

76 Ccp Stand-Alone AppTemplate ■ Menu System ■ Project handling ● New ● Load ● Save ● Backup ■ Popup template ● Widgets ● Geometry ● Plumbing

77 Popup Constructors and Notifiers ■Init ●Setup local variables ●Subclass popup window ■Body ●Arrange Graphical elements ●Set up Data Model notifiers ●Set initial state ■Update ●Process updated values ●Redraw widgets based on status ■Widget callback ●From entry, buttons etc ●User functions ●Data Model change [Diagram labels: Initialisation; Body; Update; Update Filter; Notifiers; Widgets; Data Model; External Influence; User Influence]
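The init/body/update life cycle with a notifier and an update filter can be sketched without any GUI toolkit. Everything here is a stand-in invented for the illustration: `PeakTablePopup` is not a CcpNmr class, `PEAKS` replaces the data model, and `callbacks` replaces its notifier registry.

```python
class PeakTablePopup:
    """Hypothetical popup skeleton; plain Python, no GUI toolkit."""

    def __init__(self, peakList, registerNotify):
        # Init: set up local variables
        self.peakList = peakList
        self.rows = []
        self.updateCount = 0
        self.body(registerNotify)

    def body(self, registerNotify):
        # Body: set up the data model notifier, then set the initial state
        registerNotify(self.peakChanged)
        self.update()

    def peakChanged(self, peak):
        # Update filter: ignore changes to peaks shown elsewhere
        if peak["peakList"] == self.peakList:
            self.update()

    def update(self):
        # Update: rebuild the displayed rows from the current data
        self.rows = [p["height"] for p in PEAKS if p["peakList"] == self.peakList]
        self.updateCount += 1


PEAKS = [{"peakList": 1, "height": 10.0}, {"peakList": 2, "height": 5.0}]
callbacks = []  # stand-in for the data model's notifier registry


def setPeakHeight(peak, height):
    """Stand-in for a data model setter that fires notifiers."""
    peak["height"] = height
    for callback in callbacks:
        callback(peak)


popup = PeakTablePopup(1, callbacks.append)
setPeakHeight(PEAKS[1], 7.0)   # other peak list: filtered out
setPeakHeight(PEAKS[0], 12.0)  # own peak list: triggers an update
```

The filter keeps a popup from redrawing for every change in the data model, only for those that affect what it displays.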

78 Aftercare ■www.ccpn.ac.uk ●Downloads ●Data Model documentation ●Analysis documentation ●Tutorials ■Mailing List ●http://www.jiscmail.ac.uk/lists/CCPNMR.html ●Quick response ●Bugs ●Requests

