
© S.J. Coles 2005 ACS 2005, San Diego Furthering Chemoinformatics through ‘Crystalloinformatics’ Simon J. Coles EPSRC National Crystallography Service.


1 © S.J. Coles 2005 ACS 2005, San Diego Furthering Chemoinformatics through ‘Crystalloinformatics’ Simon J. Coles, EPSRC National Crystallography Service, School of Chemistry, University of Southampton

2 Data – Information – Knowledge Cycle [Diagram labels: Experiment; Prediction; Model; Properties]

3 Leveraging eScience [Diagram labels: X-Ray e-Lab; Analysis; Properties e-Lab; Simulation; Video; Diffractometer; Grid Middleware; Structures Database]

4 Data ‘Acquisition’ and ‘Workup’
1) Application for an allocation
2) Secure access to NCS Grid resources
3) Sample submission
4) Monitoring sample status
5) Data collection
6) Raw data download
7) Automated structure solution

5 Application

6 Security [Diagram labels: NCS RA KEYSTORE; Applicant identity independently verified by NCS; Panel award access to NCS; CLIENT; CSR; NCS RA signs key pair; NCS RA public key; NCS RA exports signed certificate; Passcode & signed PFX; Signed certificate imported into browser]

7 Sample Submission

8 Status Monitoring NCS CLIENT

9 Data Collection [Flowchart labels: Diffraction; Unit Cell; Success; Strategy; Data Collection; Data Process; System Y; PreScans; Yes; BruNo Mount; BruNo Unmount; Setup via GUI; Sample Tray; No]

10 Data Collection Metadata capture

11 Data Collection

12 Automatic Structure Solution
• Background process designed to adopt the ‘Human Approach’, using refinement indicators and structural knowledge
• Incorporates all ‘Q peaks’ above a cut-off as C atoms
• Rejects atoms on the basis of thermal parameters, adjusts atom types accordingly & iterates
• Hybridisation & hydrogens from connectivity & difference map peaks, then fixed
• Usual crystallographic validation performed, introducing ‘chemical validation’
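The assign, reject, adjust and iterate loop described on this slide can be sketched roughly as follows. This is only an illustration of the idea, not the NCS implementation: the `Atom` class, the `refine` callback (a stand-in for a real refinement program), the thermal-parameter thresholds and the rule of promoting a carbon with a very small thermal parameter to nitrogen are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Atom:
    label: str
    element: str
    u_iso: float = 0.0  # isotropic thermal parameter, filled in by refinement


def assign_initial_atoms(q_peaks: List[Tuple[str, float]], cutoff: float) -> List[Atom]:
    """Treat every Q peak whose height exceeds the cut-off as a carbon atom."""
    return [Atom(label, "C") for label, height in q_peaks if height > cutoff]


def solve_structure(
    q_peaks: List[Tuple[str, float]],
    cutoff: float,
    refine: Callable[[List[Atom]], List[Atom]],  # hypothetical refinement step
    u_reject: float = 0.15,  # assumed threshold: larger values are rejected
    u_heavy: float = 0.01,   # assumed threshold: smaller values suggest a heavier atom
    max_cycles: int = 10,
) -> List[Atom]:
    atoms = assign_initial_atoms(q_peaks, cutoff)
    for _ in range(max_cycles):
        atoms = refine(atoms)
        # Reject atoms whose thermal parameter is implausibly large.
        kept = [a for a in atoms if a.u_iso <= u_reject]
        changed = len(kept) != len(atoms)
        # Adjust atom types: an unusually small value hints at a heavier element.
        for atom in kept:
            if atom.element == "C" and atom.u_iso < u_heavy:
                atom.element = "N"  # assumed promotion rule, for illustration only
                changed = True
        atoms = kept
        if not changed:
            break  # model is stable, stop iterating
    return atoms
```

The loop stops as soon as a cycle neither rejects an atom nor changes an atom type, which is one simple way to express ‘iterate’; a real procedure would also use the refinement indicators mentioned above.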

13 Data Overload & the Publication Problem 25,000,000 2,000,000 300,000

14 Current Publishing Protocols
• Aims, intellectual ideas, conclusions
• Inferences, interpretation, derived results
• Raw & underlying data

15 The Open Archive Solution?

16 Separating Data from Interpretations [Diagram labels: Underlying data; Intellect & Interpretation]

17 The Open Archive Solution for Data
[Diagram labels]
• Research & e-Science workflows
• Aggregator services: national, commercial
• Repositories: institutional, e-prints, subject, data, learning objects
• Data curation: databases & databanks
• Validation
• Harvesting metadata
• Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media
• Deposit / self-archiving
• Peer-reviewed publications: journals, conference proceedings
• Publication
• Validation
• Data analysis, transformation, mining, modelling
• Searching, harvesting, embedding
• Presentation services: subject, media-specific, data, commercial portals
• Resource discovery, linking, embedding
• Linking

18 Workflow (RAW DATA, DERIVED DATA, RESULTS DATA)
• Initialisation: mount new sample on diffractometer & set up data collection
• Collection: collect data
• Processing: process and correct images
• Solution: solve structures
• Refinement: refine structure
• CIF: produce CIF (Crystallographic Information File)
• Validation: chemical & crystallographic checks
• Report: generate Crystal Structure Report
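As a minimal sketch, the workflow stages on this slide could be held as an ordered enumeration so that software can track where a sample is and what comes next. The `Stage` and `next_stage` names are hypothetical and are not taken from the NCS software; only the stage names and descriptions come from the slide.

```python
from enum import Enum
from typing import Optional


class Stage(Enum):
    """Workflow stages, in the order listed on the slide."""
    INITIALISATION = "mount new sample on diffractometer & set up data collection"
    COLLECTION = "collect data"
    PROCESSING = "process and correct images"
    SOLUTION = "solve structures"
    REFINEMENT = "refine structure"
    CIF = "produce CIF (Crystallographic Information File)"
    VALIDATION = "chemical & crystallographic checks"
    REPORT = "generate Crystal Structure Report"


def next_stage(stage: Stage) -> Optional[Stage]:
    """Return the stage that follows `stage`, or None after the last one."""
    stages = list(Stage)  # Enum iteration preserves definition order
    index = stages.index(stage)
    return stages[index + 1] if index + 1 < len(stages) else None


if __name__ == "__main__":
    for stage in Stage:
        print(f"{stage.name.title()}: {stage.value}")
```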

19 Simple Deposition Metadata ‘attached’

20 An Archive Entry ecrystals.chem.soton.ac.uk

21 Access to ALL underlying data

22 Metadata Publication
Using simple Dublin Core:
• Crystal structure
• Title (Systematic IUPAC Name)
• Authors
• Affiliation
• Creation Date
Additional chemical information through Qualified Dublin Core:
• Empirical formula
• International Chemical Identifier (InChI)
• Compound Class
• Keywords
• Specifies which ‘datasets’ are present in an entry
• DOI
• Rights
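A simple Dublin Core record of the kind listed on this slide might be assembled along these lines. This is a sketch under assumptions: the mapping of slide fields to Dublin Core elements (authors to `creator`, creation date to `date`, DOI to `identifier`), the `record` wrapper element and every field value are illustrative placeholders, not the actual eCrystals schema. Affiliation and the Qualified Dublin Core chemical fields are left out because their encoding is not given here.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# Placeholder values only; none of these describe a real archive entry.
EXAMPLE_FIELDS = [
    ("type", "Crystal structure"),
    ("title", "Example systematic IUPAC name"),
    ("creator", "A. N. Author"),
    ("date", "2005-01-01"),
    ("identifier", "doi:10.0000/example"),
    ("rights", "Example rights statement"),
]


def build_dc_record(fields):
    """Build an XML element holding one dc:* child per (name, value) pair."""
    record = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields:
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record


if __name__ == "__main__":
    print(ET.tostring(build_dc_record(EXAMPLE_FIELDS), encoding="unicode"))
```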

23 Harvesting & Aggregating: Google

24 OAI Harvesting & Aggregating OAIster: Generic

25 OAI Harvesting & Aggregating eBank: Subject Specific

26 OAI Harvesting & Aggregating PSIgate: Service Provider
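For readers unfamiliar with OAI harvesting, the sketch below shows the general shape of an OAI-PMH `ListRecords` request and how a harvester might pull Dublin Core titles out of the response. The `verb` and `metadataPrefix` parameters and the XML namespaces come from the OAI-PMH 2.0 specification; the base URL in the usage lines and the canned `SAMPLE_RESPONSE` are hypothetical, and nothing here reflects how OAIster, eBank or PSIgate are actually implemented.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

NAMESPACES = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Canned, simplified response so the example runs without network access.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example crystal structure</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL for a repository."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def extract_titles(xml_text):
    """Return every dc:title found in an OAI-PMH response document."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iterfind(".//dc:title", NAMESPACES)]


if __name__ == "__main__":
    # Hypothetical endpoint; a real repository publishes its own base URL.
    print(list_records_url("http://example.org/oai"))
    print(extract_titles(SAMPLE_RESPONSE))
```

A real harvester would also follow the protocol's resumption tokens to page through large result sets; that is omitted to keep the sketch short.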

27 ‘Value Added’ Studies Courtesy: Thomas Gelbrich

28 ‘Value Added’ Studies

29 ‘Value Added’ Studies [Figure labels: X~I; X~CF3; CH3~CF3 (X = CF3, I, Br, Cl, F, H); Br~Br (iii); I~Cl; I~Br (ii); I~I (iii); CN~Br (i); CN~CN; C2; C3; I-Dimer; C1; Br~Br (ii); Br~Br (i); I~Br; CF3~Cl; 1D; 1D; 0D; 2D; 3D]

30 ‘Value Added’ Studies Five structures based on C1 stacks

31 Thanks
NCS: Mike Hursthouse, Mark Light, Peter Horton, Ann Bingham
CombeChem: Jeremy Frey, Sam Peppe, Paul Walker
IT Innovation: Mike Surridge, Ken Meacham, Steve Taylor, Darren Marvin
ECS: Dave de Roure, Hugo Mills, Graham Smith, Les Carr, Chris Gutteridge
eBank / UKOLN / PSIgate: Liz Lyon, Rachel Heery, Monica Duke, Michael Day, Andy Powell, John Blundon-Ellis
££££($$$$)’s

32 Take-Home Message “The internet wasn't created for mockery! It was created so scientists from different universities could share datasets....” Simpson, H. The Simpsons (2005), Eds. Groening, M., Brooks, J.L. & Simon, S., Series 16, Episode 8, Original air date (US) 06-Feb-2005. http://www.tvtome.com/tvtome/servlet/GuidePageServlet/showid-146/epid-346864/

