‘The eCrystals Federation’
Management and Publication of Small Molecule Structure Data for the Whole Crystallographic Community

S.J. Coles a*, J.G. Frey a, M.B. Hursthouse a, A.J. Milsted, L. Carr b, M. Duke c, T. Koch c & E. Lyon c.
a School of Chemistry; b School of Electronics and Computer Science, University of Southampton, UK; c UKOLN, University of Bath, UK.

The Publication Problem

The UK funding councils recently stated that ‘the data underpinning the published results of publically-funded research should be made available as widely and rapidly as possible’. Thirty years ago a research student would present about five crystal structures as their PhD thesis; with modern technologies and good crystals, this can now be achieved in a single morning. This increase in the pace of generation exacerbates a problem in the communication of results. Additionally, the publication of a crystal structure report is generally coupled with, and often governed by, the underlying chemistry; it is therefore subject to the lengthy peer review process and tied to the timing of the publication as a whole. This bottleneck in the dissemination of crystal structure data hinders the potential growth of databases (just 500,000 small unit cell crystal structures are available in the CSD, ICSD & CRYSMET, while it is estimated that at least three times this number have been determined in laboratories worldwide). In addition, publication in the mainstream literature still offers only indirect (and often subscription-controlled) access to these data.

The Institutional Data Archive

The eBank-UK (http://www.ukoln.ac.uk/projects/ebank-uk/) project has addressed the publication problem by establishing an institutional data archive. On the one hand, this archive is capable of supporting and managing ALL the digital files generated during the course of a crystallographic experiment.
On the other hand, it is capable of acting as a dissemination tool, by making metadata relating to these crystallographic datasets available in the public domain. This process alters the traditional method of peer review by openly providing crystal structure data, so that the reader or user may directly check correctness and validity. The repository (http://ecrystals.chem.soton.ac.uk) makes available all the raw, derived and results data from a crystallographic experiment with little further researcher effort after the creation of a normal completed structure in a laboratory archive. Not only does this approach allow rapid release of crystal structure data into the public domain, but it also provides a mechanism for the construction of value-added services that allow rapid discovery of the data for further studies and reuse, whilst ownership of the data is retained by the creator.

Access to ALL the data

The archive makes available all results data, including a CML file and a CIF accompanied by a CHECKCIF validation output. Important files generated during the experiment are also provided, e.g. final refinement listings, details of all scans and corrections performed, and precession photographs. Thus COMPLETE details of all the steps undertaken during the analysis are provided, and anyone wishing to reuse the structure can fully assess its validity.

Getting data into the archive

The archive is configured to recognise all the files generated during the course of a crystallographic experiment. It is also necessary to perform numerous operations and file format conversions to generate an archive-compliant dataset. A ‘toolbox’ has been created which performs these operations seamlessly, so that minimal human error is introduced. Further metadata are associated with the dataset during the deposition process by means of a simple interface.
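As an illustration of how deposition metadata might be pre-filled from the results files, the sketch below reads simple single-line items from a CIF. It is not the eCrystals toolbox: the function names, the choice of fields and the sample values are assumptions made for this example, and the reader deliberately ignores loop_ tables and multi-line text fields, so it is not a full CIF parser.

```python
def read_cif_items(cif_text):
    """Collect single-line `_name value` items from CIF text.

    Simplification (assumption of this sketch): loop_ tables and
    semicolon-delimited multi-line text fields are skipped.
    """
    items = {}
    in_text_field = False
    for raw in cif_text.splitlines():
        # A semicolon in the first column opens or closes a text field.
        if raw.startswith(";"):
            in_text_field = not in_text_field
            continue
        if in_text_field:
            continue
        parts = raw.strip().split(None, 1)
        # Keep only lines of the form "_data_name value".
        if len(parts) == 2 and parts[0].startswith("_"):
            value = parts[1].strip()
            if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
                value = value[1:-1]  # drop matching surrounding quotes
            items[parts[0].lower()] = value  # CIF data names are case-insensitive
    return items


def deposit_metadata(cif_text):
    """Pick a few items that a deposit form could pre-fill (hypothetical field names)."""
    items = read_cif_items(cif_text)
    return {
        "empirical_formula": items.get("_chemical_formula_sum"),
        "space_group": items.get("_symmetry_space_group_name_h-m"),
    }


# Made-up sample data, for illustration only.
SAMPLE_CIF = """\
data_example
_chemical_formula_sum 'C6 H12 O6'
_cell_length_a 7.123(2)
_symmetry_space_group_name_H-M 'P 21/c'
_publ_section_comment
;
Free text that this sketch skips.
;
loop_
_atom_site_label
_atom_site_fract_x
C1 0.1234
"""

if __name__ == "__main__":
    print(deposit_metadata(SAMPLE_CIF))
```

Metadata that cannot be derived from the files, such as keywords or compound class, would still be entered through the deposition interface.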
Publicising and harvesting content

Metadata relating to the dataset are made available to a public interface via a digital libraries protocol (OAI-PMH) that enables third parties to ‘harvest’ information on the content of the archive. Primary bibliographic data, e.g. title (IUPAC name), authors & affiliation, are provided in addition to chemical metadata, e.g. International Chemical Identifier (InChI), empirical formula, compound class & keywords. The dataset is registered with a persistent identifier (DOI), which enables the generation of a permanent citation. The OAI-PMH metadata also state which aspects of the experimental process contain files, so that a harvester may assess whether an entry is appropriate for the desired purpose.

Data aggregation services

Information providers may regularly probe the archive interface for new or updated entries and download the associated metadata. These services can then ‘aggregate’ the metadata, that is, perform linking and cross-referencing exercises that enable the researcher to navigate seamlessly through the literature. Data-based aggregators may discover relevant data and download it for indexing and inclusion in their collections. Additionally, subject-based or academic-literature-based services may use the harvested metadata to associate a dataset with other relevant works in the literature.

The Future: Institutional Support, Further Deployments & Third Party Services

eCrystals has been devised as part of a project addressing the challenge of whole-lifecycle use of data, by investigating the role of aggregator services in linking datasets to peer-reviewed articles. UKOLN (University of Bath) and the eCrystals team have designed a prototype service that takes metadata harvested from the archive and aggregates it with the primary crystallographic literature (IUCr journals).
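The harvesting step described under ‘Publicising and harvesting content’ can be sketched as follows. This is a minimal illustration, not the prototype service: the base URL, the function names and the sample response are assumptions, and it uses the generic oai_dc (Dublin Core) format that every OAI-PMH repository must support, rather than whatever richer chemical metadata schema the archive actually exposes. The sketch parses a hand-written response instead of contacting a server.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, since=None, metadata_prefix="oai_dc"):
    """Build a ListRecords request; `since` limits it to new or updated entries."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if since:
        params["from"] = since
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Extract identifier, title and creators from a ListRecords response."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        records.append({
            "identifier": rec.findtext("oai:header/oai:identifier", namespaces=NS),
            "title": rec.findtext(".//dc:title", namespaces=NS),
            "creators": [c.text for c in rec.iterfind(".//dc:creator", NS)],
        })
    return records


# Hand-written sample response with made-up values, for illustration only.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:42</identifier>
        <datestamp>2006-01-15</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example compound name</dc:title>
          <dc:creator>Researcher, A.</dc:creator>
          <dc:creator>Researcher, B.</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""

if __name__ == "__main__":
    # Hypothetical endpoint; a real harvester would fetch this URL.
    print(list_records_url("http://archive.example.org/oai", since="2006-01-01"))
    for record in parse_records(SAMPLE_RESPONSE):
        print(record)
```

A real harvester would also follow the resumptionToken that OAI-PMH uses to page through large result sets, which is omitted here for brevity.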
Future work in this area will focus on enabling data-based harvesters to harvest datasets automatically, so that eCrystals entries can be indexed and incorporated into subject-specific databases (e.g. CSD). The prototype service will be developed further to provide a mechanism to aggregate datasets with the broader chemical literature and other bodies of publicly available chemical information.

Current developments include securing backing from host institutions: we are in the final stages of making an agreement with the University of Southampton to support this archive as part of its Institutional Repository scheme, hosted by our Library and Information Services department. In addition, further installations in other institutions are planned so that a ‘federation’ of archives can be constructed, which will enrich the content of third party aggregator services and promote their development.

The authors would like to acknowledge and thank the Joint Information Systems Committee (JISC) for funding this project.