eCrystals Federation: Open Repositories for Open Science


1 eCrystals Federation: Open Repositories for Open Science
Dr Liz Lyon, UKOLN, University of Bath, UK
Dr Simon Coles, University of Southampton, UK
Dr Manjula Patel, UKOLN, University of Bath, UK
CNI Taskforce Meeting, Washington DC, December 2007
This work is licensed under a Creative Commons Licence Attribution-ShareAlike 3.0

2 Overview
Chemistry and Open Science: context and practice
Lessons learnt from eBank Phase 3
Data curation and preservation issues
Setting up the Federation: challenges ahead?

3 Chemistry and Open Science: context and practice

4 Social networks for chemists….
New postgraduate cohorts : millennials / Google generation : new behaviours

5 >8000 views Community content for chemists : rich media video + paper = Pubcast

6 At the coalface: tagging & sharing workflows
Astronomy, Bioinformatics, Chemistry, Social Science pilots. Universities of Manchester & Southampton

7 “Small science” : sharing in the lab

8 Open Wetware Laboratory wikis

9 Transforming practice?
2006 Open Notebook Science (ONS)
26 September: first use of the term, blogged by Jean-Claude Bradley, Drexel University

10 2007
27 March: ONS at American Chemical Society Symposium
7 August: ONS poster in Second Life on Nature island
24 September: ONS case studies in Second Life
4 October: > 43,000 hits in Google for term ONS

11 10 & 15 October: Policy lists, DabbleDB membership database created (US)
11 October: ONS experiment starts in Cambridge, UK
7 November: Cameron Neylon (Univ Southampton / STFC, UK) posts “Sourceforge for Science” concept

12 10 November: Open Data for common molecules - Wikichemicals?
Peter Murray-Rust’s blog at Univ. Cambridge, UK
27 November: Research Network proposal submitted to UK research council
Yesterday: about 2,400,000 Google hits for Open Notebook Science
New ideas are surfacing very fast, with instant development, testing and take-up…

13 eBank Project – building the eCrystals Data Repository
eBank-UK Phase Institutional Repository exemplar

14 Metadata Publication
Using simple Dublin Core: Crystal structure; Title (Systematic IUPAC Name); Authors; Affiliation; Creation Date
Additional chemical information through Qualified Dublin Core: Empirical formula; International Chemical Identifier (InChI); Compound Class & Keywords
Specifies which ‘datasets’ are present in an entry
DOI
Rights & Citation
Application Profile
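As a rough illustration of the split this slide describes, the Python sketch below builds one record with simple Dublin Core elements plus chemistry-specific terms. The `ec` namespace URI, the element names `empiricalFormula`, `inchi`, `compoundClass` and `dataset`, the mapping of affiliation to `dc:publisher`, and all sample values are assumptions made for illustration; they are not the actual eBank application profile.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"      # simple Dublin Core namespace
EC = "http://example.org/ecrystals/terms/"   # hypothetical namespace (assumption)

ET.register_namespace("dc", DC)
ET.register_namespace("ec", EC)


def build_record(title, authors, affiliation, created,
                 formula, inchi, compound_class, datasets):
    """Build one metadata record as an XML element."""
    record = ET.Element("record")

    # Simple Dublin Core part of the record.
    ET.SubElement(record, f"{{{DC}}}title").text = title
    for author in authors:
        ET.SubElement(record, f"{{{DC}}}creator").text = author
    # Mapping affiliation to dc:publisher is an assumption of this sketch.
    ET.SubElement(record, f"{{{DC}}}publisher").text = affiliation
    ET.SubElement(record, f"{{{DC}}}date").text = created

    # Chemistry-specific part; these element names are illustrative only.
    ET.SubElement(record, f"{{{EC}}}empiricalFormula").text = formula
    ET.SubElement(record, f"{{{EC}}}inchi").text = inchi
    ET.SubElement(record, f"{{{EC}}}compoundClass").text = compound_class
    for name in datasets:
        ET.SubElement(record, f"{{{EC}}}dataset").text = name
    return record


record = build_record(
    title="benzene",                       # systematic name of the compound
    authors=["A. Researcher", "B. Crystallographer"],  # placeholder names
    affiliation="Example University",
    created="2007-12-01",
    formula="C6H6",
    inchi="InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H",
    compound_class="aromatic hydrocarbon",
    datasets=["raw", "derived", "results"],
)
print(ET.tostring(record, encoding="unicode"))
```

Keeping the generic and the chemistry-specific terms in separate namespaces means a general-purpose harvester can read the Dublin Core part while ignoring fields it does not understand.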

15 wikis blogs Publish Harvest

16 Lessons learnt from eBank Phase 3

17 Study Aims and Approach
Scoping the eCrystals Federation of crystallography data repositories
Questionnaire and interview-based
Joint Consultation Workshop (eBank, R4L, SPECTRa) & Report
Engage whole data lifecycle community: crystallographers, central facilities, publishers, data centres, and chemical information specialists
Mixed project team: Chemists, Digital Library researchers & Computer Scientists

18 Lessons: Policy and practice
Must be considered at the level of the Institution and the practising Laboratory
Mixed lab practice: central service facility versus single “staff crystallographer” in department
“Repository Lite” for smaller lab operations?
Established data ‘publication’ practice + domain subject repository: Cambridge Crystallographic Data Centre (CCDC)
Institutional policy buy-in is essential
Demonstrate benefits and added value to senior managers
Implications for information services structure

19 Interoperability & Standards
Instrument manufacturers' proprietary formats
Technical software platform
Metadata schema: Application profiles
Standards and identifiers: International Chemical Identifier (InChI), DOI, CIF, CML, de facto software
Semantic interoperability
X-ray diffractometers

20 Subject Repositories, Publishing and IPR
Established subject repository at CCDC (40 years old!): repository interactions?
The “embargo problem”: prior dissemination affecting publication of journal article
Cultural issues related to chemists: “it's my data” (journal article will always be sacred)
Mechanisms for sharing with collaborators and referees prior to publication?

21 Advocacy
The most important issue?!?

22 Data curation and preservation issues

23 Digital Curation Centre http://www.dcc.ac.uk/
Community Development work Led by UKOLN eBank/eCrystals partner

24 eBank-UK Phase 3 Curation & Preservation Study
Examined four main areas:
Audit and certification (TRAC, DRAMBORA, NESTOR, ISO International repository audit and certification BOF Group)
The Open Archival Information System (OAIS) and Representation Information (RI)
eBank-UK application profile and preservation metadata
ePrints.org repository platform

25 Observations & Recommendations 1
Self-assessment using DRAMBORA toolkit
Engage DCC audit & certification team
Formulation of long-term objectives and policy
Deposit agreements
Services
Aim for community-supported sustainability plan
Implement regular audits: annual
Produce evidence of compliance
Documentation
Transparency
Adequacy
Measurability
Federation context

26 Observations & Recommendations 2
Maintenance and open access of critical file formats and software
Work-up software, e.g. XPREP
Export raw data from instrumentation as imgCIF
Consider Representation Information (RI) in context of whole crystallography landscape (CCDC, IUCr etc.)
Develop a preservation and curation strategy and formal policies to indicate levels of service
Deposit, ingest, validation, dissemination
Consider services to be developed over the DCC Registry/Repository of Representation Information (RRoRI)

27 Observations & Recommendations 3
Develop preservation strategy & plan for the specific content
Capture preservation metadata, including versioning and provenance information
PREMIS Data Dictionary Semantic Units (e.g. file format, significant properties, provenance, fixity info)
Extend eBank metadata application profile (AP)?
Obtain consensus on AP
Seek to automate metadata generation, extraction, maintenance
ePrints.org support for information packages
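To make the "fixity info" and "automate metadata generation" points concrete, here is a minimal Python sketch that computes a checksum for a deposited file and later re-checks it. The dictionary keys only loosely echo PREMIS semantic units and are assumptions of this sketch, as are the file name and sample bytes; it is not the eBank or ePrints.org implementation.

```python
import hashlib
from datetime import datetime, timezone


def fixity_record(data: bytes, file_name: str, file_format: str) -> dict:
    """Generate a small preservation-metadata record for one file."""
    return {
        "file_name": file_name,
        "format_name": file_format,
        "size_bytes": len(data),
        "message_digest_algorithm": "SHA-256",
        "message_digest": hashlib.sha256(data).hexdigest(),
        # Timestamp of when the digest was taken, kept as provenance.
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


def verify_fixity(data: bytes, record: dict) -> bool:
    """Return True if the file content still matches the stored digest."""
    return hashlib.sha256(data).hexdigest() == record["message_digest"]


# Hypothetical file content standing in for a deposited CIF file.
original = b"data_example\n_cell_length_a 10.0\n"
record = fixity_record(original, "example.cif", "CIF")

print(verify_fixity(original, record))            # unchanged content
print(verify_fixity(original + b"x", record))     # altered content
```

Running the check as part of a regular audit gives a repository simple, measurable evidence that deposited files have not changed since ingest.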

28 Setting up the Federation: Challenges ahead?

29 eCrystals Federation Data Deposit Model
[Diagram labels: Funder; Data centres / aggregator services; Advisory; Scientist; Create; Deposit; IR; Federation; Curate; Preserve; Standards; Policy; Advocacy; Training; Harvest; Collaborate; Share; Link; Discover; Re-use; Publishers; User]

30 Repository deployment & support
Roll-out in 2 phases
Universities Sydney, Glasgow, Newcastle with eprints.org platform
Universities Cambridge, STFC, ReciprocalNet, ARCHER with other platforms
Information Environment Service Registry (IESR) listing Federation Collections

31 Laboratory Workflow & Provenance
Achieving end-to-end workflows: avoiding fragmentation of data, results and interpretations
Account for differing laboratory practice
[Diagram labels: Raw Data; Public domain material; RAW DATA; DERIVED DATA; RESULTS DATA]

32 Repository interoperability & linking services
Establish core Federation application profile and mappings
Bi-directional links with derived articles in “publisher repositories”: IUCr, Royal Society of Chemistry (RSC), Chemistry Central
Test linking options: StORe middleware and CLADDIER (JISC-funded projects)
OAI-ORE Pathways Project developments

33 Interoperability testbed
Experimental data sets + metadata as compound objects
Dublin Core and METS not sufficient
OAI-ORE (base: Atom Publishing Protocol) testbed
Enable 3rd party services e.g. data / text mining
eChemistry project
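The idea of treating data sets plus metadata as one compound object can be sketched as a simple in-memory structure in Python. This is only a conceptual model under stated assumptions: the class names, the `role` values, the media types and the example.org URIs are all hypothetical, and the sketch does not reproduce any OAI-ORE serialization.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AggregatedResource:
    uri: str          # where the component can be fetched
    role: str         # e.g. "raw", "derived", "results", "metadata"
    media_type: str   # format hint for third-party services


@dataclass
class CompoundObject:
    uri: str                      # identifier for the aggregation as a whole
    title: str
    resources: List[AggregatedResource] = field(default_factory=list)

    def add(self, uri: str, role: str, media_type: str) -> None:
        """Attach one component (data file or metadata record)."""
        self.resources.append(AggregatedResource(uri, role, media_type))

    def by_role(self, role: str) -> List[str]:
        """List the URIs of all components that play the given role."""
        return [r.uri for r in self.resources if r.role == role]


# Hypothetical entry: one crystal structure with its data files and metadata.
entry = CompoundObject("http://example.org/entry/1", "benzene")
entry.add("http://example.org/entry/1/frames.zip", "raw", "application/zip")
entry.add("http://example.org/entry/1/structure.cif", "results", "chemical/x-cif")
entry.add("http://example.org/entry/1/dc.xml", "metadata", "application/xml")

print(entry.by_role("results"))
```

A data- or text-mining service could use such a structure to pick out only the components it needs, for example the results files, without downloading the raw data.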

34 Enabling data discovery
Royal Society of Chemistry Project Prospect: tagging & semantic linking

35 Preservation & Sustainability
DRAMBORA assessment: use DRAMBORA Interactive
Enhance Application Profile with PREMIS preservation metadata
Populate RRoRI with crystallography representation information
Examine repository platform conformance to OAIS Reference Model
Survey partner institutional preservation policies

36 Embedding into current publishing practice
Chemists still want to publish scholarly articles
Blogs and repositories are a new form of rapid communication, but there are prior publication concerns
Timing of release of data into public domain and formal publication will be crucial
Repository must provide control over timing of public visibility
EPrints3 version of eCrystals has ‘embargo tokens’
Validation and quality in an ‘Open’ world
Quality indicators?
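Control over the timing of public visibility can be illustrated with a small date check in Python. The slide does not describe how the EPrints3 ‘embargo tokens’ work, so the function name, the use of a release date and the "no date means still embargoed" rule are all assumptions of this sketch.

```python
from datetime import date
from typing import Optional


def is_publicly_visible(release_date: Optional[date], today: date) -> bool:
    """Decide whether a deposited entry may be shown to the public.

    release_date is None while the depositor has not yet chosen a date,
    in which case the entry stays hidden.
    """
    return release_date is not None and today >= release_date


# Hypothetical entry embargoed until the related article is published.
embargo_until = date(2008, 3, 1)

print(is_publicly_visible(embargo_until, date(2008, 2, 28)))  # before release
print(is_publicly_visible(embargo_until, date(2008, 3, 1)))   # on release day
print(is_publicly_visible(None, date(2008, 3, 1)))            # no date set
```

Passing `today` in explicitly, rather than reading the clock inside the function, keeps the rule easy to test and to audit.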

37 Advocacy
Chemists still wary of ‘Open Access’
eCrystals Roadshow Workshops engaging both crystallographers and their service ‘users’ in the workplace
Open forum at International Union of Crystallography world congress (Aug 2008)
Publishers Workshop to demonstrate co-existence of open data models & traditional scholarly articles

38 Questions?
Slides will be available at :
This work is licensed under a Creative Commons Licence Attribution-ShareAlike 3.0

