eCrystals Federation: Open Repositories for Open Science

Presentation transcript:

eCrystals Federation: Open Repositories for Open Science
Dr Liz Lyon, UKOLN, University of Bath, UK
Dr Simon Coles, University of Southampton, UK
Dr Manjula Patel, UKOLN, University of Bath, UK
CNI Taskforce Meeting, Washington DC, December 2007
This work is licensed under a Creative Commons Licence Attribution-ShareAlike 3.0 http://creativecommons.org/licenses/by-sa/3.0/

Overview
Chemistry and Open Science: context and practice
Lessons learnt from eBank Phase 3
Data curation and preservation issues
Setting up the Federation: Challenges ahead?

Chemistry and Open Science: context and practice

Social networks for chemists
New postgraduate cohorts: millennials / Google generation: new behaviours

>8000 views Community content for chemists : rich media video + paper = Pubcast

At the coalface: tagging & sharing workflows Astronomy, Bioinformatics, Chemistry, Social Science pilots. Universities of Manchester & Southampton

“Small science” : sharing in the lab

Open Wetware Laboratory wikis

Transforming practice? Open Notebook Science (ONS)
2006
26 September: first use of the term, blogged by Jean-Claude Bradley, Drexel University

2007
27 March: ONS at American Chemical Society Symposium
7 August: ONS Poster in Second Life on Nature island
24 September: ONS Case Studies in Second Life
4 October: > 43,000 hits in Google for term ONS

10 & 15 October: Policy lists, DabbleDB membership database created, US
11 October: ONS experiment starts in Cambridge, UK
7 November: Cameron Neylon (Univ Southampton / STFC, UK) posts “Sourceforge for Science” concept

10 November: Open Data for common molecules - Wikichemicals? Peter Murray-Rust’s blog at Univ. Cambridge, UK
27 November: Research Network proposal submitted to UK research council
Yesterday: about 2,400,000 Google hits for Open Notebook Science
New ideas are surfacing very fast with instant development, testing and take-up...

eBank Project: building the eCrystals Data Repository
eBank-UK Phase 1, 2003
Institutional Repository exemplar http://ecrystals.chem.soton.ac.uk

Metadata Publication
Using simple Dublin Core:
Crystal structure
Title (Systematic IUPAC Name)
Authors
Affiliation
Creation Date
Additional chemical information through Qualified Dublin Core:
Empirical formula
International Chemical Identifier (InChI)
Compound Class & Keywords
Specifies which ‘datasets’ are present in an entry
DOI http://dx.doi.org/10.1594/ecrystals.chem.soton.ac.uk/145
Rights & Citation http://ecrystals.chem.soton.ac.uk/rights.html
Application Profile http://www.ukoln.ac.uk/projects/ebank-uk/schemas/
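As an illustration of the simple Dublin Core layer listed on this slide, the following is a minimal sketch and not the eBank application profile itself. The `record` wrapper element, the function name and all field values other than the DOI and rights URL are assumptions made for the example; only the Dublin Core element namespace is standard.

```python
import xml.etree.ElementTree as ET

# Standard namespace of the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dc_record(fields):
    """Serialise (element_name, value) pairs as a simple Dublin Core record.

    A list of pairs is used so that repeated elements (for example
    several creators) keep their order.
    """
    root = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Placeholder values; only the DOI and rights URL are taken from the slide.
example_fields = [
    ("title", "EXAMPLE systematic IUPAC name"),
    ("creator", "A. N. Author"),
    ("date", "2007-12-01"),
    ("type", "Crystal structure"),
    ("identifier", "http://dx.doi.org/10.1594/ecrystals.chem.soton.ac.uk/145"),
    ("rights", "http://ecrystals.chem.soton.ac.uk/rights.html"),
]

record_xml = build_dc_record(example_fields)
print(record_xml)
```

The chemistry-specific fields (empirical formula, InChI, compound class) would sit in the Qualified Dublin Core layer defined by the application profile and are deliberately left out of this sketch.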

[Diagram labels: wikis, blogs, Publish, Harvest]

Lessons learnt from eBank Phase 3

Study Aims and Approach
Scoping the eCrystals Federation of crystallography data repositories
Questionnaire and interview-based
Joint Consultation Workshop (eBank, R4L, SPECTRa) & Report
Engage whole data lifecycle community: crystallographers, central facilities, publishers, data centres, and chemical information specialists
Mixed project team: chemists, digital library researchers & computer scientists

Lessons: Policy and practice
Must be considered at level of the institution and the practising laboratory
Mixed lab practice: central service facility versus single “staff crystallographer” in department
“Repository Lite” for smaller lab operations?
Established data ‘publication’ practice + domain subject repository: Cambridge Crystallographic Data Centre (CCDC)
Institutional policy buy-in is essential
Demonstrate benefits and added value to senior managers
Implications for information services structure

Interoperability & Standards
Instrument manufacturers' proprietary formats
Technical software platform
Metadata schema: Application profiles
Standards and identifiers: International Chemical Identifier (InChI), DOI, CIF, CML, de facto software
Semantic interoperability
X-ray diffractometers

Subject Repositories, Publishing and IPR
Established subject repository at CCDC (40 years old!): repository interactions?
The “embargo problem”: prior dissemination affecting publication of journal article
Cultural issues related to chemists: “it's my data” (journal article will always be sacred)
Mechanisms for sharing with collaborators and referees prior to publication?

Advocacy
The most important issue?!?

Data curation and preservation issues

Digital Curation Centre http://www.dcc.ac.uk/
Community Development work
Led by UKOLN
eBank/eCrystals partner

eBank-UK Phase 3 Curation & Preservation Study http://www.ukoln.ac.uk/projects/ebank-uk/curation/
Examined four main areas:
Audit and certification (TRAC, DRAMBORA, NESTOR, ISO International repository audit and certification BOF Group)
The Open Archival Information System (OAIS) and Representation Information (RI)
eBank-UK application profile and preservation metadata
ePrints.org repository platform

Observations & Recommendations 1
Self-assessment using DRAMBORA toolkit
Engage DCC audit & certification team
Formulation of long-term objectives and policy
Deposit agreements, services
Aim for community-supported sustainability plan
Implement regular audits: annual
Produce evidence of compliance
Documentation, transparency, adequacy, measurability
Federation context

Observations & Recommendations 2
Maintenance and open access of critical file formats and software
Work-up software e.g. XPREP
Export raw data from instrumentation as imgCIF
Consider Representation Information (RI) in context of whole crystallography landscape (CCDC, IUCr etc.)
Develop a preservation and curation strategy and formal policies to indicate levels of service
Deposit, ingest, validation, dissemination
Consider services to be developed over the DCC Registry/Repository of Representation Information (RRoRI)

Observations & Recommendations 3
Develop preservation strategy & plan for the specific content
Capture preservation metadata, including versioning and provenance information
PREMIS Data Dictionary Semantic Units (e.g. file format, significant properties, provenance, fixity info)
Extend eBank metadata application profile (AP)?
Obtain consensus on AP
Seek to automate metadata generation, extraction, maintenance
ePrints.org support for information packages
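The recommendation to automate generation of preservation metadata such as fixity information can be sketched as below. This is an assumption-laden illustration: the dictionary keys are simplified, hypothetical names and not the PREMIS Data Dictionary semantic units themselves, and SHA-256 is chosen here only as an example digest algorithm.

```python
import hashlib
from datetime import datetime, timezone


def fixity_record(data: bytes, file_name: str, format_name: str) -> dict:
    """Generate a minimal fixity record for the content of one file.

    Key names are hypothetical simplifications, not PREMIS element names.
    """
    return {
        "file_name": file_name,
        "format_name": format_name,
        "message_digest_algorithm": "SHA-256",
        "message_digest": hashlib.sha256(data).hexdigest(),
        "size_bytes": len(data),
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


def verify_fixity(data: bytes, record: dict) -> bool:
    """Return True if the content still matches the recorded digest."""
    return hashlib.sha256(data).hexdigest() == record["message_digest"]


# Placeholder content standing in for a deposited CIF file.
sample = b"data_example\n_cell_length_a 10.0\n"
record = fixity_record(sample, "example.cif", "CIF")
print(record["message_digest_algorithm"], record["size_bytes"])
print(verify_fixity(sample, record))
```

Running `verify_fixity` as part of a regular audit would give one concrete form of the "evidence of compliance" mentioned in the first set of recommendations.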

Setting up the Federation: Challenges ahead?

eCrystals Federation Data Deposit Model
[Diagram. Actors: Scientist, IR, Federation, Data centres / aggregator services, Publishers, Funder, User. Activity labels: Create, Deposit, Curate, Preserve, Standards, Policy, Advocacy, Training, Advisory, Harvest, Collaborate, Share, Link, Discover, Re-use]

Repository deployment & support
Roll-out in 2 phases
Universities Sydney, Glasgow, Newcastle with eprints.org platform
Universities Cambridge, STFC, ReciprocalNet, ARCHER with other platforms
Information Environment Service Registry (IESR) listing Federation Collections

Laboratory Workflow & Provenance
Achieving end-to-end workflows: avoiding fragmentation of data, results and interpretations
Account for differing laboratory practice
[Diagram labels: Raw Data; Public domain material; RAW DATA, DERIVED DATA, RESULTS DATA]

Repository interoperability & linking services
Establish core Federation application profile and mappings
Bi-directional links with derived articles in “publisher repositories”: IUCr, Royal Society of Chemistry (RSC), Chemistry Central
Test linking options: StORe middleware and CLADDIER (JISC-funded projects)
OAI-ORE Pathways Project developments

Interoperability testbed
Experimental data sets + metadata as compound objects
Dublin Core and METS not sufficient
OAI-ORE (base: Atom Publishing Protocol) testbed
Enable 3rd party services e.g. data / text mining
eChemistry project
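The idea of treating experimental data sets plus metadata as one compound object can be sketched as a plain data structure. This is a rough illustration only, not an OAI-ORE serialisation: the class names, the `example.org` URIs and the media types are assumptions, and the stage names simply follow the raw / derived / results distinction from the laboratory workflow slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    """One file belonging to a crystal structure entry."""
    uri: str
    stage: str       # "raw", "derived" or "results"
    media_type: str


@dataclass
class CompoundObject:
    """An entry that groups its metadata with all of its data files."""
    uri: str
    title: str
    resources: List[Resource] = field(default_factory=list)

    def by_stage(self, stage: str) -> List[Resource]:
        """Return the grouped resources for one workflow stage."""
        return [r for r in self.resources if r.stage == stage]


# Hypothetical entry; URIs and media types are placeholders.
entry = CompoundObject(
    uri="http://example.org/entry/1",
    title="EXAMPLE crystal structure",
    resources=[
        Resource("http://example.org/entry/1/frames.img", "raw", "application/octet-stream"),
        Resource("http://example.org/entry/1/structure.hkl", "derived", "text/plain"),
        Resource("http://example.org/entry/1/structure.cif", "results", "text/plain"),
    ],
)

for res in entry.by_stage("results"):
    print(res.uri)
```

A third-party service such as a text or data miner could then ask for only the stage it needs, which is the kind of reuse the testbed is meant to enable.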

Enabling data discovery
Royal Society of Chemistry Project Prospect: tagging & semantic linking

Preservation & Sustainability
DRAMBORA Assessment: use DRAMBORA Interactive
Enhance Application Profile with PREMIS preservation metadata
Populate RRoRI with crystallography representation information
Examine repository platform conformance to OAIS Reference Model
Survey partner institutional preservation policies

Embedding into current publishing practice
Chemists still want to publish scholarly articles
Blogs and repositories are a new form of rapid communication, but there are prior publication concerns
Timing of release of data into public domain and formal publication will be crucial
Repository must provide control over timing of public visibility
EPrints3 version of eCrystals has ‘embargo tokens’
Validation and quality in an ‘Open’ world
Quality indicators?
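Control over the timing of public visibility can be sketched as a date check on each entry. This is a hypothetical model for illustration: it does not reproduce the EPrints3 ‘embargo tokens’ mechanism, whose details are not given in these slides, and the class, field and function names are assumptions.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Entry:
    """A repository entry with an optional embargo date."""
    entry_id: str
    embargo_until: Optional[date] = None  # None means no embargo


def is_publicly_visible(entry: Entry, today: date) -> bool:
    """An entry is public if it has no embargo or the embargo date has been reached."""
    return entry.embargo_until is None or today >= entry.embargo_until


def visible_entries(entries: List[Entry], today: date) -> List[Entry]:
    """Filter a list of entries down to those that may be shown publicly."""
    return [e for e in entries if is_publicly_visible(e, today)]


# Hypothetical entries: one embargoed until the linked article appears.
entries = [
    Entry("entry-1", embargo_until=date(2008, 6, 1)),
    Entry("entry-2"),
]
print([e.entry_id for e in visible_entries(entries, date(2008, 1, 15))])
```

Sharing with collaborators and referees before publication would need an additional access rule on top of this check, which is left out of the sketch.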

Advocacy
Chemists still wary of ‘Open Access’
eCrystals Roadshow
Workshops engaging both crystallographers and their service ‘users’ in the workplace
Open forum at International Union of Crystallography world congress (Aug 2008)
Publishers Workshop to demonstrate co-existence of open data models & traditional scholarly articles

Questions?
Slides will be available at: http://wiki.ecrystals.chem.soton.ac.uk/index.php
This work is licensed under a Creative Commons Licence Attribution-ShareAlike 3.0 http://creativecommons.org/licenses/by-sa/3.0/