Federation The eCrystals Federation Dr Simon Coles, University of Southampton, UK Dr Liz Lyon, UKOLN, University of Bath, UK Open Repositories 2008, University.


1 Federation The eCrystals Federation Dr Simon Coles, University of Southampton, UK Dr Liz Lyon, UKOLN, University of Bath, UK Open Repositories 2008, University of Southampton, April 2008 This work is licensed under a Creative Commons Licence Attribution-ShareAlike 3.0 http://creativecommons.org/licenses/by-sa/3.0/

2 Themes
1. Context: Open science, institutional data repositories, crystallography exemplar
2. Scale: repository federations
3. Integration: Lab workflow and semantic challenges
4. Longevity: Digital curation, preservation and sustainability
5. Community: DCC Data Forum

3 Open Science ….is happening now Blogging of results data Open grant proposals Community repositories for data Open Notebook Science (ONS) tutorials in Second Life

4 eBank Project – building the eCrystals Data Repository ePrints platform @ Southampton Institutional Repository exemplar Embedded in workflow http://ecrystals.chem.soton.ac.uk Started Sept 2003 Scholarly knowledge cycle context UKOLN-led interdisciplinary team

5 Scaling Up Report Interviews & analysis of a discipline: crystallography Synthesis: IR Policy & Practice, Laboratory Practice & Workflows, Technical Interoperability & Standards, Metadata Schema & Application Profiles, Semantic Interoperability, Data Citation, Identifiers & Linking, Federation Architectures & Third Party Services, Rights & Licensing, Data Quality & Validation, Preservation, Curation & Sustainability Recommendations, commentary

6 Scaling Up Report Phase 3 findings: Diverse lab practice LIMS and proprietary formats Data policy should reflect lab practice & institutional model Data quality criteria/validation Prior publication problem We need scalable assignment of terms for data discovery No discipline preservation model

7 nλ = 2 d sinθ The
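The relation on this slide is Bragg's law, which links the X-ray wavelength λ, the diffraction angle θ and the spacing d between lattice planes. As a minimal sketch (the function name and the sample values are illustrative assumptions, not taken from the presentation), it can be rearranged to solve for d:

```python
import math


def d_spacing(wavelength_angstrom, theta_degrees, order=1):
    """Solve n*lambda = 2*d*sin(theta) for the interplanar spacing d.

    The result is in the same unit as the wavelength (angstrom here).
    """
    sin_theta = math.sin(math.radians(theta_degrees))
    if sin_theta <= 0:
        raise ValueError("theta must lie strictly between 0 and 180 degrees")
    return order * wavelength_angstrom / (2.0 * sin_theta)


# Illustrative input only: Mo K-alpha radiation is roughly 0.7107 angstrom,
# and the 10 degree angle is an arbitrary example value.
spacing = d_spacing(0.7107, 10.0)
print(f"d = {spacing:.4f} angstrom")
```

At θ = 30° the sine is 0.5, so for a first-order reflection d equals the wavelength, which makes a convenient sanity check.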

8 eCrystals Repository ePrints.org v3.0

9 Repository Foundations Using simple Dublin Core Crystal structure Title (Systematic IUPAC Name) Authors Affiliation Creation Date Additional chemical information through Qualified Dublin Core Empirical formula International Chemical Identifier (InChI) Compound Class & Keywords Specifies which datasets are present in an entry Application Profile http://www.ukoln.ac.uk/projects/ebank-uk/schemas/ DOI links http://dx.doi.org/10.1594/ecrystals.chem.soton.ac.uk/145 Rights & Citation http://ecrystals.chem.soton.ac.uk/rights.html Learned society + subject repository support
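To make the split between simple and qualified Dublin Core concrete, the sketch below builds one record with Python's standard library. It is only an illustration: the `dc` namespace is the standard Dublin Core element set, but the `ex` namespace, the element names `empiricalFormula` and `inchi`, the `record` wrapper and all field values are placeholders assumed for this example. The real element names are defined by the eBank-UK application profile linked above and are not reproduced here.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"      # standard simple Dublin Core namespace
SKETCH = "http://example.org/ebank-sketch/"  # hypothetical stand-in for the application profile


def build_record(simple, chemical):
    """Build an XML record from simple DC fields plus chemistry-specific fields."""
    ET.register_namespace("dc", DC)
    ET.register_namespace("ex", SKETCH)
    record = ET.Element("record")
    # Simple Dublin Core: every field may repeat (e.g. several creators).
    for name, values in simple.items():
        for value in values:
            ET.SubElement(record, f"{{{DC}}}{name}").text = value
    # Additional chemical information, one value per field in this sketch.
    for name, value in chemical.items():
        ET.SubElement(record, f"{{{SKETCH}}}{name}").text = value
    return record


# Placeholder entry (benzene), not a real eCrystals record.
record = build_record(
    simple={
        "title": ["benzene"],
        "creator": ["A. Researcher", "B. Researcher"],
        "date": ["2008-04-01"],
    },
    chemical={
        "empiricalFormula": "C6 H6",
        "inchi": "InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H",
    },
)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Keeping the generic fields in plain Dublin Core means a general-purpose harvester can still read title, authors and date, while chemistry-aware services can use the extra fields.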

10 Federation interoperability & linking services Roll-out in 2 phases led by University of Southampton Establish Federation policies, application profile, mappings Bi-directional links with derived articles in publisher repositories, IUCr, Royal Society of Chemistry (RSC), Chemistry Central: scholarly knowledge cycle StOReLink project - Test linking options: StORe middleware and CLADDIER OAI-ORE Testbed eChemistry project

11 Validation and Reproducibility We need to: Provide accurate data and information that will allow an experiment to be reproduced Record the provenance of a dataset Provide an audit trail from workflow capture Relate components of a dataset to steps in the workflow Share the workflow and record of an experiment Provide automated approaches to validation

12 Laboratory practice & workflow Community standard CIF Mixed lab practice – central service facility versus single staff crystallographer in department Achieve end-to-end workflow Lack of integration with LIMS Instrument manufacturers with proprietary formats Repository Lite for smaller lab operations? X-ray diffractometers

13 Crystallographic schema underpins CIF (Crystallographic Information Framework), but is limited to data parameters e.g. cell_length_a Semantic issues
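A short sketch shows what "limited to data parameters" means in practice. In a CIF file the data names carry a leading underscore (for example `_cell_length_a`), and the values are numbers, often with a standard uncertainty in parentheses. The snippet and its values below are invented for illustration, and the hand-rolled parser is an assumption that only handles simple name/value lines; a real CIF library would be needed for loops and multi-line values.

```python
import re

# Invented example values, not taken from any eCrystals entry.
CIF_SNIPPET = """\
data_example
_cell_length_a    10.123(4)
_cell_length_b    11.456(5)
_cell_length_c    12.789(6)
_cell_angle_alpha 90
"""

# A number with an optional standard uncertainty, e.g. 10.123(4).
NUMERIC = re.compile(r"^(-?\d+(?:\.\d+)?)(?:\((\d+)\))?$")


def read_cell_items(cif_text):
    """Return the _cell_* items of a simple CIF block as floats (uncertainty dropped)."""
    items = {}
    for line in cif_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].startswith("_cell_"):
            match = NUMERIC.match(parts[1])
            if match:
                items[parts[0]] = float(match.group(1))
    return items


cell = read_cell_items(CIF_SNIPPET)
print(cell)
```

The parsed items describe the measurement precisely, yet nothing in them says what kind of compound this is or why it matters, which is the semantic gap the slide points to.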

14 IUCr Acta Cryst 1992 Limited set of keywords describing methods, properties & applications, compounds, attributes No established crystallography dictionary or controlled vocabulary to give chemistry context

15 Federation=Repository 2.0? Facilitate interaction and participation beyond conventional disciplinary boundaries Multi-disciplinary search and browse functionality Support tags, terms, comments, ratings….. Automatic tag / term validation & enhancement Develop domain semantics / vocabulary Use domain-specific authority files Facilitate and improve automated indexing Link data to all associated digital objects / people Apply across a heterogeneous Federation Mine to discover and innovate rather than (just) find
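One way to picture automatic tag validation against an authority file is sketched below. Everything in it is an assumption made for illustration: the authority entries, the variant spellings and the function names are hypothetical, and a real Federation would draw on a curated domain vocabulary rather than a hard-coded dictionary.

```python
# Hypothetical authority file: preferred term -> known variant spellings.
AUTHORITY = {
    "x-ray diffraction": {"xrd", "x ray diffraction", "xray diffraction"},
    "single crystal": {"single-crystal", "singlecrystal"},
}


def _normalise(tag):
    """Lower-case a tag and collapse surrounding and repeated whitespace."""
    return " ".join(tag.strip().lower().split())


def build_lookup(authority):
    """Map every normalised preferred term and variant to its preferred term."""
    lookup = {}
    for preferred, variants in authority.items():
        lookup[_normalise(preferred)] = preferred
        for variant in variants:
            lookup[_normalise(variant)] = preferred
    return lookup


def validate_tags(tags, authority=AUTHORITY):
    """Split user tags into accepted preferred terms and unrecognised tags."""
    lookup = build_lookup(authority)
    accepted, unknown = [], []
    for tag in tags:
        preferred = lookup.get(_normalise(tag))
        if preferred is None:
            unknown.append(tag)  # left for a curator to review
        elif preferred not in accepted:
            accepted.append(preferred)  # de-duplicate variants of one term
    return accepted, unknown


accepted, unknown = validate_tags(["XRD", " Single-Crystal ", "x-ray  diffraction", "shiny"])
print(accepted, unknown)
```

Returning the unrecognised tags instead of discarding them keeps informal tagging open while giving curators a queue of candidate terms for the vocabulary.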

16 Challenges? How are tags, terms, comments, ratings assigned? Informal tags and/or structured KOS? How is a vocabulary curated and maintained? Can a vocabulary be transformed into a (Semantic Web related understanding) ontology? Disambiguation, acronyms, IUPAC names Persistent identification for data citation Granularity of data citation: dataset or value? Advocacy: becoming part of the lab culture

17 eCrystals Curation & Preservation Study Working with the Digital Curation Centre Examined four main areas
1. Audit and certification (TRAC, DRAMBORA, NESTOR, ISO International repository audit and certification BOF Group)
2. The Open Archival Information System (OAIS) and Representation Information (RI)
3. eBank-UK application profile and preservation metadata
4. ePrints.org repository platform
Recommendations http://www.ukoln.ac.uk/projects/ebank-uk/curation/eBank3-WP4-Report%20(Revised).pdf

18 eCrystals Federation: Preservation & sustainability Recommendations Data repositories Use DRAMBORA Interactive for self-assessment Add PREMIS preservation metadata Collect eCrystals representation information Examine repository platform conformance to OAIS Reference Model Survey partner preservation policies Digital Curation Centre partnership

19 Dealing with Data Report DataSets Mapping and Gap Analysis (UK) Data Curation & Preservation Strategy (UK) Data Audit Framework (HE Institutions) Institutional Data Management, Preservation & Sharing Policy Data Management & Sharing Policy (Funders) Data Management Plan (Projects) Rec 5 : Data Networking Forum (People) linked to RIN Framework Principle 1

20 Inaugural Research Data Forum 19-20th March 2008 in Manchester Joint DCC – RIN event Data centre managers, IR managers, funders & policy makers Aims & Objectives: – Improve data acquisition, management, analysis, validation, archiving and dissemination – Increase awareness of national & international data policies and standards – Facilitate co-operation between organisations and individuals – Exchange experience and best practice Next meeting in autumn tbc

21 Heard at the Forum…. "protected by PDF" "Rembrandt in the attic" "Don't forget the researcher!" "stuff isn't getting done" "demand outstrips supply…" "careers developed more by luck than judgement" "Data managers as failed scientists" "need to sit down and write the manual" "teeth and sticks and carrots" "professionalising data management" "Data is not just about eScience/eResearch" "we need services not projects!"

22 Heard at the Forum…. "protected by PDF" "Rembrandt in the attic" "Don't forget the researcher!" "stuff isn't getting done" "demand outstrips supply…" "careers developed more by luck than judgement" "Data managers as failed scientists" "need to sit down and write the manual" "teeth and sticks and carrots" "professionalising data management" "Data is not just about eScience/eResearch" "we need services not projects!" OR2008???

23 Federation Questions? Slides will be available at: http://wiki.ecrystals.chem.soton.ac.uk/index.php http://www.ukoln.ac.uk/ukoln/staff/e.j.lyon/presentations.html This work is licensed under a Creative Commons Licence Attribution-ShareAlike 3.0 http://creativecommons.org/licenses/by-sa/3.0/

