CombeDay 2005: Making Data Openly Available. Simon Coles.

Presentation transcript:

CombeDay Making Data Openly Available Simon Coles

Data Overload!

CombeChem: eScience testbed
[Diagram labels: Properties; X-Ray e-Lab; Analysis; Properties e-Lab; Simulation; Video; Diffractometer; Grid Middleware; Structures Database]

Chemistry Publications
Ideas and interpretations
Hooks into the literature
Results & derived data
Raw data!

[Diagram labels as extracted; repeated labels are listed once]
Learning & Teaching workflows
Research & e-Science workflows
Aggregator services: eBank UK
Repositories: institutional, e-prints, subject, data, learning objects
Data curation: databases & databanks
Institutional presentation services: portals, Learning Management Systems, u/g, p/g courses, modules
Presentation services: subject, media-specific, data, commercial portals
Data creation / capture / gathering: laboratory experiments, Grids, fieldwork, surveys, media
Data analysis, transformation, mining, modelling
Peer-reviewed publications: journals, conference proceedings
Learning object creation, re-use
Quality assurance bodies
Arrow labels: Validation; Harvesting metadata; Resource discovery, linking, embedding; Deposit / self-archiving; Publication; Searching, harvesting, embedding; Linking

Establishing common ground…
Understand the data creation process
Terminology and definitions
 – Data
 – Metadata
 – Datafile
 – Dataset
 – Data holding
Different views
 – Digital library researchers, computer scientists, chemists
 – Generic vs specific
 – Modeller vs practitioner
Aim for a common ontology
Modelling the domain
Creating a metadata schema
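The slide lists the terms but does not define them. As a rough sketch only, the Python below shows one way the terms could nest. It assumes that a data holding groups datasets and that a dataset groups the datafiles from one workflow stage. All class names, field names and file names are hypothetical and are not taken from the eBank schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Datafile:
    """A single file produced during the experiment (hypothetical fields)."""
    name: str


@dataclass
class Dataset:
    """Assumption: the datafiles produced by one workflow stage."""
    stage: str
    datafiles: list[Datafile] = field(default_factory=list)


@dataclass
class DataHolding:
    """Assumption: every dataset belonging to one crystal structure."""
    identifier: str
    metadata: dict[str, str] = field(default_factory=dict)
    datasets: list[Dataset] = field(default_factory=list)

    def file_count(self) -> int:
        # Total number of datafiles across all datasets in this holding.
        return sum(len(ds.datafiles) for ds in self.datasets)


# Illustrative values only; none of these names come from the slides.
holding = DataHolding(
    identifier="example-holding-1",
    metadata={"title": "Example crystal structure"},
)
holding.datasets.append(
    Dataset("Collection", [Datafile("frame_001.img"), Datafile("frame_002.img")])
)
holding.datasets.append(Dataset("CIF", [Datafile("structure.cif")]))
print(holding.file_count())  # 3
```

Keeping metadata on the holding rather than on each file is a design choice of this sketch, not something the slide specifies.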

Crystallography workflow
Initialisation: mount new sample on diffractometer & set up data collection
Collection: collect data
Processing: process and correct images
Solution: solve structures
Refinement: refine structure
CIF: produce CIF (Crystallographic Information File format)
Report: generate Crystal Structure Report
Data categories: RAW DATA, DERIVED DATA, RESULTS DATA
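The workflow above can be sketched as an ordered list of stages, each tagged with a data category. The stage names and their order come from the slide. The assignment of each stage to raw, derived or results data is an assumption made for illustration, because the slide does not state it. The helper names are hypothetical.

```python
from enum import Enum


class DataCategory(Enum):
    RAW = "raw data"
    DERIVED = "derived data"
    RESULTS = "results data"


# Stage order is from the slide; the category per stage is an ASSUMPTION.
WORKFLOW = [
    ("Initialisation", DataCategory.RAW),
    ("Collection", DataCategory.RAW),
    ("Processing", DataCategory.DERIVED),
    ("Solution", DataCategory.DERIVED),
    ("Refinement", DataCategory.DERIVED),
    ("CIF", DataCategory.RESULTS),
    ("Report", DataCategory.RESULTS),
]


def stages_in(category):
    """Return the stage names assigned to one data category, in workflow order."""
    return [name for name, cat in WORKFLOW if cat is category]


def next_stage(name):
    """Return the stage that follows `name`, or None after the last stage."""
    names = [stage for stage, _ in WORKFLOW]
    index = names.index(name)  # raises ValueError for an unknown stage
    return names[index + 1] if index + 1 < len(names) else None


print(stages_in(DataCategory.DERIVED))
print(next_stage("Refinement"))
```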

Deposition into the archive

An Archive entry ecrystals.chem.soton.ac.uk

Access to the underlying data

Some metadata issues
Using simple and qualified Dublin Core
Additional chemical information in schema for harvesting, e.g. empirical formula
Schema contains International Chemical Identifier (InChI)
Specifies which ‘parts’ of a dataset are present
Links to eprints (and other published literature) derived from the data
Using vocabularies specific to crystallography
Engaging the broader scientific community to ensure different schemas are compliant and standards can emerge
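A record combining these elements might look roughly like the sketch below, built with Python's standard `xml.etree.ElementTree`. The Dublin Core and DCMI Terms namespaces are the real ones. Everything in the `ex` namespace (the record wrapper, `empiricalFormula`, `inchi`, `partPresent`) is a hypothetical stand-in, since the actual ebank_dc element names are not given in the slides. The URLs are placeholders and benzene is used only as an example compound.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"   # simple Dublin Core
DCTERMS = "http://purl.org/dc/terms/"     # qualified Dublin Core terms
EX = "http://example.org/ebank-sketch/"   # HYPOTHETICAL namespace, not ebank_dc

for prefix, uri in (("dc", DC), ("dcterms", DCTERMS), ("ex", EX)):
    ET.register_namespace(prefix, uri)


def build_record(identifier, title, formula, inchi, parts, eprint_url):
    """Build one illustrative metadata record as an ElementTree element."""
    record = ET.Element(f"{{{EX}}}record")

    def add(namespace, tag, text):
        child = ET.SubElement(record, f"{{{namespace}}}{tag}")
        child.text = text

    add(DC, "identifier", identifier)
    add(DC, "title", title)
    add(DC, "type", "CrystalStructure")
    add(EX, "empiricalFormula", formula)   # additional chemical information
    add(EX, "inchi", inchi)                # International Chemical Identifier
    for part in parts:                     # which parts of the dataset are present
        add(EX, "partPresent", part)
    add(DCTERMS, "references", eprint_url)  # link to an eprint derived from the data
    return record


record = build_record(
    identifier="https://repository.example.org/id/1",
    title="Example structure: benzene",
    formula="C6H6",
    inchi="InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H",
    parts=["raw", "derived", "results"],
    eprint_url="https://eprints.example.org/id/42",
)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```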

Data flow in eBank (model input: Andy Powell, UKOLN)
[Diagram labels as extracted]
Institutional repository: crystal structure (data holding); crystal structure report (HTML); dataset; deposit
Dataset: ebank_dc record (XML); dc:identifier; dcterms:references; dc:type=“CrystalStructure” and/or “Collection”
Eprint: oai_dc record (XML); dcterms:isReferencedBy; dc:type=“Eprint” and/or “Text”; eprint “jump-off” page (HTML), dc:identifier; eprint manifestation (e.g. PDF)
Harvesting: OAI-PMH ebank_dc; OAI-PMH oai_dc
Services: eBank UK aggregator service; ePrint UK aggregator service; subject service
Linking
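Two pieces of this flow can be sketched in Python: building an OAI-PMH `ListRecords` request for each metadata format, and checking the two-way link between a dataset record and an eprint record. `verb` and `metadataPrefix` are standard OAI-PMH request arguments. The base URL, the dictionary layout of the records and the function names are assumptions made for illustration. The direction of the links (dataset carries `dcterms:references`, eprint carries `dcterms:isReferencedBy`) follows the order of the labels in the diagram.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "https://repository.example.org/oai"  # hypothetical endpoint


def list_records_url(base_url, metadata_prefix):
    """Build an OAI-PMH ListRecords request URL for one metadata format."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def is_linked(dataset, eprint):
    """True when the dataset references the eprint and the eprint points back."""
    forward = eprint["identifier"] in dataset.get("references", [])
    backward = dataset["identifier"] in eprint.get("isReferencedBy", [])
    return forward and backward


# One request per export format named on the slide.
ebank_url = list_records_url(BASE_URL, "ebank_dc")
oai_dc_url = list_records_url(BASE_URL, "oai_dc")
print(ebank_url)
print(oai_dc_url)

# Records reduced to plain dictionaries; a real harvester would parse XML.
dataset = {
    "identifier": "https://repository.example.org/id/1",
    "type": "CrystalStructure",
    "references": ["https://eprints.example.org/id/42"],
}
eprint = {
    "identifier": "https://eprints.example.org/id/42",
    "type": "Eprint",
    "isReferencedBy": ["https://repository.example.org/id/1"],
}
print(is_linked(dataset, eprint))
```

Requiring both directions before treating two records as linked is a choice of this sketch; an aggregator could equally accept a link asserted on one side only.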

Harvesting: OAIster

Linking and aggregating

Embedded in a science portal

Current situation
Version 2.0 eBank metadata schema
Pilot institutional e-data repository for harvesting (raw, derived, results data) using EPrints software
Exports records as ebank_dc and oai_dc
Validation of schema & discussion with International Union of Crystallography for final developments and wider deployment
Pilot eBank UK aggregator service
Developing search interface Version 1.0
Testing with PSIgate physical sciences portal – embedding eBank UK

What’s next?
Progress towards generic metadata schemas
Validation against other schema (CCLRC Model)
Eprints.org software: allow for more generic scientific data and schemas?
Metadata enhancement: keywords based on knowledge of keywords in related publications?
Investigate identifiers: International Chemical Identifier
Explore context sensitive linking
Full embedding into chemical and crystallographic research and publishing
e-Learning embedding and pedagogic evaluation
Feasibility study in related domains