Data and Publication Discovery Brian Matthews, Information Management Group, STFC Rutherford Appleton Laboratory CLADDIER workshop, Chilworth, Southampton,

Presentation transcript:

Data and Publication Discovery Brian Matthews, Information Management Group, STFC Rutherford Appleton Laboratory CLADDIER workshop, Chilworth, Southampton, UK 15th May 2007

Microsoft's Science 2020 Report
Modern scientific communication relies on both journals and databases. At present these are not integrated. By 2020 mutual linking will be commonplace and publications just containing peer-reviewed data will become available.

The Use Case
Joanna, at the University of Southampton, has done some work on the biology of seawater off the coast of Cornwall. As part of her analysis she needs (from a number of locations):
– Publications and data describing prior or similar work.
– Oceanic profiles of salinity and temperature from the closest cruise in time and space.
– Meteorological data to accompany both her own sampling and the oceanic data.
– Remotely sensed ocean colour imagery (to add additional information on the biota).
She will then publish a paper that cites the datasets, lodge the paper in her own institutional repository, and also deposit her datasets in one or more appropriate data repositories (e.g. both the NOCS data archive and the BODC).
The work Joanna has done is of interest in calibrating a global earth system model to compare simulations of oceanic CO2 production with the scenarios used in the model. Fred, at Reading University, needs to be able to find Joanna's paper and data either via citations or directly from publication repositories. Once he has found the paper, the data should be obtainable via the citation and the data archive. Before using Joanna's data, he checks back through the other datasets used and cited as inputs to it, because he suspects her work could be recalibrated using better quality meteorological re-analyses.

What does that need?
1. Joanna's own data acquisition
2. Location and acquisition of prior publications and data
3. Location and acquisition of remote datasets required as part of the analysis
4. Creation of personal metadata for new data
5. Data analysis and paper writing
6. Citation of remote papers and datasets
7. Paper submission to a journal and acceptance
8. Repository submission of paper (maybe a preprint)
9. Repository submission of data
10. Further metadata creation for the data (at the data repository)
11. Further metadata creation for the publication (at the institutional repository)
12. Linking between institutional repositories and the data held at the discipline repository
13. All the datasets and publications cited need to be annotated with the citation information
1. Discovery of Joanna's work by Fred (either from Joanna's publication or datasets or citations thereof)
2. Acquisition of all the relevant publications and datasets by Fred
3. Analysis and publication by Fred (and all the same steps from 5 as required by Joanna)
4. External adjudicators need to be able to find and acquire citation information

So what services do we need?
In order to achieve this scenario we need to provide a set of key services:
Publishing of data
Browsing and searching
– across different repositories
– across data and publications
Cross-citation of data and publications
– forward and backward citation
– need to maintain currency of citation links

Browsing and Searching
Browsing and searching
– across different repositories
– across data and publications
CLADDIER has provided a harvesting and search tool to support cross-repository searching
Uses OAI-PMH, a conventional approach
– Simple – but it works!
– Simple keyword searching
– Three participating repositories in the pilot
– BADC, STFC ePubs, e-Prints Soton
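As a rough illustration of the harvest-then-search approach described on this slide, the following minimal Python sketch builds an OAI-PMH `ListRecords` request, parses Dublin Core titles out of a response, and indexes them by keyword. It is not the CLADDIER tool: the function names, the sample response, and the `example.org` identifier are assumptions made for the example, and only the OAI-PMH verb, the `oai_dc` metadata prefix, and the XML namespaces come from the OAI-PMH and Dublin Core specifications.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from urllib.parse import urlencode

# Namespaces defined by OAI-PMH 2.0 and Dublin Core (Clark notation).
OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical, hand-written response used so the sketch runs offline.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Ocean salinity profiles</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the harvesting request URL for one repository."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def parse_records(xml_text, repository):
    """Extract identifier and title for each record in a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter(OAI + "record"):
        identifier = record.findtext(OAI + "header/" + OAI + "identifier")
        title = record.findtext(".//" + DC + "title") or ""
        records.append({"repository": repository, "id": identifier, "title": title})
    return records


def build_index(records):
    """Map each lower-cased title word to the positions of matching records."""
    index = defaultdict(set)
    for position, record in enumerate(records):
        for word in record["title"].lower().split():
            index[word.strip(".,;:()")].add(position)
    return index


def search(index, records, keyword):
    """Return every harvested record whose title contains the keyword."""
    return [records[i] for i in sorted(index.get(keyword.lower(), ()))]
```

Records harvested from several repositories can be concatenated into one list before indexing, which gives a single keyword search across data and publication repositories. A real harvester would also need to follow OAI-PMH resumption tokens and handle deleted records, which this sketch leaves out.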

Adding cross-citation
The Discovery Service gives a broad-brush search
Gives you both publications and data sets
– which are indexed on a keyword
A "Google across repositories"
Currently, it cannot tell whether the data and publications are actually related
– what data and publications inspire a piece of work (generating a new data set)
– what publications arise from a data set
We need to exploit the concept of citation to see whether data and publications are actually related

Traditional Citations

Cross-citation

Adding Citations to the Metadata Model
Adding citations has been considered in standard metadata models.
e.g. Scholarly Works Application Profile
– JISC-funded initiative
– Dublin Core Application Profile
– Describing scholarly publications (ePrints)
– Based on the FRBR model
– Does consider citations
– But breaks citations up into small components
– This is highly labour-intensive to enter
– Does not have a notion of back citation

FRBR Model

ePubs and Cross-Citations
STFC ePubs has a metadata model based on FRBR
Need to extend this to support cross-citation
Keep it simple
Can support forward and back links
Have developed a simple model for citations

Citation Model
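The slide's diagram is not part of this transcript, so the following Python sketch is only one possible reading of a simple citation model with forward and back links, not the ePubs model itself. The class names, field names, and the identifiers in the usage note are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """A publication or dataset held in some repository (hypothetical fields)."""
    identifier: str
    kind: str                      # "publication" or "dataset"
    repository: str
    cites: set = field(default_factory=set)      # forward links
    cited_by: set = field(default_factory=set)   # back links


class CitationGraph:
    """Keeps forward and back citation links in step with each other."""

    def __init__(self):
        self.items = {}

    def add(self, item):
        self.items[item.identifier] = item
        return item

    def cite(self, citing_id, cited_id):
        """Record one citation as a forward link and its matching back link."""
        self.items[citing_id].cites.add(cited_id)
        self.items[cited_id].cited_by.add(citing_id)

    def publications_from(self, dataset_id):
        """Back links: which publications arise from a dataset."""
        cited_by = self.items[dataset_id].cited_by
        return sorted(i for i in cited_by if self.items[i].kind == "publication")

    def inputs_to(self, item_id):
        """Forward links, followed transitively: everything an item builds on."""
        seen, stack = set(), [item_id]
        while stack:
            for cited in self.items[stack.pop()].cites:
                if cited not in seen:
                    seen.add(cited)
                    stack.append(cited)
        return seen
```

In terms of the use case, if Joanna's paper cites her dataset and that dataset cites a meteorological dataset, `inputs_to` on the paper returns both datasets, which is the check Fred makes before reusing her data. Storing each citation as a single whole link, rather than as many small components, reflects the "keep it simple" aim stated on the previous slide.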

Maintaining Links
Ideally, the archives holding the datasets and publications would be notified that a paper citing them had been submitted. Metadata associated with those records would be updated to reflect the citations. The metadata in the publication repository should also link to the data in the data archives and vice versa. It would be great if this notification could be done automatically.

Notification Services
To support this, we need to provide a notification service.
Federated repositories register with the service
Repositories notify the service of citations
The service informs (via broadcasting or targeting) repositories of citations
The service provides sufficient information to update metadata
Still under development. Note blogging software.
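The register, notify, and inform steps listed above can be sketched in Python as follows. The slide describes the service as still under development, so this is purely illustrative: the class, its methods, the `CitationNotice` fields, and the in-process callbacks standing in for network messages are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CitationNotice:
    """Enough information for a repository to update its metadata (hypothetical)."""
    citing_id: str
    citing_repository: str
    cited_id: str
    cited_repository: str


class NotificationService:
    def __init__(self):
        self._callbacks = {}  # repository name -> function called with a notice

    def register(self, repository, callback):
        """A federated repository registers how it wants to be informed."""
        self._callbacks[repository] = callback

    def notify(self, notice, broadcast=False):
        """Pass a citation on to one repository (targeting) or to all (broadcasting)."""
        if broadcast:
            targets = list(self._callbacks)
        elif notice.cited_repository in self._callbacks:
            targets = [notice.cited_repository]
        else:
            targets = []  # the cited repository has not registered
        for name in targets:
            self._callbacks[name](notice)
        return targets
```

A repository could register a callback that appends the citing identifier to the cited record's back links, so the metadata in both repositories stays in step without manual entry.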

Conclusions
The Use Case supports the scientific process with repositories
This requires the cross-linking network of information objects
Which needs to be stored, maintained and searched
Tools and ideas relatively straightforward
Lots of gluing of existing components
Keep it simple – so it will get used