ORNL DAAC Experience With Digital Object Identifiers (DOIs) Bruce Wilson, ORNL DAAC Manager for NASA Data Center Managers telecon 22 Feb 2010.



Acknowledgements and Sources
- Bob Cook, ORNL DAAC Scientist
- DataONE Core CI team, particularly Matt Jones (UCSB) and Dave Vieglais (U Kansas)
- ESIP Product & Stewardship, particularly Ruth Duerr (NSIDC) and Bob Downs (SEDAC)
- Note: ORNL’s CDIAC has started assigning DOIs for all of their finalized data sets.

ORNL DAAC Citation Policy
- Citation is in the name of the investigators
- Example (with DOI):
  Turner, D.P., W.D. Ritts, and M. Gregory BigFoot NPP Surfaces for North and South American Sites, Data set. Available from Oak Ridge National Laboratory Distributed Active Archive Center, Oak Ridge, Tennessee, U.S.A. [ doi: /ORNLDAAC/750

What Problem Are We Addressing?
- ORNL DAAC has used data citations for many years
  - Track use of data in literature (impact)
  - Provide credit to investigators
  - Create incentives for publishing and sharing data
- Some journal editors rejected URL citations
  - Regarded as transient (very valid concern)
- Some scientists didn’t see data as “publication”
  - We want data sets listed on CVs
  - Strong way to measure impact of data set for tenure

What Is a DOI?
- Technically, it’s a particular Handle implementation
  - Limited number of registrars
  - Each publisher gets a prefix (e.g )
  - Publisher assigns an identifier after the prefix
- Publisher registers the DOI with a URL and metadata
  - Endpoint URL can be updated as systems evolve
  - Registration can include back-links (documents cited)
  - Enables citation chain
  - Can help establish dependence of data sets (future use)
- DOI resolves at use time to current endpoint URL
  - doi: /ORNLDAAC/945
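
The prefix/suffix structure and use-time resolution described on this slide can be illustrated with a minimal sketch. It assumes the public proxy at https://doi.org/ as the resolver and uses a placeholder DOI (10.5555/EXAMPLE/945) that is not an ORNL DAAC identifier; the names `Doi`, `parse_doi`, and `resolver_url` are hypothetical helpers for illustration only.

```python
from typing import NamedTuple

# Assumption: the public DOI proxy is used as the resolver.
RESOLVER = "https://doi.org/"


class Doi(NamedTuple):
    prefix: str  # assigned to the publisher through the registrar
    suffix: str  # chosen by the publisher; may itself contain "/"


def parse_doi(text: str) -> Doi:
    """Split 'doi:PREFIX/SUFFIX' at the first slash."""
    value = text.strip()
    if value.lower().startswith("doi:"):
        value = value[4:].strip()
    prefix, sep, suffix = value.partition("/")
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a DOI: {text!r}")
    return Doi(prefix, suffix)


def resolver_url(doi: Doi) -> str:
    """Build the link a reader follows; the resolver redirects it to the
    endpoint URL currently registered for the DOI."""
    return f"{RESOLVER}{doi.prefix}/{doi.suffix}"


# Placeholder identifier, not a registered ORNL DAAC DOI.
example = parse_doi("doi: 10.5555/EXAMPLE/945")
print(example.prefix)
print(resolver_url(example))
```

Because the citation carries only the DOI, the publisher can change the registered endpoint URL later without invalidating citations that are already in print.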

ORNL Experience
- Working with CrossRef as a registrar
  - $500/year membership fee, ~$250 to register 900 DOIs
- Our DOIs resolve to a web page about the dataset
- Very positive reaction from investigators
- Makes usage metrics somewhat easier
- Haven’t implemented backlinking yet, but should
- It’s a social contract that we don’t change the data
  - Updated dataset ==> new DOI (if “significant”)
  - Minor updates (spelling corrections, clarifications) OK
  - Adding a new data format file is harder to decide
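
As a rough worked example of the cost figures on this slide, the sketch below computes the per-DOI cost from the numbers quoted ($500/year membership, about $250 for 900 DOIs). It assumes registration cost scales linearly with the number of DOIs, which the slide does not state; the one-million granule count is a purely hypothetical figure for scale.

```python
MEMBERSHIP_PER_YEAR = 500.0  # USD, from the slide
REGISTRATION_TOTAL = 250.0   # USD, approximate, from the slide
DOIS_REGISTERED = 900        # from the slide


def registration_cost(n_dois: int) -> float:
    """Registration cost for n_dois, ASSUMING linear scaling of the
    slide's ~$250 per 900 DOIs."""
    return REGISTRATION_TOTAL / DOIS_REGISTERED * n_dois


def first_year_cost_per_doi() -> float:
    """Membership plus registration, spread over the DOIs registered."""
    return (MEMBERSHIP_PER_YEAR + REGISTRATION_TOTAL) / DOIS_REGISTERED


print(round(registration_cost(1), 2))        # 0.28
print(round(first_year_cost_per_doi(), 2))   # 0.83

# Hypothetical granule count, for scale only.
print(round(registration_cost(1_000_000)))   # 277778
```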

Different types of update operations
- Correct reference or spelling in documentation
  - No change in DOI, but still should show provenance
- Augment documentation for clarity
  - No change in DOI, but still should show provenance
- Add copy of data in new format
  - Probably no change in DOI, but still should show provenance
- Correct error in data
  - New DOI; show provenance
- Append new data
  - New DOI; show provenance
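
The update rules on this slide amount to a small decision table, sketched below. This is an illustration under stated assumptions, not the ORNL DAAC's actual tooling: `Update`, `NEEDS_NEW_DOI`, `apply_update`, and the `mint_doi` callable are hypothetical names, the DOIs are placeholders, and the "probably no change" case for a new format copy is encoded as "no new DOI".

```python
from enum import Enum
from typing import Callable, List, Tuple


class Update(Enum):
    DOC_CORRECTION = "correct reference or spelling in documentation"
    DOC_AUGMENTATION = "augment documentation for clarity"
    NEW_FORMAT_COPY = "add copy of data in new format"
    DATA_CORRECTION = "correct error in data"
    DATA_APPEND = "append new data"


# Decision table taken from the slide.
NEEDS_NEW_DOI = {
    Update.DOC_CORRECTION: False,
    Update.DOC_AUGMENTATION: False,
    Update.NEW_FORMAT_COPY: False,  # "probably" on the slide; a judgment call
    Update.DATA_CORRECTION: True,
    Update.DATA_APPEND: True,
}

ProvenanceLog = List[Tuple[str, str, str]]


def apply_update(current_doi: str, update: Update,
                 mint_doi: Callable[[], str],
                 provenance: ProvenanceLog) -> str:
    """Return the DOI to cite after the update. Provenance is recorded
    in every case, whether or not a new DOI is minted."""
    new_doi = mint_doi() if NEEDS_NEW_DOI[update] else current_doi
    provenance.append((current_doi, update.value, new_doi))
    return new_doi


log: ProvenanceLog = []
doi = "10.5555/EXAMPLE/1"  # placeholder DOI
doi = apply_update(doi, Update.DOC_CORRECTION, lambda: "10.5555/EXAMPLE/2", log)
doi = apply_update(doi, Update.DATA_APPEND, lambda: "10.5555/EXAMPLE/2", log)
print(doi)       # 10.5555/EXAMPLE/2
print(len(log))  # 2
```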

DOIs work well for some things
- Finalized datasets (ones that don’t change)
- Datasets that change occasionally
  - Global Fire emission dataset updated annually
- Documents (best practices, product documentation)
- Could work for Remote Sensing at the product level

DOIs less appropriate for other things
- Cost (primarily) prohibits assigning for granules
  - Unique IDs needed, but may be data center-internal
  - DOIs are a publishing standard, adapting for data ID
- Dynamically generated and stream data
  - One DOI per MODIS product probably makes sense
  - Desirable to be able to reproduce data, but hard
  - MODIS subsetter (particularly considering reprocessed granules)
  - Would have to have a separate identifier for each request
  - Other processing tools, like OGC web services
  - Possibly use data citations with workflow provenance
  - Partition data citation from data reproducibility

Good citations help assess dependence
- Synthesis is increasingly important science
  - Are all of the data used in the study independent?
- Example: Luyssaert et al Net Primary Productivity (NPP)
  - Data at ORNL DAAC (doi: /ORNLDAAC/949)
  - Article at doi: /j x
  - Drawn from many sources (very well documented)
  - ftp://ftp.daac.ornl.gov/data/global_vegetation/forest_carbon_flux/comp /appendix_a_database_sources.pdf
- Future work using Luyssaert dataset can’t compare it to any of the underlying data
- Also an issue in cal-val for remote sensing
  - What data was used for this RS product?
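
The dependence question on this slide can be framed as a lineage check over the citation chain: two data sets are not independent if one is derived from the other or if they share an upstream source. The sketch below assumes the "derived from" links are available as a simple mapping, which is the kind of information back-links could supply; the data set names are hypothetical and the functions are illustrative only.

```python
from typing import Dict, List, Set


def upstream_sources(dataset: str, derived_from: Dict[str, List[str]]) -> Set[str]:
    """All data sets that `dataset` depends on, directly or indirectly."""
    seen: Set[str] = set()
    stack = [dataset]
    while stack:
        for parent in derived_from.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def are_independent(a: str, b: str, derived_from: Dict[str, List[str]]) -> bool:
    """True when neither data set derives from the other and they
    share no upstream source."""
    lineage_a = upstream_sources(a, derived_from) | {a}
    lineage_b = upstream_sources(b, derived_from) | {b}
    return lineage_a.isdisjoint(lineage_b)


# Hypothetical citation chain: a synthesis drawn from two site data sets.
derived_from = {"synthesis": ["site_a", "site_b"]}

print(are_independent("synthesis", "site_a", derived_from))  # False
print(are_independent("synthesis", "site_c", derived_from))  # True
```

In this framing, comparing the synthesis against `site_a` is not an independent check because `site_a` is already part of the synthesis.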

Data Identifiers are evolving
- DataCite.org (German Library + others, including CDL)
  - Particularly focused on research data
- Life Science Identifiers (LSID)
  - Heavily used in oceans community
  - Some concerns about URN versus URI
  - See
- Globally Unique Identifiers (GUIDs)
  - Need some type of resolution mechanism
- Big challenge to support something “forever”

Impact Metrics
- “Cited” means formal citation in reference list
- “Referred” means the data was acknowledged somewhere in the body of the paper