Creating Citable Data Identifiers Ryan Scherle Mark Diggory.



 Mimosa house  807 South Virginia Dare Trail  Kill Devil Hills, NC USA  27948

  N, W

 S84-A41  WP0ZZZ99ZTS392124

Loxosceles reclusa

Citing identifiers  Mimosa house  807 South Virginia Dare Trail    Loxosceles reclusa  N, W  S84-A41  WP0ZZZ99ZTS392124

Identifiers matter
- Some identifiers are machine-friendly, some are human-friendly
- For citations, you need to strike a balance
- Good identifiers are a critical selling point for a repository

Principles of citable identifiers

1. Use DOIs
- Scientists are familiar with DOIs
- DOIs are supported by many tools and services
Current support (EPrints, DSpace, Fedora): No; With work

2. Keep identifiers simple
- Complex identifiers are fine for machines, but they’re bad for humans.
- Despite best intentions, humans sometimes need to work with identifiers manually.
Current support (EPrints, DSpace, Fedora): Yes

3. Use syntax to illustrate relationships
- Adding a tiny bit of semantics to an identifier is incredibly useful
- Useful for various human “hacks”
- Useful for statistics
Current support (EPrints, DSpace, Fedora): No; With work
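
A minimal sketch of principle 3, assuming a hypothetical suffix convention that is not taken from the slides: `<prefix>/<package>` names a data package, an optional `/<n>` names file n inside it, and an optional trailing `.<v>` names version v. The prefix `10.1234` and the name `repo.abc12` are made-up examples. The point is only that a little syntax lets a person or a script recover the parent package and aggregate statistics without a database lookup.

```python
import re
from collections import Counter
from typing import Iterable, NamedTuple, Optional

# Hypothetical identifier convention (an assumption for illustration only):
#   10.1234/repo.abc12        a data package
#   10.1234/repo.abc12/3      file 3 inside that package
#   10.1234/repo.abc12/3.2    version 2 of that file
_ID = re.compile(
    r"^(?P<prefix>10\.\d+)/(?P<package>[a-z]+\.[a-z0-9]+)"
    r"(?:/(?P<file>\d+))?(?:\.(?P<version>\d+))?$"
)


class ParsedId(NamedTuple):
    prefix: str
    package: str
    file: Optional[int]
    version: Optional[int]


def parse_identifier(identifier: str) -> ParsedId:
    """Split an identifier into the parts encoded by its syntax."""
    m = _ID.match(identifier)
    if m is None:
        raise ValueError(f"not in the assumed format: {identifier!r}")
    file_no, version = m.group("file"), m.group("version")
    return ParsedId(
        m.group("prefix"),
        m.group("package"),
        int(file_no) if file_no else None,
        int(version) if version else None,
    )


def package_identifier(identifier: str) -> str:
    """The human 'hack': strip file and version parts to reach the package."""
    p = parse_identifier(identifier)
    return f"{p.prefix}/{p.package}"


def hits_per_package(identifiers: Iterable[str]) -> Counter:
    """Roll file-level and version-level hits up to their package."""
    return Counter(package_identifier(i) for i in identifiers)
```

With this convention, a log containing `10.1234/repo.abc12/3.2` and `10.1234/repo.abc12.2` counts as two hits on the package `10.1234/repo.abc12`.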

4. When “meaning-bearing” content changes, create a versioned identifier
- Scientists want data to be invariant to enable reuse by machines
- Even a single bit makes a difference
- Watch out for implicit abstractions…
- What about DOI conventions?

5. When “meaningless” content changes, retain the current identifier
- Descriptive metadata must be editable without creating a new identifier.
- Humans rarely care about metadata changes, especially for citation purposes!
- Caveat: machine-oriented systems may consider the “metadata” to be data, which requires identifier changes
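
Principles 4 and 5 together amount to a simple decision rule, sketched below under stated assumptions: the data bytes are treated as the “meaning-bearing” content, a SHA-256 checksum detects any changed bit, and a new version is expressed by appending a hypothetical `.<n>` suffix. The `Record` class, the `update` function, and the suffix format are illustrative names, not part of any repository platform.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Record:
    base_id: str                     # hypothetical identifier without a version suffix
    metadata: Dict[str, str]
    checksums: List[str] = field(default_factory=list)  # one SHA-256 per data version

    @property
    def identifier(self) -> str:
        # The first version keeps the bare identifier; later ones get ".2", ".3", ...
        n = len(self.checksums)
        return self.base_id if n <= 1 else f"{self.base_id}.{n}"


def update(record: Record, data: bytes, metadata: Dict[str, str]) -> str:
    """Apply an edit and return the identifier that should now be cited."""
    digest = hashlib.sha256(data).hexdigest()
    # Principle 4: any change to the data bits creates a new version.
    if not record.checksums or record.checksums[-1] != digest:
        record.checksums.append(digest)
    # Principle 5: metadata is overwritten in place and never affects the identifier.
    record.metadata = dict(metadata)
    return record.identifier
```

A system that treats metadata as data (the caveat above) would include the metadata in the checksum instead of overwriting it silently.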

Current versioning support
- EPrints: Support for flexible versioning/relationships, but no support for expressing these relationships in identifiers.
- DSpace: None.
- Fedora: Implicit versioning of all data and metadata. This is highly useful, but it is too granular for citation purposes.

Principles of citable identifiers
1. Use DOIs
2. Keep identifiers simple
3. Use syntax to illustrate relationships
4. When “meaning-bearing” content changes, create a versioned identifier
5. When “meaningless” content changes, retain the current identifier

Hacking DSpace to support…
- DOI identifier registration
- Semantics in identifiers
- Citation publication
- Versioning

DSpace identifier services
- Handle system independence
- More future identifier systems will come.
- Granular control
- Separate reservation from registration
- Citation
- Registration of metadata with external services
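
The design goals above can be sketched as a small provider interface. This is an illustration in Python, not the DSpace code: `IdentifierProvider`, `InMemoryProvider`, and `IdentifierService` are hypothetical names, and the in-memory provider stands in for a real call to an external registration service. It shows two ideas from the slide: the repository talks to any number of identifier systems through one interface, and reserving an identifier is a separate step from registering it with metadata.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class IdentifierProvider(ABC):
    """Hypothetical plug-in point: one provider per identifier system."""

    @abstractmethod
    def reserve(self, local_id: str) -> str:
        """Claim an identifier without making it public."""

    @abstractmethod
    def register(self, identifier: str, metadata: Dict[str, str]) -> None:
        """Publish the identifier and its metadata to the external service."""


class InMemoryProvider(IdentifierProvider):
    """Stand-in for a real provider; it only records what would be sent."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix
        self.reserved: List[str] = []
        self.registered: Dict[str, Dict[str, str]] = {}

    def reserve(self, local_id: str) -> str:
        identifier = f"{self.prefix}/{local_id}"
        if identifier not in self.reserved:   # reserving twice is harmless
            self.reserved.append(identifier)
        return identifier

    def register(self, identifier: str, metadata: Dict[str, str]) -> None:
        if identifier not in self.reserved:
            raise ValueError(f"{identifier} was never reserved")
        self.registered[identifier] = dict(metadata)


class IdentifierService:
    """Fans each request out to every configured identifier system."""

    def __init__(self, providers: List[IdentifierProvider]) -> None:
        self.providers = providers

    def reserve(self, local_id: str) -> List[str]:
        return [p.reserve(local_id) for p in self.providers]

    def register(self, local_id: str, metadata: Dict[str, str]) -> None:
        for p in self.providers:
            p.register(p.reserve(local_id), metadata)
```

Separating the two steps lets a submitter see and cite the identifier during submission, while registration with the external service waits until the item is approved.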

DSpace identifier services

DataCite content service

Promoting accurate citations
- Added suggested citation formats up front

Versioning
- Versioning is item “editioning”
- Creation of new versions is a “user mediated” process (submitter or reviewer)
- Versioning does not alter the original item
- Version relationships are maintained independent of the item’s metadata
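
A minimal sketch of the last two points, assuming hypothetical `Item` and `VersionHistory` classes rather than the actual DSpace data model: a new version is a deep copy that can be edited freely, the original item is left untouched, and the chain of versions is stored outside any item's metadata.

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Item:
    item_id: str
    metadata: Dict[str, str]
    files: Dict[str, bytes]


@dataclass
class VersionHistory:
    """Ordered list of item ids, kept separately from item metadata."""
    versions: List[str] = field(default_factory=list)

    def create_version(self, original: Item, new_id: str) -> Item:
        # Triggered explicitly by a submitter or reviewer, never automatically.
        if not self.versions:
            self.versions.append(original.item_id)
        new_item = Item(
            new_id,
            copy.deepcopy(original.metadata),
            copy.deepcopy(original.files),
        )
        self.versions.append(new_id)
        return new_item

    def latest(self) -> str:
        return self.versions[-1]
```

Because the history holds only item ids, editing or replacing an item's descriptive metadata cannot break the link between versions.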

Submission-based revisions

Result: Citable data versions doi: /dryad.bb7m4

Future technical directions
- Add metadata versioning under the hood; may need to rethink some of the current system
- Integrate our changes to core DSpace
- Moving these features into the core requires further discussion with the DSpace user community

How are we doing?
For 186 articles associated with Dryad deposits:
- 77% had “good” citations to the data
- 2% had “bad” citations to the data
- 21% had no data citations
Standards for data citation are still evolving. Journals have yet to agree on where to place data citations, and authors are just starting to become familiar with the concept.

What should you do now?
- Analyze how data is used and cited outside the repository
- Determine whether use is more machine-oriented or more human-oriented
- Design identifiers and identifier management to facilitate the observed uses

Thanks! Ryan Scherle Mark Diggory