Published by Joe Asher. Modified over 9 years ago.
1
Creating Citable Data Identifiers Ryan Scherle Mark Diggory
2
Mimosa house 807 South Virginia Dare Trail Kill Devil Hills, NC USA 27948
3
1903-12-17 36.019705 N, 75.668769 W
4
79330-S84-A41 WP0ZZZ99ZTS392124
5
Loxosceles reclusa
6
Citing identifiers Mimosa house 807 South Virginia Dare Trail 1903-12-17 27948 Loxosceles reclusa 36.019705 N, 75.668769 W 79330-S84-A41 WP0ZZZ99ZTS392124
7
Identifiers matter
Some identifiers are machine-friendly, some are human-friendly.
For citations, you need to strike a balance.
Good identifiers are a critical selling point for a repository.
9
http://purl.dlib.indiana.edu/iudl/lilly/slocum/LL-SLO-009276
12
Principles of citable identifiers
15
1. Use DOIs
http://dx.doi.org/10.5061/dryad.123ab
Scientists are familiar with DOIs.
DOIs are supported by many tools and services.
Current support (EPrints / DSpace / Fedora): No, With work
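People write the same DOI in several forms (bare, with a "doi:" prefix, or as a resolver URL), so a repository usually has to normalize it before it can be displayed or cited. The following is a minimal sketch of that step in Python. The function names and the simplified pattern are assumptions made for illustration, not part of any repository platform, and real DOI syntax is more permissive than this check.

```python
import re

# Assumed, simplified shape: "10.<registrant digits>/<suffix>".
# Real DOI syntax allows more than this pattern accepts.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Prefixes people commonly put in front of a bare DOI.
KNOWN_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(raw: str) -> str:
    """Strip a known prefix and return the bare DOI, or raise ValueError."""
    doi = raw.strip()
    for prefix in KNOWN_PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a DOI: {raw!r}")
    return doi


def resolver_url(raw: str, resolver: str = "http://dx.doi.org/") -> str:
    """Build a clickable URL in the form used on the slides."""
    return resolver + normalize_doi(raw)
```

For example, `resolver_url("doi:10.5061/dryad.bb7m4")` yields the same URL form shown on the slides, whichever way the DOI was originally written.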
16
2. Keep identifiers simple
http://dx.doi.org/10.5061/dryad.123ab
Complex identifiers are fine for machines, but they’re bad for humans.
Despite best intentions, humans sometimes need to work with identifiers manually.
http://dx.doi.org/10.1179/1743131X11Y.0000000009
http://dx.doi.org/10.1016/B978-0-12-220851-5.00003-4
17
2. Keep identifiers simple
http://dx.doi.org/10.5061/dryad.123ab
Complex identifiers are fine for machines, but they’re bad for humans.
Despite best intentions, humans sometimes need to work with identifiers manually.
Current support (EPrints / DSpace / Fedora): Yes
18
3. Use syntax to illustrate relationships
http://dx.doi.org/10.5061/dryad.123ab/3
Adding a tiny bit of semantics to an identifier is incredibly useful.
http://files.eprints.org/691/
http://files.eprints.org/447/
http://files.eprints.org/556/
Useful for various human “hacks”
Useful for statistics
19
3. Use syntax to illustrate relationships
http://dx.doi.org/10.5061/dryad.123ab/3
Adding a tiny bit of semantics to an identifier is incredibly useful.
Current support (EPrints / DSpace / Fedora): No, With work
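When an identifier carries a little syntax, both the human "hacks" and the statistics become simple string operations. The sketch below assumes that a trailing path segment such as "/3" names a component of the object identified by the rest of the DOI. That reading, and the names `ParsedId`, `split_identifier`, and `accesses_per_parent`, are illustrative assumptions rather than a documented convention.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Optional


class ParsedId(NamedTuple):
    parent: str               # identifier of the enclosing object
    component: Optional[str]  # trailing segment, or None if absent


def split_identifier(doi: str) -> ParsedId:
    """Split "prefix/suffix/component" into its parent and component parts."""
    prefix, _, suffix = doi.partition("/")
    parent_suffix, sep, component = suffix.partition("/")
    if not sep:
        # No trailing segment: the identifier names the parent itself.
        return ParsedId(doi, None)
    return ParsedId(f"{prefix}/{parent_suffix}", component)


def accesses_per_parent(accessed: Iterable[str]) -> Counter:
    """Roll component-level accesses up to their parent identifier."""
    return Counter(split_identifier(doi).parent for doi in accessed)
```

Trimming the last segment to find the parent is the same move a person makes by hand in a browser address bar, which is why a small amount of identifier syntax helps both audiences.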
20
4. When “meaning-bearing” content changes, create a versioned identifier
Scientists want data to be invariant to enable reuse by machines.
Even a single bit makes a difference.
Watch out for implicit abstractions…
http://dx.doi.org/10.5061/dryad.123ab/thumbnail
What about DOI conventions?
21
5. When “meaningless” content changes, retain the current identifier
Descriptive metadata must be editable without creating a new identifier.
Humans rarely care about metadata changes, especially for citation purposes!
Caveat: machine-oriented systems may consider the “metadata” to be data, which requires identifier changes.
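Principles 4 and 5 together amount to a decision rule: compare the data bits, and mint a new versioned identifier only when they differ. The sketch below illustrates that rule under stated assumptions. The `Record` class, the `apply_update` function, and the ".<n>" version suffix are hypothetical choices made for this example, not the convention of any particular repository.

```python
import hashlib
from dataclasses import dataclass, replace


def digest(content: bytes) -> str:
    """Fingerprint the data so that even a single changed bit is detected."""
    return hashlib.sha256(content).hexdigest()


@dataclass(frozen=True)
class Record:
    base_id: str
    version: int
    content_sha256: str
    metadata: dict

    @property
    def identifier(self) -> str:
        # Hypothetical convention: version 1 uses the bare identifier,
        # later versions append ".<n>".
        if self.version == 1:
            return self.base_id
        return f"{self.base_id}.{self.version}"


def apply_update(record: Record, new_content: bytes, new_metadata: dict) -> Record:
    """Return the record that results from an edit, leaving the input untouched."""
    new_digest = digest(new_content)
    if new_digest != record.content_sha256:
        # Principle 4: meaning-bearing content changed, so create a new version.
        return Record(record.base_id, record.version + 1, new_digest, dict(new_metadata))
    # Principle 5: only descriptive metadata changed, so keep the identifier.
    return replace(record, metadata=dict(new_metadata))
```

A system that treats metadata as data (the caveat in principle 5) would include the metadata in the digest, which makes every edit produce a new identifier.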
22
Current versioning support
EPrints: Support for flexible versioning/relationships, but no support for expressing these relationships in identifiers.
DSpace: None.
Fedora: Implicit versioning of all data and metadata. This is highly useful, but it is too granular for citation purposes.
23
Principles of citable identifiers
1. Use DOIs
2. Keep identifiers simple
3. Use syntax to illustrate relationships
4. When “meaning-bearing” content changes, create a versioned identifier
5. When “meaningless” content changes, retain the current identifier
24
Hacking DSpace to support…
DOI identifier registration
Semantics in identifiers
Citation publication
Versioning
25
DSpace identifier services
Handle system independence: more future identifier systems will come.
Granular control: separate reservation from registration.
Citation: registration of metadata with external services.
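Separating reservation from registration means an identifier can be assigned early (for example, so that an author can put it in a manuscript) and only made public later, along with its metadata. The in-memory sketch below illustrates that two-step lifecycle. The class `InMemoryIdentifierService`, its method names, and the placeholder prefix are assumptions for illustration and are not the DSpace API.

```python
import itertools
from enum import Enum
from typing import Dict


class State(Enum):
    RESERVED = "reserved"      # minted, but not yet announced externally
    REGISTERED = "registered"  # metadata sent to the external service


class InMemoryIdentifierService:
    """Hypothetical stand-in for a pluggable identifier provider."""

    def __init__(self, prefix: str) -> None:
        self._prefix = prefix
        self._counter = itertools.count(1)
        self._states: Dict[str, State] = {}
        self._metadata: Dict[str, dict] = {}

    def reserve(self) -> str:
        """Mint an identifier without publishing anything about it."""
        identifier = f"{self._prefix}{next(self._counter)}"
        self._states[identifier] = State.RESERVED
        return identifier

    def register(self, identifier: str, metadata: dict) -> None:
        """Publish a previously reserved identifier together with its metadata."""
        if self._states.get(identifier) is not State.RESERVED:
            raise ValueError(f"{identifier!r} is not in the reserved state")
        self._states[identifier] = State.REGISTERED
        self._metadata[identifier] = dict(metadata)

    def state(self, identifier: str) -> State:
        return self._states[identifier]
```

Because callers only see `reserve` and `register`, a different identifier system could be swapped in behind the same two calls, which is the independence the slide asks for.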
26
DSpace identifier services
27
DataCite content service
28
Promoting accurate citations Added suggested citation formats up front
29
Versioning
Versioning is item “editioning”.
Creation of new versions is a “user mediated” process (submitter or reviewer).
Versioning does not alter the original item.
Version relationships are maintained independent of the item’s metadata.
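The last two points can be pictured as a version history that lives beside the items rather than inside their metadata. The sketch below is one way to model that, assuming a new version starts as a copy of the original. `Item`, `VersionHistory`, and `create_new_version` are hypothetical names for this example and do not describe the actual DSpace data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Item:
    item_id: int
    metadata: Dict[str, str]


@dataclass
class VersionHistory:
    """Ordered list of item ids, stored separately from any item's metadata."""
    item_ids: List[int] = field(default_factory=list)

    def add(self, item: Item) -> None:
        self.item_ids.append(item.item_id)

    def version_of(self, item: Item) -> int:
        # Versions are numbered from 1 in order of creation.
        return self.item_ids.index(item.item_id) + 1

    def latest(self) -> int:
        return self.item_ids[-1]


def create_new_version(original: Item, history: VersionHistory, new_id: int) -> Item:
    """Copy the original into a new item and record it; the original is untouched."""
    new_item = Item(new_id, dict(original.metadata))
    history.add(new_item)
    return new_item
```

Keeping the relationship in `VersionHistory` means that editing or deleting a metadata field on either item cannot break the link between versions.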
30
Submission-based revisions
33
Result: Citable data versions doi:10.5061/dryad.bb7m4
34
Future technical directions
Add metadata versioning under the hood (may need to rethink some of the current system).
Integrate our changes to core DSpace.
Moving these features into the core requires further discussion with the DSpace user community.
35
How are we doing?
For 186 articles associated with Dryad deposits:
77% had “good” citations to the data
2% had “bad” citations to the data
21% had no data citations
Standards for data citation are still evolving. Journals have yet to agree on where to place data citations, and authors are just starting to become familiar with the concept.
38
What should you do now?
Analyze how data is used and cited outside the repository.
Determine whether use is more machine-oriented or more human-oriented.
Design identifiers and identifier management to facilitate the observed uses.
39
Thanks! Ryan Scherle ryan@scherle.org Mark Diggory mdiggory@atmire.com