Dryad Curation Practices August 2012
Dryad Package/File Structure
[Diagram: a data package with its data package metadata, linked to the publication/article. The package contains data files, each with its own data file metadata, one or more data bitstreams, and optionally a readme bitstream. The scholarly publication/article associated with the Dryad data package is not stored in Dryad.]
A Dryad data package is a conceptual and metadata object. It contains a summary description of all the constituent data files and creates the link with the associated publication. Each data file has a metadata description and at least one bitstream (additional bitstreams, such as readme files, are optional).
Metadata pertaining to the publication (citation, publication date, article DOI) is stored in the data package. Metadata pertaining to each file and its embargo period is stored in each file record.
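The package/file relationship described above can be sketched as a small data model. This is only an illustration, not Dryad's actual schema: the class names, the field names, and the problems() helper are hypothetical, and the rule that a file needs at least one non-readme bitstream is one reading of "at least one bitstream (additional bitstreams, such as readme files, are optional)".

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Bitstream:
    """One stored file: either data or an optional readme."""
    name: str
    is_readme: bool = False


@dataclass
class DataFile:
    """A data file record: its own metadata plus its bitstreams."""
    title: str
    embargo_type: str = "none"  # embargo is recorded per file, not per package
    bitstreams: List[Bitstream] = field(default_factory=list)

    def has_data(self) -> bool:
        # At least one bitstream must be data; readme bitstreams are extras.
        return any(not b.is_readme for b in self.bitstreams)


@dataclass
class DataPackage:
    """A conceptual/metadata object that links the files to the article."""
    title: str
    article_doi: Optional[str] = None       # publication metadata lives here
    article_citation: Optional[str] = None
    date_issued: Optional[str] = None
    files: List[DataFile] = field(default_factory=list)

    def problems(self) -> List[str]:
        """Return structural problems a curator would want to look at."""
        issues = []
        if not self.files:
            issues.append("package has no data files")
        for f in self.files:
            if not f.has_data():
                issues.append(f"file '{f.title}' has no data bitstream")
        return issues
```

For example, a package whose second file holds only a readme would be reported by problems() with one message naming that file.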
Important Curation Documents
Curation manuals –
– Notifications of new submissions, newly published articles, other assignments
– Correspond with authors using this account, send as
Curator office whiteboard
– Google doc shared with dryadassistant google account
– Includes login information for Dryad user accounts, EZID, etc.
Submission tracking spreadsheet
– Google doc shared with dryadassistant google account
Integrated journal metadata emails
– Access via
Templates for correspondence –
Integrated and Non-Integrated Journals
Non-Integrated
– No coordination between journal and Dryad (no metadata emails, journal contact addresses for reporting, etc.)
Integrated
– Metadata emails send info ahead of submission
– May use review workflow or only archive data after manuscript acceptance
– May require ‘blackout’ of Dryad submission until article publication
– Journal contacts are notified upon submission for review (if using review workflow), acceptance to blackout queue, and approval/archiving, and receive a weekly summary
The original integrated workflow is represented to the right. Some journals now also use a review workflow with additional steps or require the Dryad data package to remain hidden until after article publication (what we call “blackout”). Further integration details are available in the following presentation: yad/images/c/c6/DryadIntegrationOverview.pdf
Basic Integrated Workflow (no review)
1. Author submits manuscript to journal
2. Journal reports accepted manuscript to Dryad; Dryad creates provisional record
3. Journal invites author to submit data to Dryad & provides link to provisional record
4. Author submits data to Dryad & receives DOI
5. Dryad curator approves submission & sends DOI to author & journal
6. Dryad publishes data files with link to article; Journal adds Dryad DOI to all forms of article
Review Workflow
– Journal sends manuscript information to Dryad before manuscript acceptance and invites authors to upload data.
– Dryad submission is routed to a private review workspace, not the main curation and publication queue.
– Passkey link is sent to journal for editor/reviewer access to Dryad submission.
– Author may continue to add files while submission is in review workspace.
– Journal sends a second metadata email to Dryad with manuscript acceptance notification, triggering any associated submission to move from review to curation.
– Curator inspects and approves, queues, or rejects submission, as in basic workflow.
Review Workflow
1. Author submits manuscript to journal
2. Journal reports manuscript under review to Dryad; Dryad creates provisional record
3. Journal invites author to submit data to Dryad & provides link to provisional record
4. Author submits data to Dryad, using link sent by journal to provisional record
5. Dryad sends review passcode and DOI to author & journal
6. Upon article acceptance, journal notifies Dryad
7. Dryad publishes data files with link to article; Journal adds Dryad DOI to all forms of article
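The movement of a submission through review, curation, blackout, and archiving can be pictured as a small state machine. The sketch below is an illustration only: the stage names, the event names, and the advance() function are hypothetical and do not correspond to anything in the Dryad software.

```python
from enum import Enum


class Stage(Enum):
    REVIEW = "review workspace"
    CURATION = "curation"
    BLACKOUT = "publication blackout"
    ARCHIVED = "archived"
    REJECTED = "rejected"


# (current stage, event) -> next stage. Event names are made up for this sketch.
TRANSITIONS = {
    (Stage.REVIEW, "journal_accepts"): Stage.CURATION,  # second metadata email
    (Stage.CURATION, "approve"): Stage.ARCHIVED,
    (Stage.CURATION, "queue"): Stage.BLACKOUT,
    (Stage.CURATION, "reject"): Stage.REJECTED,
    (Stage.BLACKOUT, "approve"): Stage.ARCHIVED,        # after article publication
}


def advance(stage: Stage, event: str) -> Stage:
    """Return the next stage, or raise ValueError if the event is not allowed."""
    try:
        return TRANSITIONS[(stage, event)]
    except KeyError:
        raise ValueError(
            f"event {event!r} is not valid in stage {stage.value!r}"
        ) from None
```

A submission in the review workspace can only move on when the journal reports acceptance, so advance(Stage.REVIEW, "approve") raises an error rather than archiving the item.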
Navigation
Notifications of new tasks go to
Log in to Dryad site with and as appropriate
Dryad site left sidebar:
– My Submissions are submissions you have created
– My Tasks are submissions you can act on as a curator
– Workflow Overview provides a way to search for items before or after archiving, and to force changes in their status that aren’t always available in the interface
Overview of New Submission Processing
[Flowchart. Decision points: journal settings (integrated / non-integrated; review? blackout? metadata email? spreadsheet entry?), article status (review / accepted / published / not published), files (appropriate / not appropriate). Outcomes: reject, approve, queue, error.]
Journal Settings
See JournalSubmissionTracking spreadsheet shared in google docs. First tab (“Notes”) lists each integrated journal and its review and blackout settings.
Also search for duplicate submissions or notes in appropriate sheet.
Article Status
If integrated submission, should be indicated in metadata (not stated = accepted manuscript).
Look for article DOI or volume information in the submitted metadata.
Google search and/or visit publisher website.
Claiming Submissions
1. New submissions will be listed on the My Tasks page in the list labeled In Curation: Unclaimed
2. Choosing account with which to claim submission
A. If item is going to be approved/rejected (publication blackout is not required = integrated journal that does not require blackout OR any journal if the article has been published): claim with Dryad Assistant account
B. If item is going to publication blackout queue (integrated journals requiring publication blackout OR non-integrated journals ahead of article publication): claim with Dryad Queue account
3. Once claimed, submission will appear in In Curation: Claimed list on My Tasks page
4. Click Edit item(s) button (lower right when viewing the claimed submission) and open package and all files in tabs to inspect files and edit all metadata
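The account choice in step 2 reduces to a short decision rule. The function below is a sketch of that rule only; the function name and its three boolean parameters are hypothetical, and the real values come from the tracking spreadsheet and a check of the article's status.

```python
def claiming_account(integrated: bool, requires_blackout: bool,
                     article_published: bool) -> str:
    """Pick the account to claim a submission with, following step 2 above."""
    # Any journal: once the article is out, no blackout is needed.
    if article_published:
        return "Dryad Assistant"
    # Integrated journal that does not ask for blackout: approve/reject directly.
    if integrated and not requires_blackout:
        return "Dryad Assistant"
    # Integrated journal requiring blackout, or a non-integrated journal
    # ahead of article publication: item goes to the blackout queue.
    return "Dryad Queue"
```

For instance, a submission for a non-integrated journal whose article has not yet appeared would be claimed with the Dryad Queue account, whatever value is passed for requires_blackout.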
Inspecting Files
– Check for technical problems, corrupt files, files that won’t open in expected software, etc.
– Files should contain something that looks like data, with a very broad definition of data (supplementary figures, multimedia, etc., are ok, the manuscript itself is not).
– Look for copyright statements and licenses (not good).
– Look for identifiable human subject data (err on the side of caution).
– Look for duplicated files, data files uploaded in place of readme files, etc., and clean these up.
Rejecting Submissions
The most common reasons for rejection are inappropriate files, submissions associated with integrated journals for which we have no metadata email, and integrated submissions that should have been directed to the review workspace but where the author did not use the integrated process.
A submission might also be rejected because a journal is out of scope, but always consult a senior curator before rejecting for this reason.
When rejecting a submission, you must enter a reason. This reason will be sent to the submitter; it should be courteous and should explain clearly what the problem was and how they can fix it if they wish to resubmit. See Templates for Correspondence wiki page for common rejection explanations.
Editing Metadata
REMEMBER: some metadata, such as author names, is repeated on the package and files and will need to be edited in both places.
– Scan over all metadata to see if it looks reasonable and identify problems.
– Strip any formatting tags or mangled characters. International or special characters can often be copied and pasted from metadata or other source on the web.
– Check the journal name, especially for non-integrated journals. It should match exactly the name already in use in the repository. If it’s a new journal, ask a senior curator about establishing a new name.
– Author names should be formatted as: LastName, FirstName M. I. (remove any titles, such as “PhD”)
– Data package title should be formatted as: Data from: Article title in sentence case
– Add specialized keywords (geographic, temporal, scientific name), moving them from/to general subject keywords, as appropriate.
– Scientific names should be Latin (common names go in dc:subject instead) and should be recognized by
– Look for line breaks, especially in article abstract and file descriptions, and edit these fields as needed for clarity when the content is displayed without line breaks.
– Check for inappropriate embargoes (custom when we have no info from journal, untilArticleAppears when article is out) and adjust as needed.
– If custom embargo, add embargo period (from journal) as dryad:curatorNote in file metadata.
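Two of the formatting rules above (stripping titles from author names and prefixing the package title) are mechanical enough to sketch in code. This is an illustration under assumptions: the function names and the TITLES list are hypothetical, names are assumed to arrive already in "LastName, FirstName" order, and conversion to sentence case is left to the curator because it depends on recognizing proper nouns.

```python
# Assumed, incomplete list of titles to strip; extend as needed.
TITLES = {"phd", "dr", "prof", "professor"}


def clean_author(name: str) -> str:
    """Drop titles such as 'PhD' or 'Dr.' from a 'LastName, FirstName M.' string."""
    parts = []
    for part in name.split(","):
        # Compare each word without its periods, so 'Ph.D.' matches 'phd'.
        words = [w for w in part.split()
                 if w.replace(".", "").lower() not in TITLES]
        if words:
            parts.append(" ".join(words))
    return ", ".join(parts)


def package_title(article_title: str) -> str:
    """Return 'Data from: <article title>' with line breaks collapsed."""
    title = " ".join(article_title.split())  # removes line breaks, extra spaces
    prefix = "Data from: "
    if title.lower().startswith(prefix.lower()):
        # Avoid doubling the prefix, and normalize its capitalization.
        return prefix + title[len(prefix):]
    return prefix + title
```

With these definitions, clean_author("Smith, Jane Q., PhD") gives "Smith, Jane Q." and package_title applied to a title that already starts with "Data from: " leaves it unchanged.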
Approving/Archiving (no blackout)
1. Check for duplicates and notes in tracking spreadsheet, if you haven’t already done so.
2. Click Approve (will need to click twice if item is going through blackout first, based on settings). Email notification is sent automatically.
3. Visit Dryad homepage and find item in Recently Published list (if not there, look for it on My Tasks page or track down any error).
4. Check for duplicated package DOI and delete, if needed. If there is a duplicated package DOI the link from the homepage to the package won’t work, and you’ll need to manually modify the link to reach the package page.
5. Check that package DOI resolves correctly (there may be a delay of a few minutes). Log in to EZID and check/fix, if needed.
6. Update submission tracking spreadsheet.
Placing Submission in Publication Blackout Queue
You should have already claimed the item with Dryad Queue account, inspected files, edited metadata, and checked for duplicate submissions at this point.
1. Register package DOI in EZID. Go to Create IDs -> Advanced in order to specify your DOI. Use as the location and leave all other description blank.
2. Send acceptance email to submitter (and journal contacts, if integrated journal). Template is saved as a draft in dryadassistant gmail account. Journal contact emails should appear in package metadata and Dryad journal config file.
3. Add entry to submission tracking spreadsheet.
4. Leave task claimed in Dryad Queue account.
Updating Archived Items Once Article is Published
You have made a match between a published article and a Dryad data package that needs to be updated.
1. Check author names, article title, and article abstract against published article and update if needed.
2. Edit package dc:date.issued to match earliest (online) publication date of article (format as YYYY-MM-DD)
3. Add article DOI to package dc:relation.isreferencedby (format as doi:####)
4. Add article citation to package dc:identifier.citation or update the existing citation (for example, when an advance-access online article now has a print citation). Format as:
LastName F, LastName FM (YYYY) Article title in sentence case. Journal Name Vol(Num): page-page.
or
LastName F, LastName FM (YYYY) Article title in sentence case. Journal Name, online in advance of print.
5. Lift embargoes or set embargo end dates for each file, as appropriate. Go to the Item Embargo pane in Edit Item to work with embargoes.
6. Visit public view of package page (leave Edit Item) and verify article citation, resolvable article DOI, and updated embargoes.
7. Update submission tracking spreadsheet.
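The formats in steps 2 to 4 can be illustrated with a few helper functions. These are a sketch only: the function names are hypothetical, the DOI in the usage note is a made-up placeholder, and authors are assumed to be supplied already in "LastName FM" form.

```python
from datetime import date
from typing import List, Optional


def issued_date(d: date) -> str:
    """Format a publication date as YYYY-MM-DD for dc:date.issued."""
    return d.isoformat()


def doi_field(doi: str) -> str:
    """Format an article DOI as 'doi:####' for dc:relation.isreferencedby."""
    doi = doi.strip()
    for prefix in ("https://doi.org/", "http://dx.doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]  # drop a resolver URL or an existing prefix
            break
    return "doi:" + doi.strip()


def format_citation(authors: List[str], year: int, title: str, journal: str,
                    volume: Optional[str] = None, issue: Optional[str] = None,
                    pages: Optional[str] = None) -> str:
    """Build a citation in the two patterns shown in step 4."""
    head = f"{', '.join(authors)} ({year}) {title.rstrip('.')}. "
    if volume is None:
        # No volume yet: treat as an advance-access online article.
        return head + f"{journal}, online in advance of print."
    issue_part = f"({issue})" if issue else ""
    # Pages are assumed to be known whenever a volume is given.
    return head + f"{journal} {volume}{issue_part}: {pages}."
```

Calling doi_field("http://dx.doi.org/10.1234/example") returns "doi:10.1234/example", and format_citation with no volume produces the "online in advance of print" form.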
Approving Submission out of Publication Blackout
1. Update metadata as described in Updating Archived Items Once Article is Published. Because this submission isn’t archived yet, dates won’t have been added to the metadata by the system, so you will add the article publication date to package as dc:date.issued, instead of editing an existing value.
2. Click Approve. Find item in Publication Blackout list on My Tasks page. Claim the task and click Approve again.
3. Visit Dryad homepage and find item in Recently Published list (if not there, look for it on My Tasks page or track down any error).
4. Check for duplicated package DOI and delete, if needed. If there is a duplicated package DOI the link from the homepage to the package won’t work, and you’ll need to manually modify the link to reach the package page.
5. Update package DOI in EZID (use Lookup function) and change location to point to public item URL.
6. Update file embargoes (lift or set end date, as appropriate). Go to the Item Embargo pane in Edit Item for each file to work with embargoes.
7. Update submission tracking spreadsheet.
PACKAGE METADATA GUIDE
Field | Element | Repeatable | Required | Notes
Authors | dc:contributor.author | repeatable | required | LastName, FirstName M.
Corresponding author | dc:contributor.correspondingAuthor | not repeatable | required | LastName, FirstName M.
Spatial coverage | dc:coverage.spatial | repeatable | optional | place names, geographic coordinates, etc.
Temporal coverage | dc:coverage.temporal | repeatable | optional | intended for geologic timespans, but years and other values are accepted
Approval timestamp | dc:date.accessioned | not repeatable | required | system-generated upon submission approval
Approval timestamp | dc:date.available | not repeatable | required | system-generated upon submission approval
Article publication date | dc:date.issued | not repeatable | required | system-generated to match approval date, later edited by curator to article publication
Data package DOI | dc:identifier | not repeatable | required | doi: /dryad.####
Article citation | dc:identifier.citation | not repeatable | optional | modified PLoS citation style
Journal’s manuscript ID | dc:identifier.manuscriptNumber | not repeatable | optional | only for integrated submissions
Data package handle | dc:identifier.uri | not repeatable | required | http://hdl.handle.net/10255/dryad.####, system-generated upon submission approval
Abstract | dc:description | not repeatable | optional | article abstract
Component data file DOIs | dc:relation.haspart | repeatable | required | doi: /dryad.####/1, doi: /dryad.####/2, etc.
Article volume, issue, year | dc:relation.ispartofseries | not repeatable | optional | only present if entered by depositor during submission
Article DOI | dc:relation.isreferencedby | not repeatable | optional | doi:####
Keywords | dc:subject | repeatable | optional |
Data package title | dc:title | not repeatable | required | Data from: Article title
Record type | dc:type | not repeatable | required | system-generated, now set to “Article”
Curator note | dryad.curatorNote | repeatable | optional | rarely used
Scientific names | dwc:ScientificName | repeatable | optional | Latin taxon names
Journal name | prism:publicationName | not repeatable | required | use authorized form of name only
FILE METADATA GUIDE
Field | Element | Repeatable | Required | Notes
Authors | dc:contributor.author | repeatable | required | LastName, FirstName M.
Spatial coverage | dc:coverage.spatial | repeatable | optional | place names, geographic coordinates, etc.
Temporal coverage | dc:coverage.temporal | repeatable | optional | intended for geologic timespans, but years and other values are accepted
Approval timestamp | dc:date.accessioned | not repeatable | required | system-generated upon submission approval
Bitstream availability timestamp | dc:date.available | not repeatable | required | system-generated upon availability of bitstreams for download (will not appear if file is embargoed)
Approval date | dc:date.issued | not repeatable | required | system-generated upon submission approval
Data file DOI | dc:identifier | not repeatable | required | doi: /dryad.####/#
Data file handle | dc:identifier.uri | not repeatable | required | http://hdl.handle.net/10255/dryad.####, system-generated upon submission approval
File description | dc:description | not repeatable | optional | brief file description entered by depositor
Associated data package DOI | dc:relation.ispartof | not repeatable | required | doi: /dryad.####
Rights information | dc:rights.uri | not repeatable | required | CC0 URI for all items except a few legacy items under Original License
Keywords | dc:subject | repeatable | optional |
Data file title | dc:title | not repeatable | required |
Record type | dc:type | not repeatable | required | system-generated, now set to “Dataset”
Curator note | dryad.curatorNote | repeatable | optional | rarely used, mostly to specify custom embargo dates
Scientific names | dwc:ScientificName | repeatable | optional | Latin taxon names
Embargo end date | dc:date.embargoedUntil | not repeatable | optional | YYYY-MM-DD; will have value for embargoed items when the article has not yet been published then edited by curator to real date; not present for items that were never embargoed or after embargo has been lifted (see dc:date.available for embargo lifting timestamp)
Embargo type | dc:type.embargo | not repeatable | required | controlled list of values: none, untilArticleAppears, oneyear, custom
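The two embargo fields at the end of the file metadata guide lend themselves to a simple consistency check. The sketch below assumes a file record is available as a plain dictionary keyed by element name; that representation and the check_embargo() function are hypothetical and are not part of the Dryad interface.

```python
import re
from datetime import datetime
from typing import Dict, List

# Controlled list of values for dc:type.embargo, as given in the guide.
EMBARGO_TYPES = {"none", "untilArticleAppears", "oneyear", "custom"}


def check_embargo(record: Dict[str, str]) -> List[str]:
    """Return a list of problems with the embargo fields of one file record."""
    problems = []
    etype = record.get("dc:type.embargo")  # required field
    if etype not in EMBARGO_TYPES:
        problems.append(f"unknown embargo type: {etype!r}")

    until = record.get("dc:date.embargoedUntil")  # optional field
    if until is not None:
        # Require zero-padded YYYY-MM-DD, then confirm it is a real date.
        well_formed = re.fullmatch(r"\d{4}-\d{2}-\d{2}", until) is not None
        if well_formed:
            try:
                datetime.strptime(until, "%Y-%m-%d")
            except ValueError:
                well_formed = False
        if not well_formed:
            problems.append(f"embargo end date is not YYYY-MM-DD: {until!r}")
    return problems
```

A record with embargo type "oneyear" and end date "2013-08-01" passes with no problems, while an end date written as "08/01/2013" is reported.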