The Dryad Data Repository

Ryan Scherle 1, Hilmar Lapp 1, Amol Bapat 2, Sarah Carrier 2, Jane Greenberg 2, Peggy Schaeffer 1, Todd Vision 1,3, Hollie White 2

1 National Evolutionary Synthesis Center (NESCent), USA
2 School of Information and Library Science, University of North Carolina, USA
3 Department of Biology, University of North Carolina, USA

Summary
Dryad is a repository of data underlying scientific publications, with an initial focus on evolution, ecology, and related fields. Dryad allows investigators to validate published findings, explore new analysis methodologies, repurpose data for research questions unanticipated by the original authors, and perform synthetic studies such as formal meta-analyses. When an author publishes an article, some types of supporting data are deposited in well-known archives like GenBank and TreeBASE, but other types of data have no permanent home. Dryad provides that home.

Dryad has a modular architecture. The core module is based on the open-source DSpace repository platform, which is used by hundreds of universities and other institutions for storage of scholarly publications.

Joint Data Archiving Policy
Partner journals have agreed to jointly enact a data archiving policy. This policy will ensure that all data associated with papers in participating journals are saved in appropriate repositories. The current draft of the policy states:

[Journal] requires, as a condition for publication, that data supporting the results in the paper should be archived in an appropriate public archive, such as [list of approved archives]. Data are important products of the scientific enterprise, and they should be preserved and usable for decades in the future. Authors may elect to have the data publicly available at time of publication, or, if the technology of the archive allows, may opt to embargo access to the data for a period up to a year after publication. Exceptions may be granted at the discretion of the editor, especially for sensitive information such as human subject data or the location of endangered species.

Partner journals
A consortium of journals governs Dryad, guiding policy development and ensuring long-term sustainability.

NESCent is a collaborative effort of Duke University, The University of North Carolina at Chapel Hill, and North Carolina State University. Dryad is supported by NSF grants #EF, #DBI, and #DBI, and by IMLS grant #LG.

Submission system
Dryad's submission system is optimized for quick and easy submissions. Only a few pieces of information about a publication and dataset are required. However, users have the option to enter more detailed descriptions, making data easier for others to find and reuse (and thus more likely to receive subsequent citations).

Modifications to DSpace
The implementation of Dryad has required many changes to the core DSpace platform, including grouping of search results by publication and the ability to embargo datasets for up to one year. When these modifications meet the needs of the larger DSpace community, they are integrated into the core DSpace software.

Handshaking with specialized databases
For databases that are widely used by Dryad's audience (e.g., TreeBASE and GenBank), Dryad will work with each database to mirror data submitted to it and/or to facilitate automatic deposit of Dryad material into it.

Harvesting and searching related content
Using the OAI-PMH and OAI-ORE protocols from the Open Archives Initiative, Dryad will harvest content from related repositories, including the Knowledge Network for Biocomplexity and the Long Term Ecological Research Network. Harvested content can be searched alongside native Dryad content, providing a single place to search multiple related repositories. Dryad will also use the SRU searching standard to search content that cannot be harvested.

Machine-readable interfaces
Dryad will provide multiple interfaces through which researchers and other systems can access its content. Content can be monitored via RSS feeds, searched via the SRU searching standard, and harvested via the OAI-PMH protocol.

Basic search interface
Dryad allows data to be searched using standard publication information such as title and authors. Searches can also include more detailed information, such as taxonomic names and geological timespans.

[Figure: GenBank, TreeBASE, Dryad]

Journal integration
Partner journals forward metadata about accepted publications to Dryad. Authors can import this information, greatly reducing the time required to submit data. When a submission is complete, Dryad returns information to the journal, allowing links from article web pages to related content in Dryad.

Related projects
The HIVE project is developing tools for integrating controlled vocabularies and ontologies with repositories. HIVE will integrate with the Dryad submission system. Dryad is a member of the DataONE consortium of repositories, which is developing tools for wide-scale data sharing, mirroring, and analysis.
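
To make the OAI-PMH harvesting mentioned under the machine-readable interfaces more concrete, here is a minimal Python sketch of how a client might build a ListRecords request and read titles and creators from a Dublin Core response. The endpoint URL, the function names, and the sample response are assumptions made for illustration; they are not taken from the poster and do not describe Dryad's actual service. Only the verb, the argument names (metadataPrefix, resumptionToken), and the XML namespaces come from the OAI-PMH 2.0 and Dublin Core standards.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical endpoint, used only for illustration.
BASE_URL = "https://repository.example.org/oai/request"


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token is not None:
        # In OAI-PMH, resumptionToken is an exclusive argument:
        # it is sent with the verb and nothing else.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (records, resumption_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        identifier = record.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        titles = [t.text for t in record.iter(f"{{{DC_NS}}}title")]
        creators = [c.text for c in record.iter(f"{{{DC_NS}}}creator")]
        records.append(
            {"identifier": identifier, "titles": titles, "creators": creators}
        )
    token = root.findtext(f".//{{{OAI_NS}}}resumptionToken")
    # An empty or missing token means there are no further pages.
    return records, ((token or "").strip() or None)


# Made-up response used to exercise the parser without any network access.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:item-1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example dataset</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>
""".strip()

if __name__ == "__main__":
    print(list_records_url(BASE_URL))
    records, token = parse_list_records(SAMPLE_RESPONSE)
    print(records, token)
```

A harvester would fetch the first URL, parse the response, and repeat with the returned resumption token until none is returned. The sketch leaves out the network call and error handling so that it can be run offline against the sample response.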