UVa Library Research Data Services

Creating a Data Management Plan
Sherry Lake, Data Management Consultant, University of Virginia Library, shLake@virginia.edu
November 3, 2014
© 2014 by the Rector and Visitors of the University of Virginia. This work is made available under the terms of the Creative Commons Attribution-ShareAlike 4.0 International license http://creativecommons.org/licenses/by-sa/4.0/

Road Map
We'll answer four questions in this workshop: What do we mean by data management? Why should you manage your data? What is a data management plan, and why do you need one? How do you create a data management plan?
This workshop will cover how to create a data management plan using the DMPTool.
Photo: Instagrammer ihugtrees05, https://www.facebook.com/charlottesvillevirginia

What do we mean by managing your research data?
Ensuring the physical integrity of files and helping to preserve them. Ensuring safety of content (data protection, ethics, morality, etc.). Describing the data (via metadata) and recording its history (provenance). Providing or enabling appropriate access at the right time, or restricting access, as appropriate. Transferring custody at some point, and possibly destroying the data.
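
One way to check the physical integrity of files over time is to record a checksum for each file and compare it later. The following is a minimal Python sketch and was not part of the workshop: the function names, the manifest layout, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to data_dir) to its checksum."""
    return {
        p.relative_to(data_dir).as_posix(): sha256_of(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(data_dir: Path, manifest: dict[str, str]) -> list[str]:
    """List files from the manifest that are missing or no longer match."""
    current = build_manifest(data_dir)
    return sorted(
        name for name, digest in manifest.items() if current.get(name) != digest
    )
```

Saving the manifest alongside the data when a dataset is finalized, and re-running `changed_files` after each copy or migration, gives a simple record that the files have not been altered or lost.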

Managing Data in the Data Life Cycle
Choosing file formats; backup & storage; file organization & naming conventions; file format conversions; documenting all project/file details; version control; access control & security; sharing and preservation.
Simply put, data management is all of the activities necessary to make research data discoverable, accessible, and understandable today, tomorrow, and well into the future, and it is done throughout the lifecycle.
Organization: file formats, naming conventions, and version controls.
Documentation: variable names and descriptions, code books to explain classification schemes and codes, algorithms used to transform the data, software (name and version) used to collect, view, or process the data.
Storage: active storage (where the data collected is stored while the project is ongoing), who is responsible for managing it, backup schedule and location, data privacy and security concerns.
Sharing: which data you are sharing, who is responsible for managing it, rules for access, intellectual property and/or licensing, data privacy and security concerns.
Preservation: which data you are keeping and where.
Archiving: which data you are archiving, location(s), who is responsible for managing it, backups and redundancy, access rules, data privacy and security concerns.
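
To make the idea of a naming convention with version numbers concrete, here is a small Python sketch. The pattern `project_YYYYMMDD_description_vNN.ext` is a hypothetical convention chosen for illustration, not one prescribed by the workshop, and the helper names are likewise assumptions.

```python
import re
from datetime import date

# Hypothetical convention: project_YYYYMMDD_description_vNN.ext
NAME_PATTERN = re.compile(
    r"^(?P<project>[a-z0-9]+)_(?P<date>\d{8})_"
    r"(?P<description>[a-z0-9-]+)_v(?P<version>\d{2})\.(?P<ext>[a-z0-9]+)$"
)


def make_name(project: str, day: date, description: str, version: int, ext: str) -> str:
    """Build a file name that follows the convention above."""
    return f"{project}_{day:%Y%m%d}_{description}_v{version:02d}.{ext}"


def parse_name(filename: str):
    """Return the name's parts as a dict, or None if it breaks the convention."""
    match = NAME_PATTERN.match(filename)
    return match.groupdict() if match else None


name = make_name("soil", date(2014, 11, 3), "ph-readings", 2, "csv")
print(name)                               # soil_20141103_ph-readings_v02.csv
print(parse_name("final data (2).xlsx"))  # None
```

Whatever convention a lab picks, writing it down as a pattern like this makes it easy to check a folder for files that do not conform.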

(Good) Data Management…
…helps research to be: replicated and verified; preserved for future use; linked with other research products; shared and reused.
…helps researchers: meet funding requirements; increase visibility of research; save time and effort (avoid data loss); deal with an ever-increasing amount of data.
So what is data management? It enables data preservation, making it easier to preserve data for the future. It supports sharing: you can focus on the research and not on user requests, and sharing increases research impact. It saves time: it simplifies your research and increases your research efficiency. It encourages better documentation, which lets others understand your data. It keeps funders happy by meeting their requirements. But most of all, it allows you to focus your energy on the research, which is what you want to be doing!
Image: http://www.healthcare-informatics.com/article/guest-blog-data-management-challenge-unlocking-value-clinical-data-many-times-requires-enter

Who Cares about Data Management?
If you have a journal article, you have a record of what you did stored in journals. But the data underlying the results are really important, and funders care. Colleagues are potential collaborators. Institutions and tenure committees will care more in the future. You need to care: you might need to go back to your data in a few years, so you need a good description. Future scientists could potentially use your data to discover important things. You need to be thinking about the future (providing data for them).
Slide from Carly Strasser, http://www.slideshare.net/carlystrasser. Images from Flickr by Redden-McAllister and by AJC1, and from www.rba.gov.au.

What is a Data Management Plan?
A comprehensive plan of how you will manage your research data throughout the lifecycle of your research project, AND a brief description of how you will comply with the funder's data sharing policy, reviewed as part of a grant application.
A data management plan, or DMP, is a document that helps the researcher to deal with the data generated (or otherwise obtained) in a research project. From the funder's viewpoint, a DMP is usually a document, or a section in another document, that must be submitted with a grant proposal and that describes how you will comply with their data SHARING policy.

Types of Data Management Plans
Document that is created to manage the data in your lab or project; document that is created at the start of a research project (required by funders or publishers); plan for data sharing; plan based on funder specifications on how to manage your data.
There are several ways to think about a data management plan:
A document that is created to manage the data in your lab or project. This is a 'living document' that is designed to evolve over time. It would cover the following topics: description of research; data source(s); data collection, creation and analysis; data administration; data sharing; archiving; data documentation and metadata; and budget. A typical data management plan, or handbook, might be as large as 50 pages. It would serve as a resource for the lab members, and could be used for training new members.
A document that is created at the start of a research project, which describes the data to be collected, probable sizes and formats, collection and analysis methods and tools, software, instruments, processes, workflows, and storage and sharing options. It is the blueprint of the research project.
A document which is required to be submitted to a funder as part of a grant proposal, and which describes specific data management procedures as specified by the funder.

Who's Requiring Data Management?
Require a Data Management Plan (DMP) / Require Sharing of Results (per a Data Policy): National Science Foundation (NSF); National Institutes of Health (NIH); National Oceanic and Atmospheric Administration (NOAA); Institute of Museum and Library Services (IMLS); National Endowment for the Humanities, Office of Digital Humanities (NEH); Andrew W. Mellon Foundation; NASA; NEH Preservation & Access; Institute of Education Sciences (IES); Wellcome Trust.
Why do you need a data management plan? Read calls for proposals carefully and ask the program director about specific data management requirements. Build time into your proposal development to formulate a data management plan!
Funders are private and public, in the US, UK and other countries. Other agencies require sharing, but do not explicitly require a DMP as part of a proposal: NASA, NEH Preservation & Access. NEH asks about sustainability of project deliverables and datasets (long-term preservation) and dissemination (sharing). New at NSF as of Jan. 2013: the Bio Sketch can include products of research. This list is not exhaustive.

What is in a Data Management Plan?
What goes in a DMP? It depends on the type of document you are creating, and the purpose. If it is a DMP for a grant proposal, then it will address the solicitation. If it is a lab, or project-level DMP, then it will include all of the information that is important to the smooth and efficient running of that lab or project.
There are several checklists available that can help you organize the information you will need for your DMP.
The DCC (Digital Curation Centre in the UK) has a very thorough one available at http://www.dcc.ac.uk/sites/default/files/documents/resource/DMP_Checklist_2013.pdf
MIT Libraries has another checklist at http://libraries.mit.edu/data-management/plan/checklist/
The Dataverse network, hosted by Harvard, has a checklist at http://thedata.org/book/data-management-plan-outline
The UK Data Archive also has one for sharing and archiving your data at http://www.data-archive.ac.uk/create-manage/planning-for-sharing/data-management-checklist

Parts of a (Generic) NSF Data Management Plan
Products of the Research: The types of data, samples, physical collections, software, curriculum materials, and other materials to be produced in the course of the project.
Data Formats: The standards to be used for data and metadata format and content (where existing standards are absent or deemed inadequate, this should be documented along with any proposed solutions or remedies).
Access to Data and Data Sharing Practices and Policies: Policies for access and sharing including provisions for appropriate protection of privacy, confidentiality, security, intellectual property, or other rights or requirements.
Policies for Re-Use, Re-Distribution, and Production of Derivatives.
Archiving of Data: Plans for archiving data, samples, and other research products, and for preservation of access to them.
Things to keep in mind with the NSF: Every Directorate can have additional rules for any proposal. If the solicitation you are looking at doesn't mention any additional rules for the DMP, it is always a good idea to go to the parent site and see if they list any additional requirements. For this solicitation, the Division is 'Research on Learning in Formal and Informal Settings (DRL)', and the Directorate is the 'Directorate for Education and Human Resources' (EHR). Another, easier, method is to go to the 'Dissemination and Sharing of Research Results' page at http://www.nsf.gov/bfa/dias/policy/dmp.jsp, scroll down to the appropriate Directorate, Office, Division, Program or other unit, and see if your solicitation is listed. In this case, it is: EHR has a Directorate-wide guidance document at http://www.nsf.gov/bfa/dias/policy/dmpdocs/ehr.pdf. Even if neither the solicitation nor the Division refers to it, it is a good idea to follow the guidelines in it.
Grant Proposal Guide (GPG) Chapter II.C.2.j: http://www.nsf.gov/pubs/policydocs/pappguide/nsf13001/gpg_2.jsp#dmp

Department of Energy Data Management Plan
Data Types and Sources: A brief, high-level description of the data to be generated or used through the course of the proposed research and which of these are considered digital research data necessary to validate the research findings.
Content and Format: A statement of plans for data and metadata content and format including, where applicable, a description of documentation plans, annotation of relevant software, and the rationale for the selection of appropriate standards.
Sharing and Preservation: Means for sharing, the rationale for any restrictions, and a timeline for sharing and preservation.
Protection: A statement of plans, where appropriate and necessary, to protect confidentiality, personal privacy, Personally Identifiable Information
Rationale: A discussion of the rationale or justification for the proposed data management plan.
Software: Software and data created by funded research must be released with sufficient descriptions to facilitate the validation of research results. (Optional)
II. DMPs should reflect relevant standards and community best practices for data and metadata, and make use of community accepted repositories whenever practicable.
III. Data sharing means making data available to people other than those who have generated them. Data preservation means providing for the usability of data beyond the lifetime of the research activity that generated them. This is a BIG section on what to include.
IV. Protection: DMPs must protect confidentiality, personal privacy, Personally Identifiable Information, and U.S. national, homeland, and economic security; recognize proprietary interests, business confidential information, and intellectual property rights; avoid significant negative impact on innovation and U.S. competitiveness; and otherwise be consistent with all applicable laws, regulations,
V. Rationale: the potential impact of the data within the immediate field and in other fields, and any broader societal impact.
Source: Suggested Elements for a Data Management Plan, http://science.energy.gov/funding-opportunities/digital-data-management/suggested-elements-for-a-dmp/

How to Create a Data Management Plan?
DMPTool (http://dmptool.org): step-by-step wizard for generating a DMP; create | edit | re-use | share | save | generate; open to the community; links to institutional resources; Directorate information & updates.
Fourth: How do I create a data management plan? So you are drafting a DMP for a specific funder: you intend to respond, or are already responding, to a solicitation for funding. What are the data management requirements for that funder? You need to identify what the funder requirements are for your solicitation. There are several ways to do this. We'll look at one method, which is to go to the funding agency's website, look up the solicitation that you are submitting a research proposal to, and read the documentation to determine what documents must be submitted to ask for funding.
As an example, we will look at two different funders' websites. NSF, or National Science Foundation: Discovery Research K-12 (DRK-12), Solicitation 13-601, http://www.nsf.gov/funding/pgm_summ.jsp?pims_id=500047. NSF funding website: http://www.nsf.gov/funding/. The left menu takes you to a link for the 'Grant Proposal Guide', an 80-page document you can download. The information about data management plans is in section 2, part "j", 'Special Information and Supplementary Documentation', pages 37-38. The left menu also takes you to the 'Grant.gov Application Guide', which takes you to another page, and then to a downloadable 64-page guide. Then you have to find the page you need, which is section 4, page 34, Data Management Plans. NSF Data Management & Sharing FAQs are at http://www.nsf.gov/bfa/dias/policy/dmpfaqs.jsp

http://dmptool.org
HOME PAGE: Top Menu; DMP Requirements; Log in (select UVa from the pull-down, log in via NetBadge). Dashboard: My Profile; Notifications; ORCID iD.

HOME PAGE: Top Menu; DMP Requirements; Log in (select UVa from the pull-down, log in via NetBadge).
Dashboard: Help links & contact us; Profile; MyDMPs; Create New DMP.
My Profile: Notifications; ORCID iD.
MyDMPs: Heading; Visibility; Select Template OR copy a public DMP or one of your own.
Select Template (open NSF): Overview screen (Visibility!!!); Details screen. Go over the outline and its relationship to the instructions/guidance/links on the right. STOP and talk about what to put in this box.
DMP Preview (export and print). Review option: explain what that does.

Types of Data & Other Information
Types of data produced. Relationship to existing data. How/when/where will the data be captured or created? How will the data be processed? Quality assurance & quality control measures. Security: version control, backing up. Who will be responsible for data management during/after the project?
Types: experimental, observational, raw or derived, physical collections, models, simulations, curriculum materials, software, etc. How will data be collected? Are there tools or software needed to create/process/visualize the data?
SciDaC Tip: Describe in general any descriptive or analytical statistics that will be run on the data. OR: Could include data generated by computer, data collected from sensors or instruments, images, audio files, video files, reports, surveys, patient records, and/or other.

Data & Metadata Standards
Identify the formats of data files created over the course of the project. What metadata are needed to make the data meaningful? How will you create or capture these metadata? Why have you chosen particular standards and approaches for metadata?
Data documentation (metadata) explains: how data was created; what the data mean; what the content & structure is; what manipulations have taken place. It ensures data understanding in the long term.
Data documentation includes information on: the project; data collection methods; structure of the data files; data sources used.
At the data level, it includes information on: labels and descriptions for variables & records; codes and classifications; derived data algorithms.
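
As an illustration of data-level documentation, the sketch below starts a codebook from a CSV file's header row, leaving the label, units, and codes for the researcher to fill in. It is a hedged example and not part of the workshop: the function name, the codebook fields, and the sample data are all assumptions.

```python
import csv
import io


def codebook_skeleton(csv_text: str) -> list[dict]:
    """Return one codebook entry per column, with blanks left to fill in by hand."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    entries = []
    for name in reader.fieldnames or []:
        # Values that are actually present (empty cells count as missing).
        values = [row[name] for row in rows if row.get(name) not in ("", None)]
        entries.append({
            "variable": name,
            "label": "",   # human-readable description, written by the researcher
            "units": "",   # e.g. measurement units, if any
            "codes": "",   # meaning of any coded values
            "example": values[0] if values else "",
            "missing": len(rows) - len(values),
        })
    return entries


# Hypothetical sample data, for illustration only.
sample = "site,ph\nA,6.5\nB,\n"
for entry in codebook_skeleton(sample):
    print(entry["variable"], entry["example"], entry["missing"])
```

The generated skeleton only captures structure. The descriptions, code meanings, and derivation notes still have to come from the people who collected the data.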

Policies for Access & Sharing; Policies for Re-use & Re-distribution
Are you under any obligation to share data? How, when, & where will you make the data available? What is the process for gaining access to the data? Who owns the copyright and/or intellectual property? Will you retain rights before opening data to wider use? How long? Are permission restrictions necessary? Embargo periods for political/commercial/patent reasons? Ethical and privacy issues? Who are the foreseeable data users? How should your data be cited?
Embargo period: Does the original data collector/creator/principal investigator retain the right to use the data before opening it up to wider use?
A question we have been asking: Who owns the data you collect during your research grant? See the SciDaC guidelines (Data Rights and Responsibilities Guidance) on our web page. Cover copyright, and licensing if required.
If you are planning on restricting access, use or dissemination of the data, you must explain in this section how you will codify and communicate these restrictions.

Plans for Archiving & Preservation
What data will be preserved for the long term? For how long? Where will data be preserved? What data transformations need to occur before preservation? What metadata or documentation will be submitted alongside the datasets? Who will be responsible for preparing data for preservation? Who will be the main contact person for the archived data?
UVA policy states that "data will be preserved for a minimum of five years upon completion of the project"; explain if you'll be preserving the data longer than five years. Policy: Laboratory Notebook and Recordkeeping, https://policy.itc.virginia.edu/policy/policydisplay?id=RES-002
Places to archive your data: The University of Virginia is developing an institutional repository (Libra), which will serve as an ideal long-term storage facility for digital research data. Options include: deposit in a discipline-specific repository; deposit in an institutional repository; make accessible on an online project web page; make accessible on an institutional web site; share informally on a peer-to-peer basis; submit to a journal.

Questions and Discussion?
You now have a pretty good idea of what data management is, why you should manage your data, what a DMP is, and how to create one. You've learned that it can be difficult to find all of the information you need on the agency websites. You've learned that it is very important to have all of the information from the funder. You've learned that the DMPTool is a good resource for funder information, and general and UVa-specific guidance and assistance. You've used the DMPTool to create a few DMPs.

Follow-up
Contact the Data Management Consulting Group for help with DMP preparation and with data management during your project: http://data.library.virginia.edu/data-management/dmp-support/
Email: DMConsult@virginia.edu