
1 UVa Library Research Data Services

2 Creating a Data Management Plan
Sherry Lake Data Management Consultant University of Virginia Library November 3, 2014 © 2014 by the Rector and Visitors of the University of Virginia. This work is made available under the terms of the Creative Commons Attribution-ShareAlike 4.0 International license

3 Road Map
We’ll answer four questions in this workshop:
What do we mean by data management?
Why should you manage your data?
What is a data management plan, and why do you need one?
How do you create a data management plan?
This workshop will cover what a data management plan is, why you need one, and how to create one using the DMPTool.
Photo: Instagrammer ihugtrees05

4 What do we mean by … Managing your data…
Ensuring physical integrity of files and helping to preserve them
Ensuring safety of content (data protection, ethics, morality, etc.)
Describing the data (via metadata) and recording its history (provenance)
Providing or enabling appropriate access at the right time, or restricting access, as appropriate
Transferring custody at some point, and possibly destroying the data
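One common way to check the physical integrity of files is to record a checksum for each file and compare it later. The slides do not name a method, so the following is only a minimal sketch under that assumption; the function names are hypothetical, and SHA-256 is one choice among several.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path) -> dict:
    """Map each file's path (relative to folder) to its checksum."""
    return {
        p.relative_to(folder).as_posix(): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def changed_files(folder: Path, manifest: dict) -> list:
    """List files from an earlier manifest that are now missing or altered."""
    current = build_manifest(folder)
    return sorted(name for name in manifest if current.get(name) != manifest[name])
```

A manifest built when the data are first stored can be saved next to the data; running changed_files against it after a copy or a restore from backup shows which files no longer match.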

5 Managing Data in the Data Life Cycle
Choosing file formats
Backup & storage
File organization & naming conventions
File format conversions
Document all project/file details
Version control
Access control & security
Sharing and preservation
Simply put, data management is all of the activities necessary to make research data discoverable, accessible, and understandable today, tomorrow, and well into the future. It is done throughout the lifecycle:
Organization: file formats, naming conventions, and version controls.
Documentation: variable names and descriptions, code books to explain classification schemes and codes, algorithms used to transform the data, software (name and version) used to collect, view, or process the data.
Storage: active storage (where the data collected is stored while the project is ongoing), who is responsible for managing it, backup schedule and location, data privacy and security concerns.
Sharing: which data are you sharing, who is responsible for managing it, rules for access, intellectual property and/or licensing, data privacy and security concerns.
Preservation: which data are you keeping and where.
Archiving: which data are you archiving, location(s), who is responsible for managing it, backups and redundancy, access rules, data privacy and security concerns.
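A file naming convention is easier to follow when the names are generated rather than typed by hand. The sketch below shows one possible convention (project, description, date, version); the pattern, the function name, and the example values are all assumptions for illustration, not a convention prescribed by this workshop.

```python
import re
from datetime import date


def build_filename(project: str, description: str, when: date,
                   version: int, ext: str) -> str:
    """Build a name of the form project_description_YYYYMMDD_vNN.ext."""
    def slug(text: str) -> str:
        # Lowercase the text and replace runs of other characters with a hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    extension = ext.lstrip(".").lower()
    return f"{slug(project)}_{slug(description)}_{when:%Y%m%d}_v{version:02d}.{extension}"


# Hypothetical project and file description.
print(build_filename("Soil Survey", "Site A moisture", date(2014, 11, 3), 2, "CSV"))
# prints: soil-survey_site-a-moisture_20141103_v02.csv
```

Putting the date in year-month-day order and zero-padding the version number means the files sort correctly in an ordinary directory listing.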

6 (Good) Data Management…
…helps research to be:
Replicated and verified
Preserved for future use
Linked with other research products
Shared and reused
…helps researchers:
Meet funding requirements
Increase visibility of research
Save time and effort (avoid data loss)
Deal with an ever-increasing amount of data
So what does good data management do?
Enables data preservation: makes preserving data for the future easier
Supports sharing: you can focus on the research and not user requests; increases research impact
Saves time: simplifies your research and increases your research efficiency
Encourages better documentation: lets others understand your data
Keeps funders happy: meets requirements
But most of all, it allows you to focus your energy on the research, which is what you want to be doing!

7 Who Cares about Data Management?
If you have a journal article, you have a record of what you did stored in the journal. But the data underlying the results are really important, and many groups care about them:
Funders
Colleagues: potential collaborators
Institutions
Tenure committees: more so in the future
You: you might need to go back to your data in a few years, so you need a good description
Future scientists: they could potentially use your data to discover important things, so you need to be thinking about the future and providing data for them
Slide from Carly Strasser
Photos: From Flickr by Redden-McAllister; From Flickr by AJC1

8 What is a Data Management Plan?
A comprehensive plan of how you will manage your research data throughout the lifecycle of your research project
AND
Brief description of how you will comply with funder’s data sharing policy
Reviewed as part of a grant application
A data management plan, or DMP, is a document that helps the researcher to deal with the data generated (or otherwise obtained) in a research project. From the funder’s viewpoint, a DMP is usually a document, or a section in another document, that must be submitted with a grant proposal and that describes how you will comply with their data SHARING policy.

9 Types of Data Management Plans
Document that is created to manage the data in your lab or project
Document that is created at the start of a research project (required by funders or publishers)
Plan for data sharing
Plan based on funder specifications on how to manage your data
There are several ways to think about a data management plan:
A document that is created to manage the data in your lab or project. This is a ‘living document’ that is designed to evolve over time. It would cover the following topics: Description of research; Data source(s); Data collection, creation and analysis; Data administration; Data sharing; Archiving; Data documentation and metadata; and Budget. A typical data management plan, or handbook, might be as large as 50 pages. It would serve as a resource for the lab members, and could be used for training new members.
A document that is created at the start of a research project, which describes the data to be collected, probable sizes and formats, collection and analysis methods and tools, software, instruments, processes, workflows, and storage and sharing options. It is the blueprint of the research project.
A document which is required to be submitted to a funder as part of a grant proposal, and which describes specific data management procedures as specified by the funder.

10 Who’s Requiring Data Management?
Require a Data Management Plan (DMP) / Require Sharing of Results (per a Data Policy):
National Science Foundation (NSF)
National Institutes of Health (NIH)
National Oceanic and Atmospheric Administration (NOAA)
Institute of Museum and Library Services (IMLS)
National Endowment for the Humanities, Office of Digital Humanities (NEH)
Andrew W. Mellon Foundation
NASA
NEH, Preservation & Access
IES, Institute of Education Sciences
Wellcome Trust
Why do you need a data management plan? Read calls for proposals carefully and ask the program director about specific data management requirements. Build time into your proposal development to formulate a data management plan!
The funders are private and public, in the US, UK, and other countries. Other agencies, such as NASA and NEH Preservation & Access, require sharing but do not explicitly require a DMP as part of a proposal. NEH asks about sustainability of project deliverables and datasets (long-term preservation) and dissemination (sharing). New at NSF as of Jan: the Bio Sketch can include products of research. This list is not all-inclusive.

11 What is in a Data Management Plan?
What goes in a DMP? It depends on the type of document you are creating, and the purpose. If it is a DMP for a grant proposal, then it will address the solicitation. If it is a lab- or project-level DMP, then it will include all of the information that is important to the smooth and efficient running of that lab or project. There are several checklists available that can help you organize the information you will need for your DMP. The DCC (Digital Curation Centre in the UK) has a very thorough one. MIT Libraries has another checklist. The Dataverse Network, hosted by Harvard, has a checklist. The UK Data Archive also has one for sharing and archiving your data.

12 Parts of a (Generic) NSF Data Management Plan
Products of the Research: The types of data, samples, physical collections, software, curriculum materials, and other materials to be produced in the course of the project.
Data Formats: The standards to be used for data and metadata format and content (where existing standards are absent or deemed inadequate, this should be documented along with any proposed solutions or remedies).
Access to Data and Data Sharing Practices and Policies: Policies for access and sharing including provisions for appropriate protection of privacy, confidentiality, security, intellectual property, or other rights or requirements.
Policies for Re-Use, Re-Distribution, and Production of Derivatives.
Archiving of Data: Plans for archiving data, samples, and other research products, and for preservation of access to them.
Things to keep in mind with the NSF: Every Directorate can have additional rules for any proposal. If the solicitation you are looking at doesn’t mention any additional rules for the DMP, it is always a good idea to go to the parent site and see if they list any additional requirements. For this solicitation, the Division is ‘Research on Learning in Formal and Informal Settings (DRL)’, and the Directorate is the ‘Directorate for Education and Human Resources’ (EHR). Another, easier, method is to go to the ‘Dissemination and Sharing of Research Results’ page, scroll down to the appropriate Directorate, Office, Division, Program or other unit, and see if your solicitation is listed. In this case, it is: EHR has a Directorate-wide Guidance document. Even if neither the solicitation nor the Division refers to it, it is a good idea to follow the guidelines in it.
Grant Proposal Guide (GPG) Chapter II.C.2.j

13 Department Of Energy Data Management Plan
Data Types and Sources: A brief, high-level description of the data to be generated or used through the course of the proposed research and which of these are considered digital research data necessary to validate the research findings.
Content and Format: A statement of plans for data and metadata content and format including, where applicable, a description of documentation plans, annotation of relevant software, and the rationale for the selection of appropriate standards.
Sharing and Preservation: Means for sharing and the rationale for any restrictions and a timeline for sharing and preservation.
Protection: A statement of plans, where appropriate and necessary, to protect confidentiality, personal privacy, Personally Identifiable Information
Rationale: A discussion of the rationale or justification for the proposed data management plan.
Software: Software and data created by funded research must be released with sufficient descriptions to facilitate the validation of research results. (Optional)
II. DMPs should reflect relevant standards and community best practices for data and metadata, and make use of community accepted repositories whenever practicable.
III. Data sharing means making data available to people other than those who have generated them. Data preservation means providing for the usability of data beyond the lifetime of the research activity that generated them. This is a BIG section on what to include.
IV. Protection: DMPs must protect confidentiality, personal privacy, Personally Identifiable Information, and U.S. national, homeland, and economic security; recognize proprietary interests, business confidential information, and intellectual property rights; avoid significant negative impact on innovation, and U.S. competitiveness; and otherwise be consistent with all applicable laws, regulations,
V. Rationale: the potential impact of the data within the immediate field and in other fields, and any broader societal impact.
------ Suggested Elements for a Data Management Plan

14 How to Create a Data Management Plan?
Step-by-step wizard for generating DMP
Create | edit | re-use | share | save | generate
Open to community
Links to institutional resources
Directorate information & updates
Fourth: How do I create a data management plan? So you are drafting a DMP for a specific funder; you intend to respond, or are already responding, to a solicitation for funding. What are the data management requirements for that funder? You need to identify what the funder requirements are for your solicitation. There are several ways to do this. We’ll look at one method, which is to go to the funding agency’s website, look up the solicitation that you are submitting a research proposal to, and read the documentation to determine what documents must be submitted to ask for funding.
As an example, we will look at two different funders’ websites: NSF, or National Science Foundation: Discovery Research K-12 (DRK-12) solicitation. On the NSF funding website, the left menu takes you to a link for the ‘Grant Proposal Guide’, an 80-page document you can download. The information about data management plans is in section 2, part “j”, ‘Special Information and Supplementary Documentation’. The left menu also takes you to the ‘Grant.gov Application Guide’, which takes you to another page, and then to a downloadable 64-page guide. Then you have to find the page you need, which is section 4, page 34, Data Management Plans. NSF also publishes Data Management & Sharing FAQs.

15 http://dmptool.org
HOME PAGE
Top Menu
DMP Requirements
Log in: select UVa from pull-down, log in via NetBadge
Dashboard
My Profile
Notifications
ORCID iD

16 HOME PAGE
Top Menu
DMP Requirements
Log in: select UVa from pull-down, log in via NetBadge
Dashboard
Help links & contact us
Profile
MyDMPs
Create New DMP
My Profile
Notifications
ORCID iD
MyDMPs
Heading
Visibility
Select Template OR copy a public DMP or one of your own
Select Template (open NSF)
Overview screen (Visibility!!!)
Details screen
Go over the outline and its relationship to the instructions/guidance/links on the right
STOP and talk about what to put in this box.
DMP Preview (export and print)
Review option: explain what that does.

17 Types of Data & Other Information
Types of data produced
Relationship to existing data
How/when/where will the data be captured or created?
How will the data be processed?
Quality assurance & quality control measures
Security: version control, backing up
Who will be responsible for data management during/after project?
Types: experimental, observational, raw or derived, physical collections, models, simulations, curriculum materials, software, etc. These could include data generated by computer, data collected from sensors or instruments, images, audio files, video files, reports, surveys, patient records, and/or other data.
How will data be collected? Are there tools or software needed to create/process/visualize the data?
SciDaC Tip: Describe in general any descriptive or analytical statistics that will be run on the data.

18 Data & Metadata Standards
Identify the formats of data files created over the course of the project
What metadata are needed to make the data meaningful?
How will you create or capture these metadata?
Why have you chosen particular standards and approaches for metadata?
Data documentation (metadata) explains: how the data were created, what the data mean, what the content & structure are, and what manipulations have taken place. It ensures data understanding in the long term.
Data documentation includes information on: the project, data collection methods, structure of the data files, and data sources used.
At the data level, it includes information on: labels and descriptions for variables & records, codes and classifications, and derived data algorithms.
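Data-level documentation of this kind can be kept in a small machine-readable codebook stored alongside the data file. The following is only an illustrative sketch: the dataset name, variable names, codes, and field layout are all hypothetical, and a real project would follow whatever metadata standard its discipline or repository expects.

```python
import json

# Hypothetical codebook: one entry per variable, with a label, a type,
# and (where relevant) codes, units, or the derivation used.
codebook = {
    "dataset": "example_survey.csv",
    "variables": [
        {"name": "resp_id", "label": "Respondent identifier", "type": "integer"},
        {"name": "age_grp", "label": "Age group", "type": "categorical",
         "codes": {"1": "18-29", "2": "30-49", "3": "50+", "-9": "Missing"}},
        {"name": "bmi", "label": "Body mass index", "type": "float",
         "units": "kg/m^2", "derived_from": "weight_kg / height_m ** 2"},
    ],
}


def undocumented(columns, book):
    """Return the data columns that the codebook does not describe."""
    documented = {variable["name"] for variable in book["variables"]}
    return [column for column in columns if column not in documented]


# Serialize the codebook so it can be saved next to the data file.
print(json.dumps(codebook, indent=2))

# Check a (hypothetical) list of column headers against the codebook.
print(undocumented(["resp_id", "age_grp", "bmi", "zip"], codebook))
# prints: ['zip']
```

Running a check like undocumented before sharing or depositing a dataset is one way to catch variables that were added during analysis but never described.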

19 Policies for Access & Sharing Policies for Re-use & Re-distribution
Are you under any obligation to share data?
How, when, & where will you make the data available?
What is the process for gaining access to the data?
Who owns the copyright and/or intellectual property?
Will you retain rights before opening data to wider use? How long?
Are permission restrictions necessary?
Embargo periods for political/commercial/patent reasons?
Ethical and privacy issues?
Who are the foreseeable data users?
How should your data be cited?
Embargo period: Does the original data collector/creator/principal investigator retain the right to use the data before opening it up to wider use?
A question we have been asking: Who owns the data you collect during your research grant? See the SciDaC guidelines, Data Rights and Responsibilities Guidance, on our web page. Cover copyright, and licensing if required.
If you are planning on restricting access, use or dissemination of the data, you must explain in this section how you will codify and communicate these restrictions.

20 Plans for Archiving & Preservation
What data will be preserved for the long term? For how long?
Where will data be preserved?
What data transformations need to occur before preservation?
What metadata or documentation will be submitted alongside the datasets?
Who will be responsible for preparing data for preservation?
Who will be the main contact person for the archived data?
UVA policy states that “data will be preserved for a minimum of five years upon completion of the project”; explain if you’ll be preserving the data longer than five years. Policy: Laboratory Notebook and Recordkeeping.
Places to archive your data: The University of Virginia is developing an institutional repository (Libra), which will serve as an ideal long-term storage facility for digital research data.
Deposit in discipline-specific repository
Deposit in Institutional Repository
Make accessible on online project web page
Make accessible on institutional web site
Informally on a peer-to-peer basis
Submitting to a journal

21 Questions and Discussion?
You now have a pretty good idea of what data management is, why you should manage your data, what a DMP is, and how to create one. You’ve learned that it can be difficult to find all of the information you need on the agency websites. You’ve learned that it is very important to have all of the information from the funder. You’ve learned that the DMPTool is a good resource for funder information, and general and UVa-specific guidance and assistance. You’ve used the DMPTool to create a few DMPs.

22 Follow-up
Contact the Data Management Consulting Group for help with:
DMP preparation
Data Management during your project

