21 April 2015

- NIH + NSF Data Sharing Policies
- What is a Data Management Plan
- Accountability
- Data Products, Format, and Metadata
- Storage, Sharing, and Metadata
- Budgeting for Data Management
- Resources

- Applies to the sharing of final research data* for research purposes.
- Applies to basic research, clinical studies, surveys, and other types of research supported by NIH and to research that involves human subjects and laboratory research that does not involve human subjects.
- Applies to applicants seeking $500,000 or more in direct costs in any year of the proposed project period through grants, cooperative agreements, or contracts.
- Applies to research applications submitted beginning October 1,

* Final Research Data - Recorded factual material commonly accepted in the scientific community as necessary to document and support research findings. This does not mean summary statistics or tables. It means the data on which summary statistics and tables are based. For the purposes of this policy, final research data do not include laboratory notebooks, partial datasets, preliminary analyses, drafts of scientific papers, plans for future research, peer review reports, communications with colleagues, or physical objects, such as gels or laboratory specimens.

- “Investigators are expected to share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants.”
- “Grantees are expected to encourage and facilitate such sharing.”
- Data refers to any information that can be stored in digital form, including text, numbers, images, video or movies, audio, software, algorithms, equations, animations, models, simulations, etc. Such data may be generated by various means including observation, computation, or experiment.
- Applies to research applications submitted on or after January 18, 2011.

- Research data collections
  - Products of one or a few focused research projects
- Resource or community data collections
  - Serve a specific research community
  - Typically fall between research and reference data collections in size, scale, funding, community of users, and duration
  - Conform to community standards
- Reference data collections
  - Serve large segments of the research and education communities
  - Conform to robust and comprehensive standards

- An opportunity for PIs to articulate how they will conform to the FEDERAL data sharing policy for research results.
- The DMP is reviewed as an integral part of the proposal, coming under ‘Intellectual Merit’ or ‘Broader Impacts’ or both, as appropriate for the scientific community of relevance.
- Data management requirements and plans may vary across specific Directorates, Offices, Divisions, Programs, or other NSF/NIH units.

- The types of data, samples, physical collections, software, curriculum materials, publications, and other materials to be produced in the course of the project;
- The standards to be used for data and metadata format and content (where existing standards are absent or deemed inadequate, this should be documented along with any proposed solutions or remedies);
- Policies for access and sharing including provisions for appropriate protection of privacy, confidentiality, security, intellectual property, or other rights or requirements;
- Policies and provisions for re-use, re-distribution, and the production of derivatives; and
- Plans for archiving data, samples, and other research products, and for preservation of access to them.

| Element | Description | ? | NSF Mapping |
|---|---|---|---|
| Data description | A description of the information to be gathered; the nature and scale of the data that will be generated or collected. | Yes | Expected Data |
| Existing data | A survey of existing data relevant to the project and a discussion of whether and how these data will be integrated. | Yes | Expected Data |
| Format | Formats in which the data will be generated, maintained, and made available, including a justification for the procedural and archival appropriateness of those formats. | Yes | Data Format and Dissemination |
| Metadata | A description of the metadata to be provided along with the generated data, and a discussion of the metadata standards used. | Yes | Data Format and Dissemination |
| Storage and backup | Storage methods and backup procedures for the data, including the physical and cyber resources and facilities that will be used for the effective preservation and storage of the research data. | Yes | Data Storage and Preservation of Access |
| Security | A description of technical and procedural protections for information, including confidential information, and how permissions, restrictions, and embargoes will be enforced. | Yes | Data Format and Dissemination |
| Responsibility | Names of the individuals responsible for data management in the research project. | Yes | Roles and Responsibility |
| Intellectual property rights | Entities or persons who will hold the intellectual property rights to the data, and how IP will be protected if necessary. Any copyright constraints (e.g., copyrighted data collection instruments) should be noted. | Yes | Data Format and Dissemination |
| Access and sharing | A description of how data will be shared, including access procedures, embargo periods, technical mechanisms for dissemination and whether access will be open or granted only to specific user groups. A timeframe for data sharing and publishing should also be provided. | Yes | Data Storage and Preservation of Access |
| Audience | The potential secondary users of the data. | Yes | Data Format and Dissemination |
| Selection and retention periods | A description of how data will be selected for archiving, how long the data will be held, and plans for eventual transition or termination of the data collection in the future. | Yes | Period of Data Retention |
| Archiving and preservation | The procedures in place or envisioned for long-term archiving and preservation of the data, including succession plans for the data should the expected archiving entity go out of existence. | Yes | Data Storage and Preservation of Access |
| Ethics and privacy | A discussion of how informed consent will be handled and how privacy will be protected, including any exceptional arrangements that might be needed to protect participant confidentiality, and other ethical issues that may arise. | Yes | Data Format and Dissemination |
| Budget | The costs of preparing data and documentation for archiving and how these costs will be paid. Requests for funding may be included. | | |
| Data organization | How the data will be managed during the project, with information about version control, naming conventions, etc. | | |
| Quality Assurance | Procedures for ensuring data quality during the project. | | |
| Legal requirements | A listing of all relevant federal or funder requirements for data management and data sharing. | | |
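The element-to-section mapping in the table above can also be kept as a simple lookup, which makes it easy to see which elements feed each NSF section of a draft plan. The following is only a sketch: the dictionary copies the thirteen mapped rows from the table, while the function names and the idea of tracking "covered" elements are illustrative assumptions, not part of any NSF or NIH tooling.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Rows of the table that carry an NSF mapping (elements without one are omitted).
ELEMENT_TO_NSF_SECTION: Dict[str, str] = {
    "Data description": "Expected Data",
    "Existing data": "Expected Data",
    "Format": "Data Format and Dissemination",
    "Metadata": "Data Format and Dissemination",
    "Storage and backup": "Data Storage and Preservation of Access",
    "Security": "Data Format and Dissemination",
    "Responsibility": "Roles and Responsibility",
    "Intellectual property rights": "Data Format and Dissemination",
    "Access and sharing": "Data Storage and Preservation of Access",
    "Audience": "Data Format and Dissemination",
    "Selection and retention periods": "Period of Data Retention",
    "Archiving and preservation": "Data Storage and Preservation of Access",
    "Ethics and privacy": "Data Format and Dissemination",
}


def elements_by_section(mapping: Dict[str, str]) -> Dict[str, List[str]]:
    """Group plan elements under the NSF section they map to."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for element, section in mapping.items():
        grouped[section].append(element)
    return dict(grouped)


def missing_elements(covered: Iterable[str], mapping: Dict[str, str]) -> List[str]:
    """List mapped elements that a draft plan has not addressed yet (hypothetical helper)."""
    done = set(covered)
    return [element for element in mapping if element not in done]


sections = elements_by_section(ELEMENT_TO_NSF_SECTION)
todo = missing_elements(["Data description", "Format"], ELEMENT_TO_NSF_SECTION)
```

Here `sections["Expected Data"]` lists the two elements that belong in that part of the plan, and `todo` is the remaining checklist for a draft that has so far covered only data description and format.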

- Explains how the responsibilities regarding the management of your data will be delegated.
- Time allocations
- Project management of technical aspects
- Training requirements
- Contributions of non-project staff - individuals should be named where possible (custodians of the repository/archive you choose to store your data)

- Outlines the staff/organizational roles and responsibilities for implementing this data management plan.
- Who will be responsible for data management and for monitoring the data management plan?
- How will adherence to this data management plan be checked or demonstrated?
- What process is in place for transferring responsibility for the data?
- Who will have responsibility over time for decisions about the data once the original personnel are no longer available?

- Is the data regulated by policy or law?
- Are there legal constraints (e.g., HIPAA) on sharing data?
- How will you handle informed consent with respect to communicating to respondents that the information they provide will remain confidential when data are shared or made available for secondary analysis?
- Determine constraints if the data are classified, have specific handling requirements, or involve IRB/human subject research
- If yes, how will you comply with these constraints?
- Write your compliance plan point by point
- If applicable, how will you manage disclosure risk in the data to be shared and archived?
- Are there intellectual property rights (e.g., patent, copyright) on the datasets?
- Determine restrictions and conditions to share and disseminate
- Does someone else own the data? What are their conditions for use, sharing, and dissemination?

- Determine DMPs as established by any international research consortia or set forth in formal science and technology agreements signed by the United States Government and foreign counterparts.
- This should be addressed with any international research partners when first planning a collaboration.
- Talk to the Program Officer for additional assistance.

- Inputs and outputs (existing, intermediary, and final datasets)
- Existing data and sources you are using (digital and physical collections)
- Quantitative Social and Economic Data Sets
  - Numeric data sets, geospatial data, spatio-temporal data
- Qualitative Information
  - Microfilms, historical documents, oral interviews, video tapes, hand written records, transcripts, tables, figures, flowcharts, 3D models, digital audio
- Experimental Research
  - Tabulated data
- Mathematical and Computer Models
  - May include descriptions in published articles or fully documented and robust versions of these models

- Determine formats and estimated size, and whether the data will be shared
- Formats: RTF text, MS Excel converted to CSV, MATLAB, PNG (images), WAV audio, MPEG video, shapefile, as well as any instrument-specific formats or software
- Size/amount: Rate produced, e.g., 1 TB/year, 50 GB/experiment
- Metadata should be machine readable for better re-usability and processing.

HINT: Sketching a diagram of the data workflow helps to identify datasets and issues regarding their management.
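Turning a production rate such as 1 TB/year or 50 GB/experiment into a total is simple multiplication, but writing it down makes the assumptions explicit. The sketch below assumes decimal units (1 TB = 1000 GB); the project length, the number of experiments, and the number of backup copies are hypothetical values chosen only for illustration.

```python
def estimate_storage_gb(rate_gb: float, periods: int, copies: int = 1) -> float:
    """Total storage in GB for a constant production rate.

    rate_gb: data produced per period (per year, per experiment, ...), in GB
    periods: number of periods (years, experiments, ...)
    copies:  number of stored copies, counting the primary and each backup
    """
    if rate_gb < 0 or periods < 0 or copies < 1:
        raise ValueError("rate and periods must be non-negative; copies must be >= 1")
    return rate_gb * periods * copies


# 1 TB/year over an assumed 3-year project, primary copy plus two backups
three_year_total = estimate_storage_gb(rate_gb=1000, periods=3, copies=3)

# 50 GB/experiment over an assumed 40 experiments, single copy
experiments_total = estimate_storage_gb(rate_gb=50, periods=40)
```

Counting backup copies in the estimate matters because storage costs are typically budgeted for every copy held, not only the primary one.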

- Give a short description of what "data" will mean in your research
- What data will be generated in the research?
- What data types will you be creating or capturing?
- How will you capture or create the data?
- If you will be using existing data, state that fact and include where you got it.
- What is the relationship between the data you are collecting and the existing data?
- What data will be preserved and shared?

- “Data about data”
- Typical functions
  - Discovery tool
  - Rights management
  - Version identification
  - Certify authenticity
  - Status indicator
  - Defines content structure
  - Interoperability
  - Situates geospatially
  - Process descriptions
  - Access and transfer

Objectives: Objectives, Principles
Domains: Discipline, Genre, Format
Architecture: Structure, Extent, Granularity

- What details (metadata) are necessary for others to use your data?
- List standards for formats or metadata for your datasets.
- Document why you selected them
- Describe the method by which metadata will be generated.
- Document naming conventions/schema for your data.
- List the data dictionaries/taxonomies/ontologies you will use for your data.
- Describe how you will track versions of the datasets.
- List and describe the tools that are necessary to use the datasets.
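As one way to make the naming-convention, versioning, and metadata-generation items above concrete, the sketch below builds a predictable file name and a machine-readable JSON record for a dataset. The naming pattern (`project_dataset_YYYYMMDD_vNN.ext`) and the metadata field names are assumptions made for illustration; they are not drawn from any particular metadata standard, and a real plan would use the fields its chosen standard requires.

```python
import json
import re
from datetime import date


def build_filename(project: str, dataset: str, collected: date, version: int, ext: str) -> str:
    """Build a file name following a hypothetical project_dataset_date_version pattern."""

    def slug(text: str) -> str:
        # Lower-case and replace each run of non-alphanumerics with one hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return f"{slug(project)}_{slug(dataset)}_{collected:%Y%m%d}_v{version:02d}.{ext.lstrip('.')}"


def build_metadata(filename: str, title: str, creator: str, description: str, version: int) -> str:
    """Serialize a minimal descriptive record as JSON (field names are illustrative)."""
    record = {
        "file": filename,
        "title": title,
        "creator": creator,
        "description": description,
        "version": version,
    }
    return json.dumps(record, indent=2, sort_keys=True)


name = build_filename("Soil Survey", "Site A pH", date(2015, 4, 21), 2, ".csv")
metadata_json = build_metadata(
    filename=name,
    title="Site A pH readings",
    creator="Example Lab",  # placeholder, not a real creator
    description="Hypothetical dataset used to illustrate the naming scheme.",
    version=2,
)
```

Putting the date and a zero-padded version in the name keeps files sortable, and writing the record next to the data file gives later users the details they need without opening the data itself.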

- OAIS, Open Archival Information System
- CSDGM, Content Standard for Digital Geospatial Metadata
- ICPSR, Inter-university Consortium for Political and Social Research
- DDI, Data Documentation Initiative
  - best practices: data life cycle and longitudinal data
- SDMX, Statistical Data and Metadata eXchange
- XML, Extensible Markup Language

- Citation is the preferred form of acknowledgement
- Should include a DOI or a PURL (Persistent Uniform Resource Locator) to establish the authoritative data source

Citation: Involuntary Commitment Data, public use dataset [restricted use data, if appropriate]. Produced and distributed by the PSRDC, College of Behavioral and Community Sciences, University of South Florida (year data were downloaded). URL

Acknowledgement: The collection of data used in this study was partly supported by the National Institutes of Health under grant number R01 HD and the National Science Foundation under award number
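The citation pattern shown above (title, access level, producer, year downloaded, persistent locator) is regular enough to generate from a few fields, which helps keep citations consistent across a project. This is a minimal sketch of that one pattern; the function name is made up, and the year and DOI in the usage example are placeholders rather than real values.

```python
def format_data_citation(
    title: str,
    access_level: str,
    producer: str,
    year_downloaded: int,
    locator: str,
) -> str:
    """Assemble a data citation in the pattern used on the slide.

    locator should be a persistent identifier (DOI or PURL) or, failing that, a URL.
    """
    if not locator.strip():
        raise ValueError("a DOI, PURL, or URL is needed to point at the data source")
    return (
        f"{title}, {access_level}. "
        f"Produced and distributed by {producer} ({year_downloaded}). {locator}"
    )


citation = format_data_citation(
    title="Involuntary Commitment Data",
    access_level="public use dataset",
    producer="the PSRDC, College of Behavioral and Community Sciences, University of South Florida",
    year_downloaded=2015,  # placeholder year
    locator="https://doi.org/10.0000/placeholder",  # placeholder DOI, not a real identifier
)
```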

- Document which of the digital or non-digital datasets listed will NOT be stored or retained during the project.
- Document the type of media and the location(s) where the data will be stored and who is responsible.
- Document how and where the data will be backed up and who is responsible.
- Document any access controls for data and/or data transfers that need to be secured and how these controls will be applied.

- Indicate which datasets used or generated will be shared
- Indicate whether any datasets are in proprietary formats and if they will be converted to a non-proprietary format for sharing.
- Determine the audience who will use the datasets.
- Determine acknowledgement protocol
- Determine sharing protocols: open access or release upon request.
- Account for any delay in the accessibility of your data after your research is done.
- Explain details of any embargo periods.
- Determine how long data will be kept beyond the life of the project
- Will a third-party service be used to archive or release data?
- Set a release date to share the data.
- Describe any restrictions on use, sharing, repurposing, etc. of datasets
- Include costs of any additional resources (third-party services, etc.) in budget.
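Setting a release date from an embargo period, as the list above asks, is date arithmetic that is easy to get wrong at month ends. The sketch below assumes the embargo is counted in whole months from the project end date and clamps to the last day of a shorter month; both conventions, and the function names, are assumptions for illustration rather than funder rules.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Shift a date by whole months, clamping to the last day of the target month."""
    index = start.year * 12 + (start.month - 1) + months
    year, month_zero_based = divmod(index, 12)
    month = month_zero_based + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def release_date(project_end: date, embargo_months: int) -> date:
    """Date on which data become available after an embargo counted from project end."""
    if embargo_months < 0:
        raise ValueError("embargo_months must be non-negative")
    return add_months(project_end, embargo_months)


def is_released(project_end: date, embargo_months: int, today: date) -> bool:
    """True once the embargo has elapsed."""
    return today >= release_date(project_end, embargo_months)


# Hypothetical project ending 31 August 2015 with a 6-month embargo
planned_release = release_date(date(2015, 8, 31), 6)
```

In this example the release falls on 29 February 2016, because February has no 31st day and 2016 is a leap year.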

- Under the auspices of the PI
- Data archive: A place where machine-readable data are acquired, manipulated, documented, and finally distributed to the scientific community for further analysis.
- Data enclave: A controlled, secure environment in which eligible researchers can perform analyses using restricted data* resources.
- Mixed mode sharing.

*Restricted Data - datasets that cannot be distributed to the general public because of, for example, participant confidentiality concerns, third-party licensing or use agreements, or national security considerations.

- Builds upon storage by taking additional steps toward preserving digital files.
- Safeguards data against file corruption and failure of storage media.
- Includes updating from obsolete formats.
- Often includes enhanced discovery and access of datasets.
- Includes a preservation strategy and disaster recovery plan.
- Often handled by a third-party archiving service or data repository.
- Check university guidelines.
- Include deposit fees in budget.
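One common way to detect the file corruption mentioned above is to record a checksum for every file at deposit time and recompute it later. The sketch below uses SHA-256 from Python's standard library; the manifest layout (relative path mapped to hex digest) and the function names are illustrative choices, not a repository requirement.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file under root (by relative POSIX path) to its checksum."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(root: Path, manifest: Dict[str, str]) -> List[str]:
    """Return manifest entries that are now missing or whose checksum differs."""
    problems = []
    for relative, expected in manifest.items():
        target = root / relative
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(relative)
    return sorted(problems)
```

A typical routine would call `build_manifest` once when the data are archived, store the result alongside the data, and run `changed_files` on a schedule; an empty list means every recorded file is still present and unchanged.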

Example 1

The proposed research will involve a small sample (less than 20 subjects) recruited from clinical facilities in the New York City area with Williams syndrome. This rare craniofacial disorder is associated with distinguishing facial features, as well as mental retardation. Even with the removal of all identifiers, we believe that it would be difficult if not impossible to protect the identities of subjects given the physical characteristics of subjects, the type of clinical data (including imaging) that we will be collecting, and the relatively restricted area from which we are recruiting subjects. Therefore, we are not planning to share the data.

Example 2

The proposed research will include data from approximately 500 subjects being screened for three bacterial sexually transmitted diseases (STDs) at an inner city STD clinic. The final dataset will include self-reported demographic and behavioral data from interviews with the subjects and laboratory data from urine specimens provided. Because the STDs being studied are reportable diseases, we will be collecting identifying information. Even though the final dataset will be stripped of identifiers prior to release for sharing, we believe that there remains the possibility of deductive disclosure of subjects with unusual characteristics. Thus, we will make the data and associated documentation available to users only under a data-sharing agreement that provides for: (1) a commitment to using the data only for research purposes and not to identify any individual participant; (2) a commitment to securing the data using appropriate computer technology; and (3) a commitment to destroying or returning the data after analyses are completed.

Example 3

This application requests support to collect public-use data from a survey of more than 22,000 Americans over the age of 50 every 2 years. Data products from this study will be made available without cost to researchers and analysts at https://ssl.isr.umich.edu/hrs/. User registration is required in order to access or download files. As part of the registration process, users must agree to the conditions of use governing access to the public release data, including restrictions against attempting to identify study participants, destruction
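Example 2 above worries about deductive disclosure of subjects with unusual characteristics even after direct identifiers are removed. One simple screen for that risk, which the examples themselves do not prescribe, is a k-anonymity-style count: tally how many records share each combination of indirectly identifying fields and flag combinations held by fewer than k records. The field names, the sample records, and the threshold below are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Mapping, Sequence, Tuple


def rare_combinations(
    records: Sequence[Mapping[str, str]],
    quasi_identifiers: List[str],
    k: int,
) -> Dict[Tuple[str, ...], int]:
    """Return quasi-identifier combinations shared by fewer than k records."""
    if k < 1:
        raise ValueError("k must be at least 1")
    counts = Counter(
        tuple(record[field] for field in quasi_identifiers) for record in records
    )
    return {combo: n for combo, n in counts.items() if n < k}


# Hypothetical de-identified records: no names, but indirectly identifying fields remain.
sample = [
    {"age_band": "20-29", "zip3": "100", "sex": "F"},
    {"age_band": "20-29", "zip3": "100", "sex": "F"},
    {"age_band": "20-29", "zip3": "100", "sex": "F"},
    {"age_band": "70-79", "zip3": "104", "sex": "M"},
]
flagged = rare_combinations(sample, ["age_band", "zip3", "sex"], k=3)
```

Here the single 70-79 record is flagged because no other record shares its combination of values. A flagged combination does not prove a subject is identifiable, and an empty result does not prove safety; the count is only a starting point for deciding whether coarsening, suppression, or a restricted-access agreement is needed.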

- It is acceptable to state in the DMP that the project is not anticipated to generate data or samples that require management and/or sharing.
- PIs should note that the statement will be subject to peer review.
- If data you generate is owned by your institution, the data access plan must address the institutional strategy for providing access to relevant data and supporting materials.
- Open-access publishing is not addressed in the implementation of the data management plan requirement.

Types of Activities Covered

- Documenting
- Preparing
- Publishing
- Disseminating
- Sharing research findings and supporting material
- Data sharing and archiving

NOTE: If the data have been collected already, a competitive or administrative supplement may be available.

- Reports
- Reprints
- Page charges or other journal costs
  - Does not cover costs for prior or early publication
- Illustrations
- Cleanup
- Documentation
- Storage and indexing of data and databases
- Development, documentation and debugging of software
- Storage, preservation, documentation, indexing, etc., of physical specimens, collections or fabricated items.

- DMPTool (Argonne Laboratories)
- NIH
  - Data Sharing Policy and Implementation Guidance
  - 8.2 Availability of Research Results
- NSF
  - NSF Data Sharing Policy
  - NSF Data Management Plan Requirements
  - NSF Social, Behavioral and Economic (SBE) Directorate-wide Guidance
- ICPSR
  - Effective Data Management
- Databib
  - Registry of Research Data Repositories
- DataONE
  - Best Practices