1 Research Data Management System project: Best Practices in Research Data Management* *Adaptation of the NECDMC

2 Today’s Objectives Why manage data? Identify common data management issues. Best practices for managing data. Support: how the library and TTS can help you and your lab.

3 What is Data? “Research data, unlike other types of information, is collected, observed, or created, for purposes of analysis to produce original research results” (University of Edinburgh). Types of research data: observational, experimental, simulation, and derived or compiled.

4 Why Should I Manage it? Transparency & Integrity Compliance

5 Science & Personal Benefits Who uses your data now? Who COULD use your data? Shared/Open Data: scientific progress, impact on your career, citation counts.

6 What if I Don’t Consider RDM? Data Sharing and Management Snafu in 3 Short Acts: A data management horror story by Karen Hanson, Alisa Surkis and Karen Yacobucci. http://www.youtube.com/watch?v=N2zK3sAtr-4

7 Seven “Issues” in Research Data Management Responsibility Data Management Plans Records Management File Management File Naming Metadata Backup and Security Ownership and Retention Long Term Planning

8 Issue: Responsibility Best Practices Define roles and assign responsibilities for data management. Identify the skills needed to perform the tasks outlined in the DMP and match them to available staff. Develop training plans for continuity. Assign responsible parties and monitor results.

9 Issue: Data Management Plans Data Life Cycle: creating data, processing data, analysing data, preserving data, giving access to data, re-using data.

10 Creating a Data Management Plan “the types of data, samples, physical collections, software, curriculum materials, and other materials to be produced in the course of the project; the standards to be used for data and metadata format and content (where existing standards are absent or deemed inadequate, this should be documented along with any proposed solutions or remedies); policies for access and sharing including provisions for appropriate protection of privacy, confidentiality, security, intellectual property, or other rights or requirements; policies and provisions for re-use, re-distribution, and the production of derivatives; and plans for archiving data, samples, and other research products, and for preservation of access to them”

11 Issue: Data Management Plans Best Practices What types of data will be created? Who will own, have access to, and be responsible for managing these data? What equipment and methods will be used to capture and process data? Where will data be stored during and after the project?

12 Issue: File Management Does this sound familiar? Inconsistently labeled files in multiple versions… inside poorly structured folders… stored on multiple media… in multiple locations… and in various formats…

14 Issue: File Naming Best Practices Avoid special characters in a file name. Use capitals or underscores instead of periods or spaces. Use 25 or fewer characters. Use documented and standardized descriptive information about the project or experiment. Use the ISO 8601 date format (YYYYMMDD). Include a version number.
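The naming rules on this slide can be sketched as a small helper. This is only an illustration, not a Tufts standard: the `Project_Description_YYYYMMDD_vNN` pattern, the function names, and the choice to apply the 25-character limit to the name without its extension are all assumptions. The period before the extension is kept, since the operating system needs it.

```python
import re
from datetime import date

MAX_STEM_LENGTH = 25  # assumption: the limit applies to the name without its extension


def clean_part(text: str) -> str:
    """Drop special characters and spaces; mark word breaks with capitals."""
    words = re.split(r"[^A-Za-z0-9]+", text)
    return "".join(word[:1].upper() + word[1:] for word in words if word)


def build_file_name(project: str, description: str, when: date,
                    version: int, extension: str) -> str:
    """Return Project_Description_YYYYMMDD_vNN.ext (hypothetical pattern)."""
    stem = "_".join([
        clean_part(project),
        clean_part(description),
        when.strftime("%Y%m%d"),  # ISO 8601 basic date format
        f"v{version:02d}",        # zero-padded so versions sort in order
    ])
    if len(stem) > MAX_STEM_LENGTH:
        raise ValueError(f"{stem!r} is longer than {MAX_STEM_LENGTH} characters")
    return f"{stem}.{extension}"


print(build_file_name("RDM", "cells", date(2014, 3, 7), 2, "csv"))
# prints RDM_Cells_20140307_v02.csv
```

Raising an error on over-long names, rather than truncating them silently, keeps the descriptive part of the name a deliberate choice.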

15 Issue: File Naming

16 Issue: File Naming Need Help? Contact metadataservices@tufts.edu

17 Issue: Metadata What is Metadata? “Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use or manage an information resource.” (NISO, Understanding Metadata, 2004, p. 1) A love note to the future… How will someone make sense of your data, e.g. the cells and values of your spreadsheet? What universal or disciplinary standards could be used to label your data? How can you describe a data set to make it discoverable?

18 Why Use Metadata? Metadata helps you: find data from other researchers to support your research; use the data that you do find; help other professionals find and use data from your research; use your own data in the future, when you may have forgotten details of the research; and ensure consistency and clarity of data through the use of technical standards and controlled vocabularies.

19 Common metadata fields Title, Creator, Identifier, Subject, Funders, Rights, Access information, Language, Dates, Location, Methodology, Data processing, Sources, List of file names, File formats, File structure, Variable list, Code lists, Versions, Checksums.
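Several of these fields (list of file names, file formats, checksums) can be generated rather than typed by hand. The sketch below is one possible approach, assuming a flat folder of data files; the function names and the JSON layout are hypothetical and do not follow any particular metadata standard.

```python
import hashlib
import json
from pathlib import Path


def describe_dataset(folder: Path, title: str, creator: str) -> dict:
    """Build a minimal metadata record for the files directly inside folder."""
    files = sorted(p for p in folder.iterdir() if p.is_file())
    return {
        "title": title,
        "creator": creator,
        "file_names": [p.name for p in files],
        "file_formats": sorted({p.suffix.lstrip(".").lower() for p in files if p.suffix}),
        # SHA-256 checksums let a later reader confirm the files are unchanged.
        "checksums": {p.name: hashlib.sha256(p.read_bytes()).hexdigest() for p in files},
    }


def save_record(record: dict, destination: Path) -> None:
    """Write the record as human-readable JSON."""
    destination.write_text(json.dumps(record, indent=2, sort_keys=True), encoding="utf-8")
```

Saving the record outside the data folder avoids it being listed as a data file the next time the folder is described.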

21 What else? Standard conventions describe content in a way that ensures units such as date, time, and location are entered consistently among the researchers in your group. Controlled vocabularies are lists of predefined terms that ensure consistency of use and help disambiguate similar concepts. Use the controlled vocabulary that best matches your research. You might create a short list of terms to choose from when populating a specific field. For example, subject terms used in research about biometric sensing might be taken from a controlled vocabulary such as Medical Subject Headings (MeSH).
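A short, lab-defined term list can be enforced with a few lines of code. The sketch below assumes the list lives in a Python set; the three subject terms are illustrative placeholders and have not been checked against MeSH.

```python
# Illustrative placeholder terms; a real list would be copied from the chosen vocabulary.
ALLOWED_SUBJECTS = {
    "Biometric Identification",
    "Biosensing Techniques",
    "Wearable Electronic Devices",
}


def normalize_term(term: str, vocabulary: set) -> str:
    """Return the canonical spelling of term, or raise if it is not in the vocabulary."""
    lookup = {entry.casefold(): entry for entry in vocabulary}
    key = " ".join(term.split()).casefold()  # ignore case and extra whitespace
    if key not in lookup:
        raise ValueError(f"{term!r} is not in the controlled vocabulary")
    return lookup[key]


print(normalize_term("  biosensing   techniques ", ALLOWED_SUBJECTS))
# prints Biosensing Techniques
```

Rejecting unknown terms, instead of storing them as typed, is what keeps spellings consistent across the group.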

22 Issue: Metadata Biology and health-specific metadata examples

23 Issue: Metadata Best Practices – Create a Data Dictionary Describe the contents of data files. Define the parameters and their units. Explain the formats for dates, times, geographic coordinates, and other parameters. Define any coded values. Describe quality flags or qualifying values. Define missing values. Need Help? Contact metadataservices@tufts.edu
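A data dictionary can also be kept in machine-readable form so that data files are checked against it. The following is a minimal sketch: the variable names, codes, and the `-999` missing-value marker are invented for illustration, and every value is assumed to arrive as a string, as it would from a CSV file.

```python
from datetime import datetime

# Hypothetical data dictionary: one entry per variable in the data file.
DATA_DICTIONARY = {
    "sample_id": {"description": "Unique sample identifier"},
    "collected_on": {"description": "Collection date", "format": "YYYYMMDD"},
    "temp_c": {"description": "Incubator temperature",
               "units": "degrees Celsius", "missing": "-999"},
    "qc_flag": {"description": "Quality flag",
                "codes": {"0": "passed", "1": "suspect"}},
}


def check_row(row: dict, dictionary: dict) -> list:
    """Return a list of problems found in one row of string values."""
    problems = []
    for name, value in row.items():
        spec = dictionary.get(name)
        if spec is None:
            problems.append(f"{name}: not defined in the data dictionary")
            continue
        if value == spec.get("missing"):
            continue  # documented missing-value marker, nothing to check
        codes = spec.get("codes")
        if codes is not None and value not in codes:
            problems.append(f"{name}: {value!r} is not a defined code")
        if spec.get("format") == "YYYYMMDD":
            try:
                if not (len(value) == 8 and value.isdigit()):
                    raise ValueError("expected eight digits")
                datetime.strptime(value, "%Y%m%d")  # rejects impossible dates
            except ValueError:
                problems.append(f"{name}: {value!r} is not a valid YYYYMMDD date")
    return problems
```

A row such as `{"sample_id": "S1", "collected_on": "20140230", "qc_flag": "2"}` would be reported with two problems: an impossible date and an undefined code.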

24 Metadata and the ELN Any searchable field in the Agilent or LabArchives ELN technically contains metadata. In both ELNs, you can add tags/keywords to experiments, data files, and image files. In some cases you can create a pre-defined list of tags/keywords to choose from.

25 Agilent Searchable fields:

26 Agilent Funding Source via menu: Project Focus via menu:

27 Agilent Associate metadata with an experiment using keywords:

28 LabArchives Associate metadata with an experiment using tags:

29 LabArchives Associate keyword metadata with an image file:

30 Issue: Backup & Security How often should data be backed up? How many copies of data should you have? Where can you store your data? How much server space can I get?

31 Issue: Backup & Security Best Practices Make 3 copies (original + external/local + external/remote). Keep them geographically distributed (local vs. remote). Use a hard drive (e.g. Vista backup, Mac Time Machine, UNIX rsync) or a tape backup system. Cloud storage: examples of private-sector storage resources include Amazon S3, Elephant Drive, Jungle Disk, Mozy, and Carbonite. Unencrypted storage is ideal because it keeps your data most easily readable by you and others in the future. If you do need to encrypt your data, for example because it involves human subjects, keep passwords and keys on paper (2 copies) and in a PGP (Pretty Good Privacy) encrypted digital file. Uncompressed storage is also ideal. If you need to compress data to conserve space, limit compression to your 3rd backup copy.
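Keeping three copies only helps if the copies actually match the original. One way to check, sketched below, is to compare checksums. The function names are hypothetical, and the sketch assumes every copy is reachable as an ordinary file path (for example, a mounted external or network drive).

```python
import hashlib
from pathlib import Path


def file_checksum(path, chunk_size=1 << 20) -> str:
    """SHA-256 of a file, read in chunks so large data files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()


def verify_copies(original, copies) -> dict:
    """Map each copy's path to True if it exists and matches the original."""
    expected = file_checksum(original)
    return {
        str(copy): Path(copy).is_file() and file_checksum(copy) == expected
        for copy in copies
    }
```

Running such a check on a schedule, and not only right after copying, would also catch a backup that has silently degraded or gone missing.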

32 Issue: Ownership & Retention How long is long enough?

33 Issue: Ownership & Retention Intellectual property policy; IRB data retention policy; funders’ data retention policy; publishers’ data retention policy; federal and state laws.

34 Issue: Long-Term Planning What will happen to my data after my project ends? How can I appraise the value of my data? What are my options for archiving and preserving my data? What are my options for publishing and sharing data?

35 Open vs. Proprietary Formats Used in Research Labs

36 Issue: Long-Term Planning Best Practices When choosing a file format, select a consistent format that can be read well into the future and is independent of changes in applications. Prefer non-proprietary formats based on an open, documented standard, stored unencrypted and uncompressed. ASCII-formatted files will be readable into the future.
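As one illustration of this advice, tabular data can be exported to plain ASCII CSV with nothing but the Python standard library. The function name is hypothetical, and the sketch assumes the data already fit in memory as rows of values.

```python
import csv


def write_ascii_csv(path, header, rows) -> None:
    """Write rows to an uncompressed, unencrypted, ASCII-only CSV file.

    Non-ASCII characters raise UnicodeEncodeError, so they are noticed
    at export time instead of surprising a future reader.
    """
    with open(path, "w", newline="", encoding="ascii") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(rows)
```

For example, `write_ascii_csv("RDM_Cells_20140307_v02.csv", ["sample_id", "temp_c"], [["S1", 37.0]])` produces a two-line text file that any editor or spreadsheet program can open.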

37 Works Cited Lamar Soutter Library, University of Massachusetts Medical School. 2014. “New England Collaborative Data Management Curriculum: Module 1.” http://library.umassmed.edu/necdmc. DataONE. 2013. “Best Practices for Data Management.” http://www.dataone.org/best-practices. MIT Libraries. 2013. “Data Management and Publishing.” MIT. http://libraries.mit.edu/guides/subjects/data-management/index.html. Office of Research Integrity. 2013. “Data Management.” United States Department of Health and Human Services. United States Federal Government. http://ori.hhs.gov/education/products/rcradmin/topics/data/open.shtml. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 United States License.

38 Learn More Data Management Principles & Education: Tufts Libraries Data Management Guide; Research Data MANTRA; DataONE: Best Practices; UK Data Archives; MIT Data Management and Publishing Guide. Data Management Plans: Digital Curation Centre; DMPTool2; DataONE: Data Management Planning.

39 Find Help Data Management Plans and Metadata services: Medford/Somerville Campus: [names/contact info] Boston/Grafton Campus: [librarian names/contact info] Data storage and security services + ELN support: [TTS contact info]

