CC&E Best Data Management Practices, April 19, 2015 Please take the Workshop Survey https://www.surveymonkey.com/s/update 1.

1 CC&E Best Data Management Practices, April 19, 2015
Please take the Workshop Survey https://www.surveymonkey.com/s/update

2 Data Management Practices for Early Career Scientists: Closing
Robert Cook
ORNL Distributed Active Archive Center
Environmental Sciences Division
Oak Ridge National Laboratory
Oak Ridge, TN
cookrb@ornl.gov
CC&E Joint Science Workshop
College Park, MD
April 19, 2015

3 Plan for archiving data
“Begin with the end in mind”
Identified the Data Center
Collaborated with data center during project
Communicated:
  – Volume and number of files
  – Special needs
  – Delivery dates

4 Followed Fundamental Data Practices
Define the contents of your data files
Define the variables
Use consistent data organization
Use stable file formats
Assign descriptive file names
Preserve processing information
Perform basic quality assurance
Provide documentation
Protect your data
Preserve your data

5 What to submit to the archive?
Well-structured data files, with variables, units, and fill values well-defined
Document that describes the data set
Additional information
  – Article written with the data set
  – Files that describe project, protocols, or field sites (photographs)
  – Material from Project Web site or Wiki
Basic description of the data (15 questions)
  – http://daac.ornl.gov/PI/questions.shtml
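As a rough illustration of "variables, units, and fill values well-defined", the sketch below compares a data file's header against a small data dictionary. The column names, the dictionary layout, and the fill values (NA, -9999) are assumptions made up for this example; they are not requirements stated in the slides.

```python
import csv
import io

# Hypothetical data dictionary: one entry per variable in the data file.
DATA_DICTIONARY = {
    "site_id": {"units": "none", "fill_value": "NA", "description": "Field site code"},
    "date": {"units": "YYYY-MM-DD", "fill_value": "NA", "description": "Sampling date"},
    "canopy_height": {"units": "m", "fill_value": "-9999", "description": "Canopy height"},
}

REQUIRED_KEYS = ("units", "fill_value", "description")


def check_header(header, dictionary):
    """Return a list of problems found when comparing a header to the dictionary."""
    problems = []
    for name in header:
        entry = dictionary.get(name)
        if entry is None:
            problems.append(f"{name}: not defined in data dictionary")
            continue
        for key in REQUIRED_KEYS:
            if not entry.get(key):
                problems.append(f"{name}: missing {key}")
    for name in dictionary:
        if name not in header:
            problems.append(f"{name}: defined but not present in file")
    return problems


# Illustrative file content (invented for this example).
SAMPLE_CSV = "site_id,date,canopy_height,lai\nA1,2014-06-01,12.5,3.1\n"

header = next(csv.reader(io.StringIO(SAMPLE_CSV)))
problems = check_header(header, DATA_DICTIONARY)
for problem in problems:
    print(problem)
```

Here the undocumented "lai" column is reported, which is the kind of gap worth closing before the files go to the data center.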

6 Issues with data sets received
Descriptive information about data files and content is incomplete
  – Data description and collection method
  – Field sites
  – Quality / uncertainty of data
Inconsistencies with publication
Files uploaded are not identified / described
Variable names are not defined or vague
  – “Height” unclear, change to “canopy_height”
  – Perhaps append the method/sensor for added clarity
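The variable-name advice on this slide could be applied along the following lines. This is a minimal sketch: only the "Height" to "canopy_height" rename comes from the slide; the "Temp" entry, the "lidar" suffix, and the function names are hypothetical.

```python
# Mapping from vague column names to descriptive ones.
# "Height" -> "canopy_height" is the slide's example; "Temp" is a made-up extra.
RENAMES = {
    "Height": "canopy_height",
    "Temp": "air_temperature",
}


def descriptive_name(name, renames, method=None):
    """Return a descriptive variable name, optionally suffixed with the method or sensor."""
    new_name = renames.get(name, name)
    if method:
        new_name = f"{new_name}_{method}"
    return new_name


def rename_header(header, renames, methods=None):
    """Apply descriptive_name to every column; methods maps original names to a suffix."""
    methods = methods or {}
    return [descriptive_name(col, renames, methods.get(col)) for col in header]


header = ["site_id", "Height", "Temp"]
new_header = rename_header(header, RENAMES, methods={"Height": "lidar"})
print(new_header)
```

Keeping the mapping in one place also documents what each original name meant, which helps when the data set is compared against the publication.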

7 Information about Data (15 questions)
Information About Your Data Set
1. Have you looked at our Best Data Management Practices?
2. Who produced this data set?
3. What agency and program funded the project?
   What awards funded this project? (comma separate multiple awards)
Data Set Description
4. Provide a title for your data set. (maximum 84 characters)
   What type of data does your data set contain?
   What does the data set describe? (2-3 sentences)
5. What parameters did you measure, derive, or generate? (comma separated, limit to ten)
6. Have you analyzed the uncertainty in your data?
   Briefly describe your uncertainty analysis. (2-3 sentences)
   Will the uncertainty estimates be included with your data set?

8 Information about Data (cont)
Temporal and Spatial Characteristics
7. What date range does the data cover? (YYYY-MM-DD)
   What is a representative sampling frequency or temporal resolution for your data?
8. Where were the data collected/generated?
9. Which of the following best describes the spatial nature of your data? (single point, multiple points, transect, grid, polygon, n/a)
10. What is a representative spatial resolution for these data?
11. Provide a bounding box around your data.
Data Preparation and Delivery
12. What are the formats of your data files?
    How many data files does your product contain?
    What is the total disk volume of your data set? (MB)
13. Is this data set final, unrestricted, and available for release?
    What are the reasons to restrict access to the data set?
14. Has this data set been described and used in a published paper?
    If so, provide a DOI or upload a digital copy of the manuscript with the data set.
15. Are the data and documentation posted on a public server?
    If so, provide the URL.
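Several of the questions above (date range, bounding box, file count, total volume) can be answered directly from the data. The sketch below shows one possible way, assuming point records given as (ISO date, latitude, longitude) tuples and a data set stored under a single directory. The function names and sample coordinates are invented for illustration, and 1 MB is taken as 1,000,000 bytes.

```python
from datetime import date
from pathlib import Path


def summarize_records(records):
    """Return the date range and bounding box for (iso_date, lat, lon) records."""
    dates = [date.fromisoformat(d) for d, _, _ in records]
    lats = [lat for _, lat, _ in records]
    lons = [lon for _, _, lon in records]
    return {
        "start_date": min(dates).isoformat(),
        "end_date": max(dates).isoformat(),
        "south": min(lats),
        "north": max(lats),
        "west": min(lons),
        "east": max(lons),
    }


def summarize_files(directory):
    """Return the number of regular files under directory and their total size in MB."""
    files = [p for p in Path(directory).rglob("*") if p.is_file()]
    total_bytes = sum(p.stat().st_size for p in files)
    return {"file_count": len(files), "total_mb": total_bytes / 1_000_000}


# Made-up sample points, not real measurements.
records = [
    ("2014-06-01", 35.93, -84.31),
    ("2014-07-15", 35.96, -84.28),
    ("2013-09-30", 35.91, -84.35),
]
summary = summarize_records(records)
print(summary)
```

The simple min/max bounding box is only appropriate for data that do not cross the antimeridian (180° longitude); such data sets would need separate handling.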

9 Data Center: Stewardship and Archive Functions
Ingest
  – perform QA checks
  – compile project-provided metadata
  – generate additional metadata
  – convert to archival file formats
Metadata / Documentation
  – prepare final metadata record and documentation
Archive / Release
  – generate citation and DOI (digital object identifier)
Exploration and Distribution
  – provide tools to explore, access, and extract data
Post-Project Data Support
  – provide long-term secure archiving
  – serve as a buffer between end users and PIs
  – provide usage statistics
Stewardship
  – security, disaster recovery
  – migration to new computer systems

10 Workshop Goal
Provide fundamental data management practices that investigators should perform during the course of data collection.
To improve the usability of data sets for:
  – You
  – Collaborators
  – People outside your project
By following the practices taught in this workshop, your data will be less prone to error, more efficiently structured for analysis, and more readily understandable for any future research.

11 Please take the Workshop Survey
https://www.surveymonkey.com/r/72MJWGF

12 Workshop Sponsors

