Data Management: Documentation & Metadata


1 Data Management: Documentation & Metadata
General Overview

2 Research Life Cycle
[Diagram: the Research Life Cycle shown alongside the Data Life Cycle. Labels: Proposal Planning, Writing, Start Up, Data Collection, Analysis, Sharing, Discovery, End of Project, Archive, Deposit, Re-Use, Re-Purpose.]

It is important to begin documenting your data at the very beginning of your research project and to continue throughout the project. Doing so makes data documentation easier and reduces the likelihood that you will forget aspects of your data later in the project. Don't wait until the end to start documenting your research project and its data!

For the data to be used properly once they have been archived, they must be documented. Data documentation (otherwise known as metadata) enables you to understand the data in detail, and enables others to find, use, and properly cite them.

It's all about re-use, for you or someone else:
- When you provide data to someone else, what types of information would you want to include with the data?
- When you receive a dataset from an external source, what types of details do you want to know about the data?

Reproducibility! (Dryad) Submitters should aim to provide sufficient data and descriptive information such that another researcher would be able to evaluate the findings described in the publication. This will generally include any data that are used in statistical tests, as well as the individual data points behind published figures and tables.

3 Data Documentation (Metadata)
- Informal or formal methods to describe your data
- Important if you want to reuse your own data in the future
- Also necessary when sharing your data

Metadata and associated documentation are crucial for any potential use or reuse of data; no one can responsibly re-use or interpret data without accompanying standards-compliant metadata or documentation. Metadata describe your data so that others can understand what your data set represents; they are thought of as "data about the data" or the "who, what, where, when, and why" of the data.

Metadata should be written from the standpoint of a reader who is unfamiliar with your project, methods, or observations. What does a user, 20 years into the future, need to know to use your data properly?

Informal: something like a ReadMe file. Formal: a structured format such as a data dictionary, codebook, or metadata record. Different disciplines may have their own format standards. Informal documentation is better than nothing.
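As a minimal sketch of the informal end of that range, the Python snippet below assembles a plain-text ReadMe that answers the "who, what, where, when, and why" of a data set and refuses to produce one if any of the five is left blank. The function name `build_readme`, the field names, and the example values are all hypothetical and do not come from any metadata standard.

```python
# Hypothetical helper: the field names and example values are illustrative
# assumptions, not part of any formal metadata standard.
REQUIRED_FIELDS = ("who", "what", "where", "when", "why")


def build_readme(title, answers):
    """Return ReadMe text; raise ValueError if any question is unanswered."""
    missing = [f for f in REQUIRED_FIELDS if not answers.get(f, "").strip()]
    if missing:
        raise ValueError("unanswered: " + ", ".join(missing))
    lines = [title, "=" * len(title), ""]
    for field in REQUIRED_FIELDS:
        lines.append(f"{field.upper()}: {answers[field].strip()}")
    return "\n".join(lines) + "\n"


example = build_readme(
    "Stream temperature survey",
    {
        "who": "Collected by the (hypothetical) Example Lab field team",
        "what": "Hourly water temperature readings in degrees Celsius",
        "where": "Three sampling sites on a single stream",
        "when": "One summer field season",
        "why": "To document seasonal temperature variation",
    },
)
print(example)
```

Writing the file out alongside the data (for example as `README.txt`) is left to the reader; the point is only that even informal documentation can be checked for completeness.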

4 Working with Data
- When you provide data to someone else, what types of information would you want to include with the data?
- When you receive a dataset from an external source, what types of details do you want to know about the data?

Metadata contains information about the dataset that allows it to be understood when shared among scientists.

When sharing data, some considerations include:
- why the data were created;
- what limitations, if any, the data have;
- what the data mean; and
- who should be cited if someone publishes something that uses the data.

When receiving data from an external source, consider:
- What are the data gaps?
- What processes were used for creating the current data?
- Are there any fees associated with the data?
- At what scale were the data created?
- What do the values in the tables mean?
- What software do I need in order to read the data?
- What projection are the data in?
- Can I give these data to someone else?

Metadata contains information about a data set, in a standardized format, such that it can be understood and re-used.

5 Critical roles of data documentation
- Data Use: to know enough details about how the data were collected and stored
- Data Discovery: to be able to identify important data sets
- Data Retrieval: to know how and where to access data
- Data Archiving: data can grow more valuable with time, but only if the critical information required to retrieve and interpret the data remains available

From: EML Best Practices for LTER Sites, Oct. 2004

Level 1, Identification (locate): minimum content for adequate data set discovery in a general cataloging system or repository. Elements: title, creator, contact, publisher, pubDate, keywords, abstract (recommended), dataset/distribution (i.e. URL for general dataset information).

Level 2, Discovery: Level 1 content plus coverage information to support targeted searches. Adds: Geographic Coverage, Taxonomic Coverage, Temporal Coverage.

Level 3, Evaluation: Level 2 content plus data set details to enable end-user evaluation of the methodology and data entities. Adds: Intellectual Rights, project, methods, dataTable/entityGroup, dataTable/attributes.

Level 4, Access: Level 3 content plus data access details to support automated data retrieval. Adds: access, physical.

Level 5, Integration: Level 4 content plus complete attribute and quality control details to support computer-assisted data integration and re-sampling. Adds: Attribute List (full descriptions), Constraint, Quality Control.
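Because each level builds on the one before it, the scheme can be read as a cumulative checklist. The sketch below is one hypothetical way to express that in Python: it takes a metadata record as a plain dictionary and reports the highest level whose elements are all present, along with what is missing for the next level. The dictionary keys simply reuse the element names as written on the slide; they are not real EML document paths, and `highest_level` is an invented name.

```python
# Illustrative only: keys reuse the slide's element names, not EML paths.
# "abstract" is marked "recommended" on the slide, so it is not required here.
LEVELS = [
    ("Identification", {"title", "creator", "contact", "publisher",
                        "pubDate", "keywords", "dataset/distribution"}),
    ("Discovery", {"Geographic Coverage", "Taxonomic Coverage",
                   "Temporal Coverage"}),
    ("Evaluation", {"Intellectual Rights", "project", "methods",
                    "dataTable/entityGroup", "dataTable/attributes"}),
    ("Access", {"access", "physical"}),
    ("Integration", {"Attribute List", "Constraint", "Quality Control"}),
]


def highest_level(record):
    """Return (level_number, level_name, elements_missing_for_next_level)."""
    present = {key for key, value in record.items() if value}
    reached = (0, "None")
    for number, (name, elements) in enumerate(LEVELS, start=1):
        missing = sorted(elements - present)
        if missing:
            # Levels are cumulative, so stop at the first incomplete one.
            return reached + (missing,)
        reached = (number, name)
    return reached + ([],)


record = {
    "title": "Example data set",
    "creator": "A. Researcher",
    "contact": "contact@example.org",
    "publisher": "Example Repository",
    "pubDate": "2004",
    "keywords": ["example"],
    "dataset/distribution": "https://example.org/dataset",
    "Geographic Coverage": "Example watershed",
}
print(highest_level(record))
```

A record like the one above satisfies Level 1 but not Level 2, so the function returns the two coverage elements that still need to be supplied.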

6 Elements of Documentation
Good data documentation answers these basic questions:
- Who created the data? Who maintains them?
- What are the data about? What is the content of the data? The structure?
- When were the data collected? When were they published?
- Where were the data collected (geographic location)?
- Why were the data created?
- How were the data created, produced, and analyzed? How should the data be cited?

7 Documentation throughout your research
Variable or Item Level:
- Labels, codes, classifications
- Missing values (and how they are represented)

File or Dataset Level:
- Inventory of data files
- Relationship between those files
- Records, cases, etc.

Project or Study Level:
- What the study set out to do; research questions
- How it contributes new knowledge to the field
- Methodologies used, instruments and measures

Sources: UK Data Service; MANTRA

Project level: A complete academic thesis normally contains this information in detail, but a published article may not. If a dataset is shared, a detailed technical report will need to be included for the user to understand how the data were collected and processed. You should also provide a sample bibliographic citation to indicate how you would like secondary users of your data to cite it in any publications.
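Variable-level documentation (labels, codes, and how missing values are represented) is the part most easily kept in machine-readable form. The Python sketch below shows one hypothetical shape for such a codebook and how it could be used to decode a raw record. The variable names, labels, codes, and missing-value markers are invented for illustration and are not taken from the UK Data Service or MANTRA guidance.

```python
# Hypothetical codebook: every name, label, and code below is invented.
CODEBOOK = {
    "sex": {
        "label": "Sex of respondent",
        "codes": {1: "female", 2: "male"},
        "missing": {-9},
    },
    "age": {
        "label": "Age in years",
        "codes": {},          # numeric variable, no value labels
        "missing": {-9, -8},
    },
}


def decode(variable, value):
    """Translate a raw value; None marks a documented missing value."""
    entry = CODEBOOK[variable]
    if value in entry["missing"]:
        return None
    # Values without a code label (e.g. numeric ages) pass through unchanged.
    return entry["codes"].get(value, value)


row = {"sex": 2, "age": -9}
decoded = {name: decode(name, value) for name, value in row.items()}
print(decoded)
```

Keeping the missing-value codes in the codebook, rather than only in someone's memory, is what lets a later user tell a real value from a placeholder.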

