ORGANIZING AND STRUCTURING DATA FOR DIGITAL PROJECTS Suzanne Huffman Digital Resources Librarian Simpson Library.


1 ORGANIZING AND STRUCTURING DATA FOR DIGITAL PROJECTS Suzanne Huffman Digital Resources Librarian Simpson Library

2 DATA AND DIGITAL PROJECTS

3 Nothing really important is ever headlined “Here is some data. Hope you find something interesting.” Annotation is critical. Editing is critical. -Amanda Cox, New York Times Graphics Editor http://www.slideshare.net/openjournalism/amanda-cox-visualizing-data-at-the-new-york-times

4 The importance of context http://www.nejm.org/doi/full/10.1056/NEJMp1402114?query=TOC&

5 Add value to your data
- Regardless of website functionality, annotation and guidance are important
- Look at other digital projects and test out similar sites to see what types of discovery and analysis activities you intuitively want to do with similar types of data
- Follow web design methodology by creating user stories and acting them out with mockups, wireframes, or simple spreadsheets

6 ORGANIZING AND STRUCTURING DATA

7 Metadata
Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource. Metadata is often called data about data or information about information.
http://www.niso.org/publications/press/UnderstandingMetadata.pdf

8 http://archive.umw.edu:8080/vital/access/manager/Repository/umw:438

9 The metadata universe http://www.dlib.indiana.edu/~jenlrile/metadatamap/seeingstandards.pdf

10 Choosing metadata parameters
- Ask yourself why you are collecting the information you want to collect (perform a needs assessment)
- Focus on the outcomes and analysis tasks you want your site’s users to be able to perform with your data
- Depending on your audience, don’t assume users know what each metadata field or category means
- Choose data fields and parameters and do a few test runs before analyzing the data to determine if you need to add, change, or edit any of your fields

11 What is good metadata?
According to Understanding Metadata by the National Information Standards Organization, good metadata…
- Is appropriate to the materials in the collection, the users of the collection, and the intended, current, and likely use of the digital object
- Supports interoperability
- Uses standard controlled vocabularies to reflect the what, where, when, and who of the content
- Includes a clear statement on the conditions and terms of use for the digital object
- Has the qualities of archivability, persistence, and unique identification, and is authoritative and verifiable
- Supports the long-term management of objects in collections

12 Choose and use standard terminology
- Use a controlled vocabulary that provides preferred keywords and terminology for specific items
- Create a data dictionary and be consistent in applying it
- Example data dictionary entry: Dates are displayed in the yyyy-mm-dd format; e.g., March 15, 2015 would appear as 2015-03-15
- Consistent terminology helps prevent inconsistencies in data entry and analysis
- Example of an inconsistency: "T", "temp", and "t" are all used interchangeably within a single dataset to refer to temperature measurements
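The two examples on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `PREFERRED_TERMS` table and both function names are hypothetical, and the date parser assumes English month names in the "March 15, 2015" style.

```python
from datetime import datetime

# Hypothetical data dictionary: each preferred term lists the variants it replaces.
PREFERRED_TERMS = {
    "temperature": {"t", "temp", "temperature"},
}


def normalize_term(label: str) -> str:
    """Map a field label onto its preferred term, or return it trimmed but unchanged."""
    key = label.strip().lower()
    for preferred, variants in PREFERRED_TERMS.items():
        if key in variants:
            return preferred
    return label.strip()


def normalize_date(text: str, input_format: str = "%B %d, %Y") -> str:
    """Convert a date such as 'March 15, 2015' to the yyyy-mm-dd format.

    The default input format is an assumption; %B depends on the locale.
    """
    return datetime.strptime(text, input_format).strftime("%Y-%m-%d")
```

Running every label and date through functions like these before analysis keeps a dataset aligned with its data dictionary even when several people enter data.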

13 Metadata dos
- Select your keywords wisely and think about the many ways someone might search for your data
- Use your data dictionary whenever possible to create keywords to establish a controlled vocabulary
- Use descriptive and clear writing
- Ensure that all data fields are independent and that they could exist on their own

14 Metadata don’ts
- Do not use jargon; define technical terms and acronyms and put them in your data dictionary
- Remember that a computer will read the information in the metadata record, so do not use tabs, indents, or special characters like ! @ # % { } | / \ ~ that may be misunderstood
- Do not copy and paste content from Word documents or other sources into your metadata record (use a text editor as a middle step to prevent unnecessary characters and errors being introduced)
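A small check can catch the characters named on this slide before a record is saved. The sketch below is an assumption about how such a check might look; `RISKY_CHARACTERS` and `find_risky_characters` are hypothetical names, and fields that legitimately contain these characters (a URL contains "/") would need to be exempted.

```python
# Characters the slide lists as risky in metadata records, plus the tab character.
RISKY_CHARACTERS = set("!@#%{}|/\\~\t")


def find_risky_characters(value: str) -> list[str]:
    """Return the risky characters found in value, in order of first appearance."""
    found = []
    for char in value:
        if char in RISKY_CHARACTERS and char not in found:
            found.append(char)
    return found
```

An empty list means the value is free of the listed characters; a non-empty list tells the person entering data exactly what to remove.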

15 Metadata is structured information Example Dublin Core record in XML
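The record shown on the original slide is not preserved in this transcript. As a stand-in, the sketch below builds a small Dublin Core record in XML with Python's standard library. The `metadata` wrapper element, the function name, and all field values are placeholders chosen for illustration; only the `dc` namespace URI and the element names (title, creator, date, type) come from the Dublin Core element set.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def build_dublin_core_record(fields: dict[str, str]) -> str:
    """Serialize a dict of Dublin Core element names and values as an XML string."""
    root = ET.Element("metadata")  # wrapper element name is an assumption
    for name, value in fields.items():
        element = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        element.text = value
    return ET.tostring(root, encoding="unicode")


# Placeholder values, not taken from any real collection.
record = build_dublin_core_record({
    "title": "Sample letter",
    "creator": "Doe, Jane",
    "date": "2015-03-15",
    "type": "Text",
})
print(record)
```

Each field becomes one element such as `<dc:title>`, which is what makes the record readable by harvesting tools as well as by people.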

16 Data structure
Structuring your data is important to ensure your site functions well and that the dataset can be used in a variety of ways
Ways to structure data:
- For Excel or Google spreadsheets, save your data as CSV files in plain text format
- XML documents can be easily created through online data-entry forms and contain your metadata within a structured framework
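For data that does not start in a spreadsheet, the same plain-text CSV structure can be produced with Python's `csv` module. This is a minimal sketch; the function name and the sample row are hypothetical.

```python
import csv
import io


def rows_to_csv(rows: list[dict[str, str]], fieldnames: list[str]) -> str:
    """Write rows as CSV text with a header row, quoting values where needed."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder row; the comma inside the title shows why a CSV writer is safer
# than joining values by hand.
csv_text = rows_to_csv(
    [{"title": "Letter, 1862", "date": "1862-04-01"}],
    ["title", "date"],
)
print(csv_text)
```

To save to disk instead, the same writer can be pointed at a file opened with `open(path, "w", newline="", encoding="utf-8")`.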

17 Data best practices
- Make sure your data is portable
- Save it in an additional location outside your site in a machine-readable, non-proprietary format
- Portable data is flexible, sharable, and can be harvested by a variety of tools for usage in future projects

18 Quality assurance and control
- Restrict what information can be entered into the dataset
- Limit the use of free text fields for metadata
- Use lookup tables or drop-down menus for data entry
- Use validation tools
- Do manual review
- Clean up and normalize messy data with tools like OpenRefine
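A lookup table and a format check are easy to sketch in Python. Everything named here is an assumption for illustration: the allowed values in `ALLOWED_TYPES`, the field names `type` and `date`, and the function itself. The date check only tests the yyyy-mm-dd shape, not whether the date exists on the calendar.

```python
import re

# Hypothetical lookup table of the only values accepted in the "type" field.
ALLOWED_TYPES = {"Text", "Image", "Sound"}
# Shape of a yyyy-mm-dd date; this does not verify the date is a real one.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def validate_record(record: dict[str, str]) -> list[str]:
    """Return a list of problems found in a record; an empty list means it passed."""
    problems = []
    if record.get("type") not in ALLOWED_TYPES:
        problems.append(f"type must be one of {sorted(ALLOWED_TYPES)}")
    if not DATE_PATTERN.fullmatch(record.get("date", "")):
        problems.append("date must use the yyyy-mm-dd format")
    return problems
```

Automated checks like this complement manual review rather than replace it: they catch mechanical errors so that reviewers can concentrate on meaning.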

19 DATA MANAGEMENT

20 What is data management? Data management refers to all aspects of creating, housing, delivering, maintaining, archiving, and preserving data. A data management plan accounts for every activity within the data life cycle. https://www.dataone.org/best-practices

21 Contents of a data management plan (DMP)
- Data Type and Format
- Data Storage
- Data Standards
- Data Security
- Data Sharing
- Long-term Access
Check out VCU Libraries’ Research Data Management Guide at http://guides.library.vcu.edu/c.php?g=47977&p=300081

22 Data citation and preservation
Citation
- Dataset citations should have (at a minimum): Creator (PublicationYear): Title. Publisher. Identifier.
- The Identifier could be a DOI or just the website’s URL
Preservation
- Good documentation on data provenance (the origin and history of a dataset) is crucial.
- If data cannot be recreated or if it is costly to reproduce, it should be saved.
- Datasets that have significant long-term value may be contributed to a repository for preservation.
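The minimum citation pattern on this slide can be expressed as a one-line template. The function name and the sample values below are hypothetical placeholders, not a real dataset.

```python
def format_dataset_citation(
    creator: str, year: int, title: str, publisher: str, identifier: str
) -> str:
    """Assemble 'Creator (PublicationYear): Title. Publisher. Identifier'."""
    return f"{creator} ({year}): {title}. {publisher}. {identifier}"


# Placeholder values for illustration only.
citation = format_dataset_citation(
    "Doe, Jane",
    2015,
    "Sample Letters Dataset",
    "Example Library",
    "https://example.org/dataset",
)
print(citation)
```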

23 Data repositories These repositories can be used to find data for reuse or to deposit your research data for preservation and sharing:

24 Questions? Comments? Thank you! shuffman@umw.edu 540-654-1756 Please contact me if you need assistance with managing and organizing data in your research or teaching projects.

25 References and Resources
http://www.slideshare.net/openjournalism/amanda-cox-visualizing-data-at-the-new-york-times
http://www.nejm.org/doi/full/10.1056/NEJMp1402114?query=TOC&
http://www.niso.org/publications/press/UnderstandingMetadata.pdf
http://www.dlib.indiana.edu/~jenlrile/metadatamap/seeingstandards.pdf
https://www.dataone.org/sites/all/documents/DataONE_BP_Primer_020212.pdf
http://guides.library.vcu.edu/c.php?g=47977&p=300081

