Controlled Vocabulary Working Group - 2013 PRESENTED BY JOHN PORTER.


1 Controlled Vocabulary Working Group - 2013 PRESENTED BY JOHN PORTER

2 Goal
• Make it easy for researchers to find the data they need from LTER repositories by:
  ◦ Enhancing searches through the use of a thesaurus that provides synonyms, narrower terms and related terms
  ◦ Creating a browseable structure for locating datasets
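As an illustration of how a thesaurus can enhance searches, here is a minimal Python sketch of query expansion. It is an assumption-laden toy, not the Tematres or Metacat implementation: the `Term` class, the `expand_query` function, and the thesaurus entries (including the "standing crop" synonym) are hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Term:
    """One preferred term in a toy thesaurus (hypothetical structure)."""
    synonyms: List[str] = field(default_factory=list)  # non-preferred "use for" terms
    narrower: List[str] = field(default_factory=list)  # more specific preferred terms
    related: List[str] = field(default_factory=list)   # associated preferred terms


# Illustrative entries only; not taken from the LTER Controlled Vocabulary.
THESAURUS: Dict[str, Term] = {
    "biomass": Term(
        synonyms=["standing crop"],
        narrower=["aboveground biomass"],
        related=["primary production"],
    ),
    "aboveground biomass": Term(),
    "primary production": Term(),
}


def expand_query(query: str, thesaurus: Dict[str, Term] = THESAURUS,
                 include_related: bool = False) -> List[str]:
    """Expand a search string into preferred term, synonyms and narrower terms."""
    q = query.strip().lower()

    # If the user typed a synonym, switch to its preferred term.
    preferred = q
    for name, term in thesaurus.items():
        if q in term.synonyms:
            preferred = name
            break

    # Terms outside the vocabulary are searched as typed.
    if preferred not in thesaurus:
        return [q]

    result: List[str] = []
    seen = set()
    stack = [preferred]
    while stack:  # walk down the narrower-term hierarchy
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        result.append(name)
        entry = thesaurus.get(name, Term())
        result.extend(s for s in entry.synonyms if s not in result)
        stack.extend(entry.narrower)

    if include_related:
        result.extend(r for r in thesaurus[preferred].related if r not in result)
    return result
```

A search for "standing crop" would then also match datasets tagged "biomass" or "aboveground biomass"; related terms are left optional because they broaden the search rather than refine it.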

3 2013 Goals
• Enhance term list to incorporate:
  ◦ New terms suggested by sites
  ◦ Frequently searched terms
  ◦ Frequently used terms
  ◦ Terms related to human activities (social science)
  ◦ More synonyms for existing terms that are found in LTER Metadata
    ▪ Needed: Establish clear criteria for evaluating candidate terms
      ▫ Best Practices

4 Goals
• Add definitions for terms in the Controlled Vocabulary
• Create plans for dealing with taxonomic names and places that are currently not part of the existing Controlled Vocabulary

5 Workshop – May 2013
• Pre-Workshop
  ◦ Queried LTER Sites for new candidate terms – Melendez, Henshaw, Vanderbilt
  ◦ Queried existing documents for words not currently in the Controlled Vocabulary – Gastil-Buhl
  ◦ Queried logs for search terms used by Metacat users – Costa
  ◦ Updated Tematres software to the latest version – Porter
  ◦ Identified online sources for definitions – O’Brien, Vanderbilt
  ◦ Investigated taxonomic web services and gazetteers – Gries
    ▪ Note: the group favors using Taxonomic and Geographic Coverage elements rather than keywords for these elements

6 Workshop Participants 2013
• LTER Information Managers
  ◦ Margaret O’Brien, Kristen Vanderbilt, Donald Henshaw and John Porter
• Professional Librarians from UVA:
  ◦ Sherry Lake and Ivey Glendon
    ▪ Added a lot to our discussions
      ▫ “about” vs. “contains” taxonomies
      ▫ our focus is describing what datasets contain
      ▫ “about” is much harder to define for data

7 Workshop Results 2013
• New Terms
  ◦ ~230 terms were suggested by 4 sites
  ◦ ~75 terms were accepted and added to the LTER Vocabulary
    ▪ A reason for rejection was given for each term not added
  ◦ ~25 additional terms were added based on use at 3 or more LTER Sites or 2 or more sites with > 10 datasets
  ◦ Several suggested terms were added as non-preferred (UF) terms
• Definitions
  ◦ 309 new definitions added
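The usage-based rule above (use at 3 or more sites, or at 2 or more sites with > 10 datasets) can be sketched as a small check. This is only a sketch: the slide does not say whether "> 10 datasets" means the total across sites or a per-site count, so the code assumes the total. The function name, parameters and input format are hypothetical.

```python
from typing import Mapping


def qualifies_by_usage(datasets_per_site: Mapping[str, int],
                       min_sites: int = 3,
                       min_sites_heavy: int = 2,
                       heavy_dataset_count: int = 10) -> bool:
    """Return True if a keyword's existing usage justifies adding it.

    datasets_per_site maps a site identifier to the number of that site's
    datasets already tagged with the keyword.
    ASSUMPTION: the "> 10 datasets" threshold is the total across all sites.
    """
    counts = [n for n in datasets_per_site.values() if n > 0]
    sites_using = len(counts)
    total_datasets = sum(counts)

    # Rule 1: used at 3 or more sites.
    if sites_using >= min_sites:
        return True
    # Rule 2: used at 2 or more sites with more than 10 datasets overall.
    return sites_using >= min_sites_heavy and total_datasets > heavy_dataset_count
```

For example, a keyword used in one dataset at each of three sites passes, while a keyword used in fifty datasets at a single site does not, which matches the emphasis on cross-site searching.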

8 Controlled Vocabulary Status
• 710 total preferred terms
• 200 synonyms (“use for” terms)
• 363 total definitions

9 Important Workshop Activities - 2013
• Developed improved Best Practices for identifying additional terms for inclusion (http://im.lternet.edu/VocabBestPractices)
  ◦ Including a table that lays out grounds for rejecting particular words

10
| What | Rationale | Do’s | Problem Abbreviation |
|---|---|---|---|
| Keywords should be applied to a number of datasets across the LTER Network. | Data discovery is the goal, so keywords that find data are most useful. | Propose keywords that are used at several other sites and in numerous datasets | NR - not repeated in multiple datasets |
| Keywords should be used at more than one site | A goal is to enable cross-site searching | Propose keywords that are used at several other sites | A - absent from other sites |
| Avoid proposing stand-alone adjectives | Stand-alone adjectives imply an “of what” question. For example, “aboveground” raises the question “aboveground what?” | Propose nouns or possibly verbs, but not stand-alone adjectives. Preferred terms can include an adjective with an object (e.g., aboveground biomass) | ADJ - stand-alone adjective |
| Be specific | Vague or ill-defined terms are hard to consistently assign | Use specific, unambiguous and well-defined terms | V - Vague |
| Avoid duplicating concepts already in the Controlled Vocabulary | Duplicative keywords lead to inconsistent keyword assignments | Avoid duplication of nearly-equivalent terms | AWE - adequate alternative word exists |
| Keywords should be well-defined | Without definition and context some technical terms may be difficult to assess or place | Provide good definitions | NC - needs clarification or better definition |
| Proposed synonyms should have exact correspondence to the preferred term | Synonyms should not refer to different concepts than the associated preferred term | Select synonyms that are exact matches for the concept described by the preferred term | NS - not a synonym |
| Keywords should be terms that users frequently search on | Keywords that are not searched for by users are not particularly useful. | Propose keywords that are frequently used in searches | NU - not used for search |
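The problem abbreviations in this table lend themselves to a fixed set of codes recorded with each review decision. The following Python sketch shows one possible way to do that. The codes and their meanings come from the table; the `RejectionCode`, `Decision` and `decide` names are hypothetical, and requiring a code for every rejected term is an assumption modeled on the workshop practice of giving a reason for each term not added.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class RejectionCode(Enum):
    """Problem abbreviations from the Best Practices table."""
    NR = "not repeated in multiple datasets"
    A = "absent from other sites"
    ADJ = "stand-alone adjective"
    V = "vague"
    AWE = "adequate alternative word exists"
    NC = "needs clarification or better definition"
    NS = "not a synonym"
    NU = "not used for search"


@dataclass(frozen=True)
class Decision:
    """Outcome of reviewing one candidate term (hypothetical record type)."""
    term: str
    accepted: bool
    code: Optional[RejectionCode] = None


def decide(term: str, accepted: bool,
           code: Optional[RejectionCode] = None) -> Decision:
    """Record a review decision; every rejection must carry a reason code."""
    if not accepted and code is None:
        raise ValueError(f"rejected term {term!r} needs a rejection code")
    if accepted and code is not None:
        raise ValueError(f"accepted term {term!r} must not carry a rejection code")
    return Decision(term=term, accepted=accepted, code=code)
```

For instance, `decide("aboveground", False, RejectionCode.ADJ)` records the stand-alone adjective case from the table, while `decide("aboveground biomass", True)` records an accepted term.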

11 Vision
• Refining the “Vision” for how the controlled vocabulary can be used to make PASTA and other NIS elements more effective
  ◦ And link to other efforts such as DataOne, LODE and EnvThes
• Optional workshop yesterday – tasks identified:
  ◦ Identify systems and software tools that effectively exploit controlled vocabularies for searching/browsing and ranking
  ◦ Metrics tools: help identify specific datasets that could benefit from additional keywords

12 Help us out!
• During discussions today and tomorrow, think about how the Controlled Vocabulary can be leveraged
• Incorporate terms from the Controlled Vocabulary into your site EML documents
• ASK us if you need help!!!!! – we have tools

